Apache Spark vs Informatica Big Data Parser comparison

Apache Spark and Informatica Big Data Parser are both solutions in the Hadoop category. Apache Spark is ranked #1 with an average rating of 8.6, while Informatica Big Data Parser is ranked #9. Apache Spark holds a 17.5% mindshare in Hadoop, compared to Informatica Big Data Parser's 2.2%. Additionally, 90% of Apache Spark users are willing to recommend the solution.

Apache Spark

Read 65 Apache Spark reviews

1,404 Views
1,090 Comparison Views

90% willing to recommend

Informatica Big Data Parser

165 Views
130 Comparison Views

Executive Summary

We performed a comparison between Apache Spark and Informatica Big Data Parser based on real PeerSpot user reviews.

Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop.

To learn more, read our detailed Hadoop Report (Updated: March 2025).

Helped 848,716 peers since 2012

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Categories and Ranking

Apache Spark

Ranking in Hadoop

1st

Average Rating

8.4

Reviews Sentiment

7.7

Number of Reviews

65

Ranking in other categories

Compute Service (4th), Java Frameworks (2nd)

Informatica Big Data Parser

Ranking in Hadoop

9th

Average Rating

0.0

Number of Reviews

Ranking in other categories

No ranking in other categories

Mindshare comparison

As of April 2025, the mindshare of Apache Spark in the Hadoop category is 17.5%, down from 21.4% the previous year. The mindshare of Informatica Big Data Parser is 2.2%, down from 2.5% the previous year. Mindshare is calculated based on PeerSpot user engagement data.

Featured Reviews

Ilya Afanasyev

Senior Software Development Engineer at Yahoo!

Reliable, able to expand, and handle large amounts of data well

We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies changes from MongoDB to a specific file. Previously, the pipeline copied all of the data on every run, including tables that had not changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.

Top Industries

By visitors reading reviews

Financial Services Firm

27%

Computer Software Company

13%

Manufacturing Company

Comms Service Provider

No data available

Company Size

By reviewers

Large Enterprise

Midsize Enterprise

Small Business

No data available

Questions from the Community

What do you like most about Apache Spark?

We use Spark to process data from different data sources.

What is your experience regarding pricing and costs for Apache Spark?

Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...

What needs improvement with Apache Spark?

The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...

Comparisons

Spring Boot vs Apache Spark

Compared 27% of the time

SAP HANA vs Apache Spark

Compared 12% of the time

AWS Batch vs Apache Spark

Compared 11% of the time

Cloudera Distribution for Hadoop vs Apache Spark

Compared 7% of the time

Spark SQL vs Apache Spark

Compared 7% of the time

HPE Ezmeral Data Fabric vs Informatica Big Data Parser

Compared 39% of the time

Also Known As

No data available

Big Data Parser

Overview

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
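The map-then-reduce dataflow and the idea of a reusable working set can be illustrated with a small pure-Python sketch. This is not Spark's API: the class name ReadOnlyDataset and its methods are hypothetical, partitions are plain in-memory tuples rather than data spread over a cluster, and fault tolerance is not modeled.

```python
from functools import reduce


class ReadOnlyDataset:
    """Hypothetical toy stand-in for an RDD: an immutable, partitioned multiset."""

    def __init__(self, partitions):
        # Freeze every partition so the dataset cannot be modified after creation.
        self._partitions = tuple(tuple(p) for p in partitions)

    def map(self, fn):
        # Apply fn to every item and return a NEW dataset; the original is untouched.
        return ReadOnlyDataset([fn(x) for x in p] for p in self._partitions)

    def reduce(self, fn):
        # Reduce each non-empty partition on its own, then combine the partial results.
        partials = [reduce(fn, p) for p in self._partitions if p]
        return reduce(fn, partials)


# Three partitions standing in for data held on three machines.
numbers = ReadOnlyDataset([[1, 2, 3], [4, 5], [6]])

# The mapped dataset stays in memory as a working set...
squares = numbers.map(lambda x: x * x)

# ...so several reductions can reuse it without re-reading the input.
total = squares.reduce(lambda a, b: a + b)
largest = squares.reduce(max)

print(total, largest)
```

In a strictly linear MapReduce flow, each of the two reductions would be a separate job that reads its input from disk and writes its result back; keeping the intermediate dataset available for reuse is the contrast the overview draws.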

Apache

Informatica Big Data Parser enables access to the most difficult data and file formats in Hadoop, reducing the time and cost of developing data handlers by 70 percent. It enables IT organizations to efficiently manage industry standards, binary documents, and hierarchical data.

Big Data Parser provides a unique development environment for lean data integration. With this software, your IT organization can view data samples within Big Data Parser Studio and understand their structure and layout through a set of integrated tools.

Informatica

Sample Customers

NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions

Western Union, UPMC, BNY Mellon

We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.