Apache Spark vs IBM Db2 Big SQL comparison

Apache Spark and IBM Db2 Big SQL are both solutions in the Hadoop category. Apache Spark is ranked #1 with an average rating of 8.6, while IBM Db2 Big SQL is ranked #11. Apache Spark holds a 17.5% mindshare in Hadoop, compared to IBM Db2 Big SQL’s 1.2% mindshare. Additionally, 90% of Apache Spark users are willing to recommend the solution, compared to 100% of IBM Db2 Big SQL users.

Apache Spark

Read 65 Apache Spark reviews

1,404 Views
1,090 Comparison Views

90% willing to recommend

IBM Db2 Big SQL

156 Views
81 Comparison Views

Executive Summary

We performed a comparison between Apache Spark and IBM Db2 Big SQL based on real PeerSpot user reviews.

Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop.

To learn more, read our detailed Hadoop Report (Updated: March 2025).

Helped 847,646 peers since 2012

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Categories and Ranking

Apache Spark
Ranking in Hadoop: 1st
Average Rating: 8.4
Reviews Sentiment: 7.7
Ranking in other categories: Compute Service (4th), Java Frameworks (2nd)

IBM Db2 Big SQL
Ranking in Hadoop: 11th
Average Rating: 0.0
Ranking in other categories: None

Mindshare comparison

As of April 2025, the mindshare of Apache Spark in the Hadoop category is 17.5%, down from 21.4% the previous year. The mindshare of IBM Db2 Big SQL is 1.2%, down from 1.5% the previous year. Mindshare is calculated from PeerSpot user engagement data.

Featured Reviews

Ilya Afanasyev

Senior Software Development Engineer at Yahoo!

Reliable, able to expand, and handle large amounts of data well

We use batch processing. It works well with our formats and file versions. There's a lot of functionality. In our pipeline, each hour we copy the changes from MongoDB to a specific file. Previously, each run of the pipeline copied all of the data from all of the tables, whether or not anything had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.

Top Industries

By visitors reading reviews

Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company
Comms Service Provider

No data available

Company Size

By reviewers

Large Enterprise

Midsize Enterprise

Small Business

No data available

Questions from the Community

What do you like most about Apache Spark?

We use Spark to process data from different data sources.

What is your experience regarding pricing and costs for Apache Spark?

Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...

What needs improvement with Apache Spark?

The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...

Comparisons

Spring Boot vs Apache Spark: compared 27% of the time
SAP HANA vs Apache Spark: compared 12% of the time
AWS Batch vs Apache Spark: compared 11% of the time
Cloudera Distribution for Hadoop vs Apache Spark: compared 7% of the time
Spark SQL vs Apache Spark: compared 7% of the time

Spark SQL vs IBM Db2 Big SQL: compared 100% of the time

Overview

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store the reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.

Apache
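
The dataflow contrast in the Spark description above can be illustrated with a small plain-Python sketch. This is not Spark code and uses no Spark API: the `partitions` list, the `map_reduce` helper, and the sample numbers are all hypothetical stand-ins. It only shows the difference between running each computation as a separate linear map-then-reduce pass over the raw input and keeping an intermediate working set in memory for reuse.

```python
from functools import reduce

# Hypothetical input. On a real cluster these would be partitions of a
# dataset spread across machines; here they are just nested lists.
partitions = [[1, 2, 3], [4, 5], [6]]


def map_reduce(parts, map_fn, reduce_fn):
    """One linear MapReduce-style pass: map every item, then reduce."""
    mapped = [map_fn(x) for part in parts for x in part]
    return reduce(reduce_fn, mapped)


# MapReduce style: each result is its own pass that starts from the input.
total = map_reduce(partitions, lambda x: x, lambda a, b: a + b)
sum_sq = map_reduce(partitions, lambda x: x * x, lambda a, b: a + b)

# Working-set style: compute the squares once, keep them in memory,
# and reuse them for several follow-up computations.
working_set = [x * x for part in partitions for x in part]
sum_sq_again = sum(working_set)
max_sq = max(working_set)

print(total, sum_sq, sum_sq_again, max_sq)
```

In the first style every new question repeats the read and map steps; in the second, the squared values are materialized once and shared by both follow-up computations, which is the role the description assigns to an RDD.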

IBM Db2® Big SQL is an enterprise-grade, hybrid ANSI-compliant SQL-on-Hadoop engine, delivering massively parallel processing (MPP) and advanced data query. Db2 Big SQL offers a single database connection or query for disparate sources such as Hadoop HDFS and WebHDFS, RDBMS, NoSQL databases and object stores. It offers low latency, high performance, security, SQL compatibility and federation capabilities for ad hoc and complex queries.

IBM

Sample Customers

NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions

Information Not Available

We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.