Try our new research platform with insights from 80,000+ expert users

Apache Spark vs IBM Db2 Big SQL comparison

 

Comparison Buyer's Guide

Executive Summary
 

Categories and Ranking

Apache Spark
Ranking in Hadoop
1st
Average Rating
8.4
Number of Reviews
64
Ranking in other categories
Compute Service (4th), Java Frameworks (2nd)
IBM Db2 Big SQL
Ranking in Hadoop
10th
Average Rating
0.0
Number of Reviews
0
Ranking in other categories
No ranking in other categories
 

Mindshare comparison

As of November 2024, in the Hadoop category, the mindshare of Apache Spark is 18.2%, down from 21.9% compared to the previous year. The mindshare of IBM Db2 Big SQL is 1.3%, down from 1.7% compared to the previous year. It is calculated based on PeerSpot user engagement data.
Hadoop
 

Featured Reviews

SurjitChoudhury - PeerSpot reviewer
Feb 20, 2024
Offers batch processing of data and in-memory processing in Spark greatly enhances performance
Spark supports real-time data processing through Spark Streaming. It allows for batch processing of data. If you have immediate data, like chat information, that needs to be processed in real-time, Spark Streaming is used. For data that can be evaluated later, batch processing with Apache Spark is suitable. Mostly, batch processing is utilized in our organization, but for streaming data processing, tools like Kafka are often integrated. In-memory processing in Spark greatly enhances performance, making it a hundred times faster than the previous MapReduce methods. This improvement is achieved through optimization techniques like caching, broadcasting, and partitioning, which help in optimizing queries for faster processing.
Use IBM Db2 Big SQL?
Share your opinion

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pricing and Cost Advice

"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"Spark is an open-source solution, so there are no licensing costs."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"The solution is affordable and there are no additional licensing costs."
"Licensing costs can vary. For instance, when purchasing a virtual machine, you're asked if you want to take advantage of the hybrid benefit or if you prefer the license costs to be included upfront by the cloud service provider, such as Azure. If you choose the hybrid benefit, it indicates you already possess a license for the operating system and wish to avoid additional charges for that specific VM in Azure. This approach allows for a reduction in licensing costs, charging only for the service and associated resources."
"The product is expensive, considering the setup."
"Since we are using the Apache Spark version, not the data bricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
Information not available
report
Use our free recommendation engine to learn which Hadoop solutions are best for your needs.
814,763 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm
26%
Computer Software Company
13%
Manufacturing Company
8%
Educational Organization
5%
No data available
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The main concern is the overhead of Java when distributed processing is not necessary. In such cases, operations can often be done on one node, making Spark's distributed mode unnecessary. Conseque...
Ask a question
Earn 20 points
 

Learn More

Video not available
 

Overview

 

Sample Customers

NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Information Not Available
Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop. Updated: October 2024.
814,763 professionals have used our research since 2012.