
Amazon EC2 vs Apache Spark comparison

 

Comparison Buyer's Guide

Categories and Ranking

Amazon EC2
Ranking in Compute Service: 5th
Average Rating: 8.6
Reviews Sentiment: 7.2
Number of Reviews: 63
Ranking in other categories: None

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 64
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)
 

Mindshare comparison

As of December 2024, Amazon EC2 holds a 6.5% mindshare in the Compute Service category, down from 6.6% a year earlier. Apache Spark holds 11.1%, up from 7.8% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Julius Mboya - PeerSpot reviewer
Ensures we can provision resources on demand, and we can grow and shrink them as per the traffic
We faced a challenge in regard to billing. There was a time when we were working on changing the mode of payment from card to wire. It took a lot of time because our state is set up in Kenya, so we needed to pay in Kenyan currency. We had to go around in circles. There was a lot of documentation required, but we managed to go through successfully.
SurjitChoudhury - PeerSpot reviewer
Offers batch processing of data and in-memory processing in Spark greatly enhances performance
Spark supports real-time data processing through Spark Streaming. It allows for batch processing of data. If you have immediate data, like chat information, that needs to be processed in real-time, Spark Streaming is used. For data that can be evaluated later, batch processing with Apache Spark is suitable. Mostly, batch processing is utilized in our organization, but for streaming data processing, tools like Kafka are often integrated. In-memory processing in Spark greatly enhances performance, making it a hundred times faster than the previous MapReduce methods. This improvement is achieved through optimization techniques like caching, broadcasting, and partitioning, which help in optimizing queries for faster processing.
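
The partitioning, broadcasting, and caching ideas mentioned in this review can be illustrated with a small plain-Python sketch. This is not Spark's API; it only mimics the concepts on in-memory lists. The function names, the sample chat events, and the user-to-country table are all hypothetical and chosen for illustration.

```python
from collections import Counter


def hash_partition(rows, num_partitions):
    """Split rows into buckets by user_id so each bucket can be processed independently."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[row["user_id"] % num_partitions].append(row)
    return partitions


def broadcast_join(partition, small_lookup):
    """Join one partition against a small table that every worker holds a full copy of."""
    return [{**row, "country": small_lookup[row["user_id"]]} for row in partition]


# Hypothetical data: chat events (the large side) and a small user -> country table.
events = [
    {"user_id": uid, "msg_len": n}
    for uid, n in [(1, 10), (2, 25), (3, 5), (1, 40), (2, 15)]
]
countries = {1: "KE", 2: "IN", 3: "KE"}

# "Caching": materialize the joined partitions once, in memory, and reuse them.
cached = [broadcast_join(p, countries) for p in hash_partition(events, 2)]

# Two separate queries reuse the cached partitions instead of recomputing the join.
messages_per_country = Counter(r["country"] for part in cached for r in part)
chars_per_country = Counter()
for part in cached:
    for r in part:
        chars_per_country[r["country"]] += r["msg_len"]

print(dict(messages_per_country))
print(dict(chars_per_country))
```

In a real cluster the partitions would live on different workers and the small table would be shipped to each of them once, which is the part of the cost this single-process sketch cannot show.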

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The initial setup is straightforward."
"Configuration can be changed at any time and it's very scalable."
"The product helps us with scalability. We also need not have data centers."
"The most important aspects are that the solution is scalable and easy to manage."
"The ability to quickly spin up instances on demand with zero upfront costs or infrastructure is the most valuable for me."
"I like the AMI-related features. A very good feature of this solution is the customizable AMI. It is a very good feature provided by Amazon. The encryption technologies are also very good. We are using KMS, etc."
"We don't have to worry about scalability issues or maintenance or security. It's all taken care of."
"The greatest benefit of Amazon EC2 is its versatility, including the diverse range of servers available and the ability to connect to various resources."
"The product’s most valuable features are lazy evaluation and workload distribution."
"The good performance. The nice graphical management console. The long list of ML algorithms."
"Provides a lot of good documentation compared to other solutions."
"With Spark, we parallelize our operations, efficiently accessing both historical and real-time data."
"The features we find most valuable are the machine learning, data learning, and Spark Analytics."
"The solution has been very stable."
"ETL and streaming capabilities."
"The memory processing engine is the solution's most valuable aspect. It processes everything extremely fast, and it's in the cluster itself. It acts as a memory engine and is very effective in processing data correctly."
 

Cons

"One of the challenges is the AMI upgrades."
"Amazon EC2 is very expensive, and it would be helpful if they decreased the pricing."
"If the solution was cheaper, if the price was less, it would be better."
"Regarding availability, a noticeable improvement would be the possibility of more load balancing configurations and the deployment of more datacenters, mainly in Latin America."
"Amazon EC2 could improve the stability."
"I would like to see improvement in the information available up-front for users around tailoring the package to their actual requirements. At present it can take time to work with the on demand instance until you are used to what features are right for the user."
"The initial setup could be easier because many keys are required for access."
"Currently in the autoscaling process if we have multiple issues we are not able to connect some of the VPC through the SMS."
"One limitation is that not all machine learning libraries and models support it."
"The solution needs to optimize shuffling between workers."
"More ML based algorithms should be added to it, to make it algorithmic-rich for developers."
"The initial setup was not easy."
"The product could improve the user interface and make it easier for new users."
"Stability in terms of API (things were difficult, when transitioning from RDD to DataFrames, then to DataSet)."
"It needs a new interface and a better way to get some data. In terms of writing our scripts, some processes could be faster."
"It would be beneficial to enhance Spark's capabilities by incorporating models that utilize features not traditionally present in its framework."
 

Pricing and Cost Advice

"There is a license required to use this solution and we pay on a monthly basis."
"It is not an expensive solution."
"The clients have found the billing of Amazon EC2 good, but the price could be less high. There is a monthly subscription to use the solution."
"EC2 pricing is somewhat transparent, in that AWS provides pricing for all instance types. However, the number of pricing options can be confusing."
"The price of Amazon EC2 could improve. The Google Cloud Platform is more cost-effective."
"It has helped to reduce costs with infrastructure."
"Reducing the price of the solution could lead to an improvement."
"The pricing of this solution is variable. There is an open-source variant that is accessible via the public cloud, and then tiers that range in price depending on the level and amount of usage that is required."
"Spark is an open-source solution, so there are no licensing costs."
"They provide an open-source license for the on-premise version."
"It is an open-source platform. We do not pay for its subscription."
"The solution is affordable and there are no additional licensing costs."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"We are using the free version of the solution."
824,053 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Amazon EC2
Computer Software Company: 19%
Financial Services Firm: 16%
Retailer: 7%
Manufacturing Company: 6%

Apache Spark
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 8%
Retailer: 5%
 

Company Size

[Chart: reviewers by company size (Large Enterprise, Midsize Enterprise, Small Business)]
 

Questions from the Community

What do you like most about Amazon EC2?
The scalability and elasticity are helpful.
What is your experience regarding pricing and costs for Amazon EC2?
We are paying about $1,500 a month for one of the services. I'd rate the pricing as six out of ten for being expensive.
What needs improvement with Amazon EC2?
The pricing model could be improved. We found Amazon EC2 to be pricey.
What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The main concern is the overhead of Java when distributed processing is not necessary. In such cases, operations can often be done on one node, making Spark's distributed mode unnecessary. Conseque...
 

Also Known As

Amazon EC2: Amazon Elastic Compute Cloud, EC2
Apache Spark: No data available
 

Sample Customers

Amazon EC2: Netflix, Expedia, Time Inc., Novaris, Airbnb, Lamborghini
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Find out what your peers are saying about Amazon EC2 vs. Apache Spark and other solutions. Updated: December 2024.