
Amazon EC2 vs Apache Spark comparison

 


Categories and Ranking

Amazon EC2
Ranking in Compute Service: 5th
Average Rating: 8.6
Reviews Sentiment: 7.2
Number of Reviews: 63
Ranking in other categories: No ranking in other categories

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 64
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)
 

Mindshare comparison

As of November 2024, Amazon EC2 holds a 6.8% mindshare in the Compute Service category, up from 6.6% a year earlier. Apache Spark holds 11.2%, up from 7.7% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
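To put the two mindshare movements side by side, here is a minimal sketch that computes the change in percentage points and the relative growth for each product. The figures are the ones quoted above; the function name and the rounding to one decimal place are illustrative choices, not part of PeerSpot's methodology.

```python
# Mindshare figures quoted in the comparison (percent of the Compute Service category).
mindshare = {
    "Amazon EC2": {"previous": 6.6, "current": 6.8},
    "Apache Spark": {"previous": 7.7, "current": 11.2},
}


def mindshare_change(previous, current):
    """Return (change in percentage points, relative change in percent)."""
    points = round(current - previous, 1)
    relative = round((current - previous) / previous * 100, 1)
    return points, relative


changes = {name: mindshare_change(**v) for name, v in mindshare.items()}
for name, (points, relative) in changes.items():
    print(f"{name}: {points:+.1f} points ({relative:+.1f}% relative)")
```

On these numbers, Spark's gain is both larger in absolute points and much larger relative to its starting share.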

Featured Reviews

Julius Mboya - PeerSpot reviewer
Ensures we can provision resources on demand, and we can grow and shrink them as per the traffic
We faced a challenge with billing. There was a time when we were changing the mode of payment from card to wire. It took a lot of time because our state is set up in Kenya and we needed to pay in Kenyan currency. We had to go around in circles. There was a lot of documentation required, but we managed to get through it successfully.
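The "grow and shrink as per the traffic" behavior in this review's headline can be sketched as a simple sizing rule. This is a conceptual illustration in plain Python, not AWS's Auto Scaling API: the function name, the capacity of 100 requests per second per instance, and the instance limits are all hypothetical values chosen for the example.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance=100,
                      min_instances=1, max_instances=10):
    """Pick an instance count that covers the current traffic.

    All parameters are hypothetical; a real deployment would tune them
    and would normally delegate this decision to the cloud provider.
    """
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Clamp to the allowed range so the fleet never drops to zero
    # and never grows past the budgeted ceiling.
    return max(min_instances, min(max_instances, needed))


# Simulated traffic samples (requests per second) over time.
traffic = [50, 420, 980, 1500, 120]
plan = [desired_instances(rps) for rps in traffic]
print(plan)
```

The fleet grows as traffic rises, is capped at the maximum during the spike, and shrinks again when traffic falls.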
SurjitChoudhury - PeerSpot reviewer
Offers batch processing of data and in-memory processing in Spark greatly enhances performance
Spark supports real-time data processing through Spark Streaming. It allows for batch processing of data. If you have immediate data, like chat information, that needs to be processed in real-time, Spark Streaming is used. For data that can be evaluated later, batch processing with Apache Spark is suitable. Mostly, batch processing is utilized in our organization, but for streaming data processing, tools like Kafka are often integrated. In-memory processing in Spark greatly enhances performance, making it a hundred times faster than the previous MapReduce methods. This improvement is achieved through optimization techniques like caching, broadcasting, and partitioning, which help in optimizing queries for faster processing.
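Of the optimization techniques the reviewer lists, broadcasting is the easiest to show in miniature. The sketch below is plain Python, not Spark's API: it builds a lookup from a small table once and reuses it against every partition of a larger dataset, which is the idea behind a broadcast join. The function name and the sample data are made up for illustration.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_table, key):
    """Inner-join partitioned rows against a small table via a shared lookup."""
    # Build the lookup once from the small table. In a cluster, this copy
    # would be shipped ("broadcast") to every worker instead of shuffling
    # the large dataset across the network.
    lookup = defaultdict(list)
    for row in small_table:
        lookup[row[key]].append(row)

    joined = []
    for partition in large_partitions:  # each partition is handled independently
        for row in partition:
            for match in lookup.get(row[key], []):
                joined.append({**row, **match})
    return joined


# Hypothetical data: two partitions of orders and a small country table.
orders = [
    [{"order_id": 1, "country": "KE"}, {"order_id": 2, "country": "US"}],
    [{"order_id": 3, "country": "KE"}, {"order_id": 4, "country": "DE"}],
]
countries = [
    {"country": "KE", "name": "Kenya"},
    {"country": "US", "name": "United States"},
]

result = broadcast_hash_join(orders, countries, "country")
print(result)
```

Order 4 has no matching country, so it is dropped, as in any inner join.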

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

Amazon EC2
"Stable, scalable, and simple to implement."
"My favorite feature of this solution is the flexibility of instance types, which allows for the cost to be tailored to the usage amount and type."
"The platform has been quite stable and reliable."
"The flexibility of the security features is what is interesting."
"The amount of bandwidth has been most valuable."
"The ability to bring up servers and then do the computation and deposit means we don't have to maintain a data center. Everything is virtual and the security is also taken care of. It helps us to achieve compliance. Being a small startup with the security features that AWS provides helps us with compliance."
"What we have found most valuable is that we have not lost stability in the program."
"Amazon EC2 allows us to create different regions and availability zones based upon application needs."

Apache Spark
"There's a lot of functionality."
"I feel the streaming is its best feature."
"DataFrame: Spark SQL gives the leverage to create applications more easily and with less coding effort."
"The product's deployment phase is easy."
"AI libraries are the most valuable. They provide extensibility and usability. Spark has a lot of connectors, which is a very important and useful feature for AI. You need to connect a lot of points for AI, and you have to get data from those systems. Connectors are very wide in Spark. With a Spark cluster, you can get fast results, especially for AI."
"The scalability has been the most valuable aspect of the solution."
"The product’s most valuable features are lazy evaluation and workload distribution."
"Its scalability and speed are very valuable. You can scale it a lot. It is a great technology for big data. It is definitely better than a lot of earlier warehouse or pipeline solutions, such as Informatica. Spark SQL is very compliant with normal SQL that we have been using over the years. This makes it easy to code in Spark. It is just like using normal SQL. You can use the APIs of Spark or you can directly write SQL code and run it. This is something that I feel is useful in Spark."
 
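One reviewer above names lazy evaluation as a top Spark feature. In Spark, transformations only describe work, and nothing runs until an action asks for a result. The sketch below imitates that behavior with Python generators. It is an analogy under that assumption, not Spark code, and every name in it is hypothetical.

```python
log = []  # records when source data is actually read


def read_numbers(n):
    """Stand-in for a data source; logs each element as it is read."""
    for i in range(n):
        log.append(f"read {i}")
        yield i


def lazy_map(fn, items):
    """Transformation: describes a mapping without running it."""
    for item in items:
        yield fn(item)


def lazy_filter(pred, items):
    """Transformation: describes a filter without running it."""
    for item in items:
        if pred(item):
            yield item


# Build the pipeline: square each number, keep the even squares.
pipeline = lazy_filter(lambda x: x % 2 == 0,
                       lazy_map(lambda x: x * x, read_numbers(5)))

log_before = list(log)   # still empty: no data has been read yet
result = list(pipeline)  # the "action" that finally triggers the work

print(log_before, result, len(log))
```

Deferring execution this way lets an engine see the whole chain of steps before running any of them, which is what makes plan-level optimization possible.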

Cons

Amazon EC2
"I think the whole AWS stack is very disconnected from each other. In the .NET space, everything just works nicely together. In the AWS stack, there is a lot of head scratching."
"Amazon EC2 could improve by reducing the price."
"I would like to see as much automation for data validation as possible in the future."
"Pricing-wise, it is a bit high."
"Amazon EC2 could improve by having integration with other cloud systems, such as Azure and Google Cloud. Additionally, having integration with on-premise systems would be appreciated."
"In terms of improvement, they could build some client-side desktop tools that provide easier connectivity to Amazon."
"The price could be better, and it could be more affordable. Because I run my own servers, the prices are quite high."
"Nothing is really missing in terms of features."

Apache Spark
"More ML-based algorithms should be added to it, to make it algorithmically rich for developers."
"Apart from the restrictions that come with its in-memory implementation, it has been improved significantly up to version 3.0, which is currently in use."
"There could be enhancements in optimization techniques, as there are some limitations in this area that could be addressed to further refine Spark's performance."
"The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive."
"We are building our own queries on Spark, and it can be improved in terms of query handling."
"Stream processing needs to be developed more in Spark. I have used Flink previously. Flink is better than Spark at stream processing."
"Dynamic DataFrame options are not yet available."
"Apache Spark lacks geospatial data."
 

Pricing and Cost Advice

Amazon EC2
"It's expensive, and it could be cheaper."
"The clients have found the billing of Amazon EC2 good, but the price could be lower. There is a monthly subscription to use the solution."
"I use the free tier, although I have paid for some services that are not free. The overall cost of this solution depends on the services you use."
"The licensing of Amazon EC2 is expensive. Microsoft Windows Servers are expensive to license."
"It is not an expensive solution."
"It's competitive but can vary based on instance types and usage patterns."
"The solution has different pricing models, and its cost differs when you purchase it for one year or three years."
"The price is reasonable, but there is definitely an opportunity to lower it in instances which are of a higher configuration, because they have been typically used for the long term."

Apache Spark
"It is an open-source platform. We do not pay for its subscription."
"They provide an open-source license for the on-premise version."
"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"We are using the free version of the solution."
"Since we are using the Apache Spark version, not the Databricks version, it is an Apache license version, and support and bug resolution are late or delayed. The Apache license is free."
"The cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
"Spark is an open-source solution, so there are no licensing costs."
"Licensing costs can vary. For instance, when purchasing a virtual machine, you're asked if you want to take advantage of the hybrid benefit or if you prefer the license costs to be included upfront by the cloud service provider, such as Azure. If you choose the hybrid benefit, it indicates you already possess a license for the operating system and wish to avoid additional charges for that specific VM in Azure. This approach allows for a reduction in licensing costs, charging only for the service and associated resources."
 

Top Industries

By visitors reading reviews
Financial Services Firm: 23%
Computer Software Company: 18%
Retailer: 7%
Manufacturing Company: 6%

Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 8%
University: 5%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Amazon EC2?
The scalability and elasticity are helpful.
What is your experience regarding pricing and costs for Amazon EC2?
We are paying about $1,500 a month for one of the services. I'd rate the pricing as six out of ten for being expensive.
What needs improvement with Amazon EC2?
The pricing model could be improved. We found Amazon EC2 to be pricey.
What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The main concern is the overhead of Java when distributed processing is not necessary. In such cases, operations can often be done on one node, making Spark's distributed mode unnecessary. Conseque...
 


Also Known As

Amazon EC2: Amazon Elastic Compute Cloud, EC2
Apache Spark: No data available
 


Sample Customers

Amazon EC2: Netflix, Expedia, Time Inc., Novartis, Airbnb, Lamborghini
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Find out what your peers are saying about Amazon EC2 vs. Apache Spark and other solutions. Updated: October 2024.
816,406 professionals have used our research since 2012.