
Amazon EC2 vs Apache Spark comparison

 

Comparison Buyer's Guide


Categories and Ranking

Amazon EC2
Ranking in Compute Service: 5th
Average Rating: 8.6
Reviews Sentiment: 7.2
Number of Reviews: 65
Ranking in other categories: None

Apache Spark
Ranking in Compute Service: 3rd
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)
 

Mindshare comparison

As of February 2025, in the Compute Service category, Amazon EC2 has a mindshare of 6.3%, down from 6.8% a year earlier. Apache Spark has a mindshare of 11.3%, up from 8.5% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
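The year-over-year figures can be read either as percentage-point changes or as relative changes. The short Python sketch below works through both readings for the two products. The function name `mindshare_change` is made up for illustration, and the only inputs are the four percentages quoted above.

```python
def mindshare_change(previous: float, current: float) -> tuple:
    """Return (percentage-point change, relative change in percent)."""
    point_change = current - previous
    relative_change = point_change / previous * 100
    return round(point_change, 1), round(relative_change, 1)


# Figures quoted in the mindshare comparison (previous year, February 2025).
ec2 = mindshare_change(previous=6.8, current=6.3)
spark = mindshare_change(previous=8.5, current=11.3)

print(f"Amazon EC2:   {ec2[0]:+.1f} points ({ec2[1]:+.1f}% relative)")
print(f"Apache Spark: {spark[0]:+.1f} points ({spark[1]:+.1f}% relative)")
```

Read this way, Amazon EC2 lost half a percentage point, while Apache Spark gained 2.8 points, roughly a third more than its share a year earlier.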
 

Featured Reviews

Prapoj Chipat - PeerSpot reviewer
You only pay for what you use, and you can scale up or scale down without any issues; has good stability and connectivity
My company uses AWS products, including cloud services. Amazon EC2 is being used within the company as well. My company has about ten thousand employees that use Amazon EC2 because everybody connects to the internet. For example, employees use the mail service from the console and Outlook service on the cloud. Everyone uses the internet within the company, so that means Amazon EC2 is being used as well, especially in this COVID situation.
Right now, employees use more applications such as streaming and video services which require more connectivity and cloud capacity, so there's a plan to increase usage for Amazon EC2.
In terms of how much staff is required for the deployment and maintenance of Amazon EC2, I'm not quite sure how much technical staff my company has because some of the technical staff was outsourced. There's no permanent IT staff in the office.
I'm recommending Amazon EC2 to other people because it's a critical product that has more reliability. It has good quality, and it also has many features, so if the features support your use cases, then Amazon EC2 would be a good product for you. My rating for Amazon EC2 is nine out of ten.
Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies data changes from MongoDB to a specific file. Previously, every run copied all of the data from all of the tables, whether or not anything had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
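The saving this reviewer describes comes from copying only the rows that changed since the previous hourly run instead of re-exporting every table. The sketch below illustrates that idea in plain Python with an in-memory list standing in for a table. It does not use the MongoDB or Spark APIs, and the `updated_at` field, the function names, and the sample rows are all assumptions made for illustration, not details from the review.

```python
from datetime import datetime, timedelta


def full_copy(table):
    """Copy every row on each run, whether or not it changed."""
    return list(table)


def incremental_copy(table, last_run):
    """Copy only rows modified after the previous run (the watermark)."""
    return [row for row in table if row["updated_at"] > last_run]


# Hypothetical table: three rows, one of them touched in the last hour.
now = datetime(2025, 1, 1, 12, 0)
last_run = now - timedelta(hours=1)
table = [
    {"_id": 1, "updated_at": now - timedelta(days=3)},
    {"_id": 2, "updated_at": now - timedelta(minutes=20)},
    {"_id": 3, "updated_at": now - timedelta(hours=5)},
]

changed = incremental_copy(table, last_run)
print(len(full_copy(table)), "rows copied by a full export")
print(len(changed), "rows copied by an incremental export")
```

With large tables and few hourly changes, the incremental run moves a small fraction of the data, which is why a smaller cluster may be enough.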

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts from their reviews:
 

Pros

"Stable, scalable, and simple to implement."
"The most valuable feature of Amazon EC2 is the virtual machines that are available."
"The most valuable feature is autoscaling."
"Amazon EC2's most valuable feature I have heard from clients is easy provisioning. Provisioning is very quick and easy."
"There's a lot of encryption across the setups to ensure that database credentials and everything related to security are well managed."
"The greatest benefit of Amazon EC2 is its versatility, including the diverse range of servers available and the ability to connect to various resources."
"I believe that cloud solutions are better than physical servers."
"EC2 instances on AWS allow users to run virtual servers based on the CPU, memory, storage, and networking capacity."
"I like that it can handle multiple tasks parallelly. I also like the automation feature. JavaScript also helps with the parallel streaming of the library."
"The features we find most valuable are the machine learning, data learning, and Spark Analytics."
"We use Spark to process data from different data sources."
"Apache Spark can do large volume interactive data analysis."
"There's a lot of functionality."
"The most valuable feature of Apache Spark is its memory processing because it processes data over RAM rather than disk, which is much more efficient and fast."
"The product’s most valuable feature is the SQL tool. It enables us to create a database and publish it."
"With Spark, we parallelize our operations, efficiently accessing both historical and real-time data."
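Several of the Spark quotes above mention splitting work into parallel tasks and keeping data in memory. As a rough illustration of that split-and-combine pattern, the sketch below partitions an in-memory list, computes a partial result per partition, and merges the partials. It uses only the Python standard library, not the Spark API. The helper names are hypothetical, and the threads show the shape of the computation rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts roughly equal slices."""
    return [items[i::n_parts] for i in range(n_parts)]


def partial_sum_of_squares(chunk):
    """Work done independently on one partition."""
    return sum(x * x for x in chunk)


data = list(range(1, 1001))  # the whole dataset stays in memory

# Map step: one task per partition. Reduce step: combine the partial results.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum_of_squares, partition(data, 4)))

total = sum(partials)
print(total)
```

A distributed engine applies the same pattern across many machines, so the partition count and the cost of combining results become the main tuning concerns.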
 

Cons

"Built-in and/or integration with other services to proactively identify potential failures before they occur."
"It's not the best of the best because we still have issues with downtime. We still have issues with the cost of storage, with all these different instance styles, and how much it costs. They cost an arm and a leg the higher you go."
"Regarding availability, a noticeable improvement would be the possibility of more load balancing configurations and the deployment of more datacenters, mainly in Latin America."
"They can build automatic features for ENSS or network drive. They have the Control-M feature. Similarly, they should have a feature for the network drive that can be mapped. I have not seen such a feature. They have a lot of products but those are quite costly. There is no cheaper option available for the EC2 instance for syncing two drives. If these features are available, it would be good."
"Its price can be reduced."
"The initial setup could be easier because many keys are required for access."
"The GUI used to deploy EC2 must be improved."
"There should be enhanced accessibility from any standpoint. The accessibility should be increased, particularly in scenarios where accessing the software on the Azure platform from the cloud can be complex. Simplifying this process would be beneficial. There are too many steps involved."
"Apart from the restrictions that come with its in-memory implementation. It has been improved significantly up to version 3.0, which is currently in use."
"When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data."
"Its UI can be better. Maintaining the history server is a little cumbersome, and it should be improved. I had issues while looking at the historical tags, which sometimes created problems. You have to separately create a history server and run it. Such things can be made easier. Instead of separately installing the history server, it can be made a part of the whole setup so that whenever you set it up, it becomes available."
"I would like to see integration with data science platforms to optimize the processing capability for these tasks."
"I know there is always discussion about which language to write applications in and some people do love Scala. However, I don't like it."
"Apache Spark's GUI and scalability could be improved."
"Needs to provide an internal schedule to schedule spark jobs with monitoring capability."
"It needs a new interface and a better way to get some data. In terms of writing our scripts, some processes could be faster."
 

Pricing and Cost Advice

"I use the free tier, although I have paid for some services that are not free. The overall cost of this solution depends on the services you use."
"We are using a pay-as-you-go model."
"Amazon EC2 has a pay-as-you-use cost model."
"When we did the deployment of Amazon EC2 we found it to be less expensive than other solutions."
"It is not an expensive solution."
"The solution has different pricing models, and its cost differs when you purchase it for one year or three years."
"It's expensive, and it could be cheaper."
"The price of Amazon EC2 could improve. The Google Cloud Platform is more cost-effective."
"Apache Spark is an expensive solution."
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"They provide an open-source license for the on-premise version."
"On the cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"Spark is an open-source solution, so there are no licensing costs."
"Apache Spark is an open-source tool."
"Since we are using the Apache Spark version, not the data bricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
832,138 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Amazon EC2
Computer Software Company: 19%
Financial Services Firm: 15%
Retailer: 8%
Manufacturing Company: 7%

Apache Spark
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
Retailer: 5%
 

Company Size

By reviewers (chart categories: Large Enterprise, Midsize Enterprise, Small Business)
 

Questions from the Community

What do you like most about Amazon EC2?
The scalability and elasticity are helpful.
What is your experience regarding pricing and costs for Amazon EC2?
The pricing of EC2 can vary depending on workloads. It can be expensive for high workloads but more cost-effective for smaller applications. AWS provides various services at competitive pricing, ma...
What needs improvement with Amazon EC2?
An area needing improvement is the time limitations when accessing EC2 instances. When accessing the server, sometimes PeerSpot can fail, making it difficult to access multiple servers simultaneous...
What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The main concern is the overhead of Java when distributed processing is not necessary. In such cases, operations can often be done on one node, making Spark's distributed mode unnecessary. Conseque...
 

Also Known As

Amazon EC2: Amazon Elastic Compute Cloud, EC2
Apache Spark: No data available
 

Sample Customers

Amazon EC2: Netflix, Expedia, Time Inc., Novaris, Airbnb, Lamborghini
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Find out what your peers are saying about Amazon EC2 vs. Apache Spark and other solutions. Updated: January 2025.