
Amazon EC2 Auto Scaling vs Apache Spark comparison

 

Comparison Buyer's Guide

Executive Summary


Categories and Ranking

Amazon EC2 Auto Scaling
Ranking in Compute Service: 2nd
Average Rating: 8.8
Reviews Sentiment: 8.2
Number of Reviews: 44
Ranking in other categories: No ranking in other categories

Apache Spark
Ranking in Compute Service: 3rd
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)
 

Mindshare comparison

As of February 2025, in the Compute Service category, the mindshare of Amazon EC2 Auto Scaling is 11.2%, down from 11.4% the previous year. The mindshare of Apache Spark is 11.3%, up from 8.5% the previous year. Mindshare is calculated from PeerSpot user engagement data.

Featured Reviews

Erick Karanja - PeerSpot reviewer
Scaling is as easy as hitting a button and setup is straightforward
AWS has already made improvements. In the past, if you provisioned a large EC2 instance and underutilized it, you still paid a premium. Now, AWS encourages using Kubernetes, where you primarily pay for the compute power you actually use in production. There is room for improvement. You might end up paying a high price if you're not careful and you provision a server that's underutilized. AWS has left it to engineers to figure out solutions. If you find the cost too high, you can move to Kubernetes, which might be a better solution for you than large EC2 instances. So, the improvements need to come from the user side, not the provider. Software engineers and engineering teams need to know their limits with EC2 instances. They need to recognize when it's time to transition their applications to Kubernetes. This means building with the cloud in mind from the start, making it easier to move solutions to the cloud without suffering upgrades and integration issues.
Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions. There's a lot of functionality. In our pipeline, each hour we copy the changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on each run, even when nothing had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
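
The cost difference the reviewer describes, copying everything each hour versus copying only what changed, can be sketched roughly as follows. This is a minimal illustration, not the reviewer's pipeline: the review does not say which MongoDB feature was used, so the in-memory records, the `updated_at` field, and both function names are assumptions made for the example.

```python
from datetime import datetime, timezone

def ts(hour):
    """Helper: a UTC timestamp on a fixed example day (hypothetical data)."""
    return datetime(2025, 1, 1, hour, tzinfo=timezone.utc)

def full_copy(records):
    """Copy every record on every run, whether it changed or not."""
    return list(records)

def incremental_copy(records, last_run):
    """Copy only records modified after the previous run."""
    return [r for r in records if r["updated_at"] > last_run]

# In-memory stand-in for a source collection; field names are assumptions.
records = [
    {"_id": 1, "updated_at": ts(8)},
    {"_id": 2, "updated_at": ts(9)},
    {"_id": 3, "updated_at": ts(11)},
]

last_run = ts(10)  # time of the previous hourly run
changed = incremental_copy(records, last_run)

# The full copy moves all records; the incremental copy moves only the changed one.
print(len(full_copy(records)), len(changed))
```

With large tables and few changes per hour, the incremental run touches far less data, which is consistent with the smaller cluster the reviewer mentions.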

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"With the ability to set up rules based on demand, network, or traffic, the service offers a necessary level of adaptability."
"It's good performance-wise."
"The product’s most valuable feature is the seamless resizing of web connection."
"It has the best auto-scaling features."
"The feature I found most valuable was the vertical and horizontal scaling."
"The solution incorporates ease of maintenance and reduction in operational overhead and costs. Patching is also easy."
"One of the most important benefits is that a company can optimize resources because Auto Scaling deploys resources when needed. For example, for Black Friday, a company can deploy 100 servers for a couple of days. When Black Friday is over, the company can delete those servers."
"The solution is highly scalable."
"The scalability has been the most valuable aspect of the solution."
"There's a lot of functionality."
"One of the key features is that Apache Spark is a distributed computing framework. You can help multiple slaves and distribute the workload between them."
"The features we find most valuable are the machine learning, data learning, and Spark Analytics."
"It is highly scalable, allowing you to efficiently work with extensive datasets that might be problematic to handle using traditional tools that are memory-constrained."
"The good performance. The nice graphical management console. The long list of ML algorithms."
"The product’s most valuable features are lazy evaluation and workload distribution."
"It's easy to prepare parallelism in Spark, run the solution with specific parameters, and get good performance."
 

Cons

"When creating a new instance there is a set of questions that have to be answered, and this is something that can be simplified."
"The primary area for improvement is the pricing model."
"The product should improve vertical scaling features."
"The support to manage the processes could be better."
"Amazon EC2 Auto Scaling can provide more discounts when using the machines the solution uses."
"There is room for improvement in the pricing model."
"There is a need for improvement in understanding the pricing structure, as it is complex and depends on several factors such as the location of data centers."
"Amazon EC2 Auto Scaling offers various benefits but lacks certain features for fine-grained customization compared to other cloud providers like GCP. Users are constrained by predefined instance families in EC2 when selecting instance types for scaling. Unlike GCP, where users can independently scale resources such as memory or CPU, EC2 doesn't offer this flexibility."
"It's not easy to install."
"When you want to extract data from your HDFS and other sources then it is kind of tricky because you have to connect with those sources."
"Stream processing needs to be developed more in Spark. I have used Flink previously. Flink is better than Spark at stream processing."
"The migration of data between different versions could be improved."
"When you are working with large, complex tasks, the garbage collection process is slow and affects performance."
"When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data."
"Apache Spark should add some resource management improvements to the algorithms."
"For improvement, I think the tool could make things easier for people who aren't very technical. There's a significant learning curve, and I've seen organizations give up because of it. Making it quicker or easier for non-technical people would be beneficial."
 

Pricing and Cost Advice

"The product is cheap."
"Its price is affordable for enterprise customers."
"The product is expensive."
"It's cost-effective."
"There is no specific pricing for Amazon EC2 Auto Scaling, but we have to pay for the number of machines getting scaled up."
"Pricing could be a little bit more competitive."
"Amazon EC2 instances can be very expensive."
"The price of this product could be a little bit lower."
"They provide an open-source license for the on-premise version."
"It is an open-source solution, it is free of charge."
"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"Licensing costs can vary. For instance, when purchasing a virtual machine, you're asked if you want to take advantage of the hybrid benefit or if you prefer the license costs to be included upfront by the cloud service provider, such as Azure. If you choose the hybrid benefit, it indicates you already possess a license for the operating system and wish to avoid additional charges for that specific VM in Azure. This approach allows for a reduction in licensing costs, charging only for the service and associated resources."
"The product is expensive, considering the setup."
"We are using the free version of the solution."
"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
838,640 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Amazon EC2 Auto Scaling
Financial Services Firm: 22%
Computer Software Company: 16%
University: 7%
Government: 7%

Apache Spark
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
Comms Service Provider: 5%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Amazon EC2 Auto Scaling?
The solution removes the need for hardware. We can easily create servers or machines. Just by clicking or specifying our requirements, like memory size or disk space, it's set up for us. The tool e...
What is your experience regarding pricing and costs for Amazon EC2 Auto Scaling?
The pricing structure from AWS is really complex and depends on factors like the region and specific services used. Prices can vary significantly even within the same service across different locat...
What needs improvement with Amazon EC2 Auto Scaling?
There is a need for improvement in understanding the pricing structure, as it is complex and depends on several factors such as the location of data centers.
What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...
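
The answer above notes that sequential tasks need an external scheduler or scripts. As a rough sketch of the script approach only, the following runs dependent jobs in order using Python's standard `graphlib` module. The job names and the `run_in_order` helper are hypothetical, and a real pipeline would launch Spark jobs where this example simply records the name.

```python
from graphlib import TopologicalSorter

# Hypothetical job names; each maps to the set of jobs that must finish first.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

def run_in_order(deps, run_job):
    """Run each job only after all of its prerequisites have run."""
    order = list(TopologicalSorter(deps).static_order())
    for name in order:
        run_job(name)  # placeholder for submitting a real job
    return order

executed = []
order = run_in_order(dependencies, executed.append)
print(order)
```

A dedicated scheduler adds retries, monitoring, and timed runs on top of this ordering, which a plain script does not provide.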
 

Also Known As

Amazon EC2 Auto Scaling: AWS RAM
Apache Spark: No data available
 


Sample Customers

Amazon EC2 Auto Scaling: Expedia, Intuit, Royal Dutch Shell, Brooks Brothers
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Find out what your peers are saying about Amazon EC2 Auto Scaling vs. Apache Spark and other solutions. Updated: January 2025.