Amazon EC2 Auto Scaling vs Apache Spark comparison

 

Comparison Buyer's Guide

Categories and Ranking

Amazon EC2 Auto Scaling
Ranking in Compute Service: 2nd
Average Rating: 8.8
Reviews Sentiment: 8.2
Number of Reviews: 43
Ranking in other categories: None

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 64
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)
 

Mindshare comparison

As of November 2024, Amazon EC2 Auto Scaling holds a 12.8% mindshare in the Compute Service category, up from 10.2% a year earlier. Apache Spark holds 11.2%, up from 7.7% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
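As a quick check on what these figures imply, the short Python sketch below works out the year-over-year change for each product, both in percentage points and relative to the earlier value. Only the four percentages come from the text above; the function name `mindshare_change` and the dictionary layout are illustrative assumptions.

```python
# Mindshare figures quoted in the text (percent of the Compute Service category).
mindshare = {
    "Amazon EC2 Auto Scaling": {"previous": 10.2, "current": 12.8},
    "Apache Spark": {"previous": 7.7, "current": 11.2},
}


def mindshare_change(previous, current):
    """Return (percentage-point change, relative change in percent), rounded to 0.1."""
    points = current - previous
    relative = points / previous * 100
    return round(points, 1), round(relative, 1)


# Hypothetical helper output: one (points, relative) pair per product.
changes = {
    name: mindshare_change(values["previous"], values["current"])
    for name, values in mindshare.items()
}

for name, (points, relative) in changes.items():
    print(f"{name}: {points:+.1f} points ({relative:+.1f}% relative)")
```

On these numbers, Apache Spark gained more mindshare in both absolute and relative terms, although Amazon EC2 Auto Scaling still holds the larger share.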
 

Featured Reviews

Poulav Biswas - PeerSpot reviewer
Well-documented setup process and highly stable solution
We have several instances and applications that we run using WordPress. For that, I needed an easy, secure, and faster solution with different options to back up the website and data. Amazon EC2 offers options to back up data using the S3 version control system, which worked really well for us…
SurjitChoudhury - PeerSpot reviewer
Offers batch processing of data and in-memory processing in Spark greatly enhances performance
Spark supports real-time data processing through Spark Streaming. It allows for batch processing of data. If you have immediate data, like chat information, that needs to be processed in real-time, Spark Streaming is used. For data that can be evaluated later, batch processing with Apache Spark is suitable. Mostly, batch processing is utilized in our organization, but for streaming data processing, tools like Kafka are often integrated. In-memory processing in Spark greatly enhances performance, making it a hundred times faster than the previous MapReduce methods. This improvement is achieved through optimization techniques like caching, broadcasting, and partitioning, which help in optimizing queries for faster processing.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

Amazon EC2 Auto Scaling
"Amazon EC2 Auto Scaling has a cool-down time feature called the warmup start."
"It has the best auto-scaling features."
"The solution is highly scalable."
"The integration capabilities are good."
"Auto-scaling is a good feature."
"We use the solution to increase CPU and memory size."
"The initial setup is straightforward."
"With the ability to set up rules based on demand, network, or traffic, the service offers a necessary level of adaptability."

Apache Spark
"The most significant advantage of Spark 3.0 is its support for DataFrame UDF Pandas UDF features."
"One of the key features is that Apache Spark is a distributed computing framework. You can help multiple slaves and distribute the workload between them."
"The product is useful for analytics."
"The product's deployment phase is easy."
"It is highly scalable, allowing you to efficiently work with extensive datasets that might be problematic to handle using traditional tools that are memory-constrained."
"The product's initial setup phase was easy."
"Spark can handle small to huge data and is suitable for any size of company."
"It is useful for handling large amounts of data. It is very useful for scientific purposes."
 

Cons

Amazon EC2 Auto Scaling
"It's an expensive solution."
"The licensing cost is expensive."
"The product's setup is complex for an intermediate user."
"The support to manage the processes could be better."
"The solution's pricing is expensive. You pay based on how much you use it, like paying for the time or hours you use the service. There's no need to buy hardware separately."
"The launch configuration feature doesn't work properly. It needs to improve the load configuration feature along with launch templates. The tool needs to tag feature as well."
"There is room for improvement. You might end up paying a high price if you're not careful and you provision a server that's underutilized."
"Sometimes the configuration is not intuitive."

Apache Spark
"When you are working with large, complex tasks, the garbage collection process is slow and affects performance."
"Stream processing needs to be developed more in Spark. I have used Flink previously. Flink is better than Spark at stream processing."
"Apache Spark could potentially improve in terms of user-friendliness, particularly for individuals with a SQL background. While it's suitable for those with programming knowledge, making it more accessible to those without extensive programming skills could be beneficial."
"This solution currently cannot support or distribute neural network related models, or deep learning related algorithms. We would like this functionality to be developed."
"Needs to provide an internal schedule to schedule spark jobs with monitoring capability."
"It should support more programming languages."
"Apache Spark should add some resource management improvements to the algorithms."
"When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data."
 

Pricing and Cost Advice

Amazon EC2 Auto Scaling
"It's cost-effective."
"Amazon EC2 instances can be very expensive."
"The product is expensive."
"Compared to the performance, the price is quite high. I would rate it a ten because it is expensive. There are additional costs including bandwidth costs, data transfer costs, and load balancing costs."
"The pricing is not fixed and it is based on usage."
"When we want to use more services, we need to pay more. It's a monthly subscription, rather than licensed-based. Pricing or fees for Amazon EC2 Auto Scaling could be improved."
"The product's pricing depends on the traffic and workload."
"The solution is not expensive."

Apache Spark
"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"It is an open-source solution, it is free of charge."
"It is an open-source platform. We do not pay for its subscription."
"The product is expensive, considering the setup."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"The solution is affordable and there are no additional licensing costs."
"Spark is an open-source solution, so there are no licensing costs."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
816,406 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Amazon EC2 Auto Scaling
Financial Services Firm: 23%
Computer Software Company: 16%
University: 8%
Government: 6%

Apache Spark
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 8%
University: 5%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Amazon EC2 Auto Scaling?
The solution removes the need for hardware. We can easily create servers or machines. Just by clicking or specifying our requirements, like memory size or disk space, it's set up for us. The tool e...
What is your experience regarding pricing and costs for Amazon EC2 Auto Scaling?
The pricing of Amazon EC2 Auto Scaling is minimal, varying depending on the region and resources used. The service itself is almost free; however, associated resources may incur costs.
What needs improvement with Amazon EC2 Auto Scaling?
For future improvements, I suggest focusing on cost reduction. Additionally, while the interface is already user-friendly, more updates and enhancements could be beneficial, particularly for custom...
What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The main concern is the overhead of Java when distributed processing is not necessary. In such cases, operations can often be done on one node, making Spark's distributed mode unnecessary. Conseque...
 

Also Known As

Amazon EC2 Auto Scaling: AWS RAM
Apache Spark: No data available
 

Sample Customers

Amazon EC2 Auto Scaling: Expedia, Intuit, Royal Dutch Shell, Brooks Brothers
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Updated: October 2024.