
AWS Lambda vs Apache Spark comparison

 


Categories and Ranking

Apache Spark
Ranking in Compute Service: 3rd
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)

AWS Lambda
Ranking in Compute Service: 1st
Average Rating: 8.4
Reviews Sentiment: 7.5
Number of Reviews: 83
Ranking in other categories: No ranking in other categories
 

Mindshare comparison

As of February 2025, Apache Spark holds an 11.3% mindshare in the Compute Service category, up from 8.5% a year earlier. AWS Lambda holds 20.8%, down from 26.3% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on every run, whether or not anything had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
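The saving described in this review comes from copying only changed records instead of every record on every run. The reviewer does not say which mechanism the pipeline uses, so the following is only a minimal sketch of the general idea, assuming a timestamp watermark. The `updated_at` field, the in-memory `records` list, and both function names are hypothetical stand-ins, not MongoDB or Spark APIs.

```python
from datetime import datetime, timezone


def full_copy(records):
    """Copy every record on every run, changed or not."""
    return list(records)


def incremental_copy(records, last_run):
    """Copy only records modified after the previous run's watermark."""
    return [r for r in records if r["updated_at"] > last_run]


# Hypothetical in-memory stand-in for a source collection.
records = [
    {"_id": 1, "updated_at": datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)},
    {"_id": 2, "updated_at": datetime(2025, 1, 1, 10, 30, tzinfo=timezone.utc)},
    {"_id": 3, "updated_at": datetime(2025, 1, 1, 10, 45, tzinfo=timezone.utc)},
]

# Watermark: when the previous hourly run finished (assumed value).
last_run = datetime(2025, 1, 1, 10, 0, tzinfo=timezone.utc)

changed = incremental_copy(records, last_run)
print(f"full copy: {len(full_copy(records))} records, incremental: {len(changed)}")
```

On large tables where few rows change per hour, the incremental path moves a small fraction of the data, which is consistent with the smaller cluster the reviewer mentions.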
Wai L Lin O - PeerSpot reviewer
A serverless solution with easy integration features
We use AWS Lambda because it provides a solution for our needs without requiring us to manage our infrastructure. With the tool, we only pay for the resources we use. Additionally, it is straightforward to implement and integrates with other services like API Gateway. The tool's serverless nature has had the most significant impact on our workflow. I find it particularly attractive because it eliminates the need for managing servers. In my previous experience, managing upgrades and updates was quite challenging. The solution's integration process with other AWS services was relatively easy. We primarily use AWS services such as EventBridge for scheduling processes and log management.
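To illustrate the kind of integration this review mentions, here is a minimal sketch of a Python Lambda function behind an API Gateway proxy integration. The `(event, context)` handler signature and the `statusCode`/`headers`/`body` response shape follow the usual Lambda proxy convention. The greeting logic and the `name` query parameter are made up for illustration and do not come from the reviewer's setup.

```python
import json


def lambda_handler(event, context):
    """Handle an API Gateway proxy event and return a JSON response."""
    # The query string map can be absent or null when no parameters are sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test with a hand-built event; no AWS account is needed.
    fake_event = {"queryStringParameters": {"name": "PeerSpot"}}
    print(lambda_handler(fake_event, None))
```

There is no server or web framework in the file: the function is the whole deployable unit, which is the "no infrastructure to manage" point the reviewer makes.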

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"Apache Spark is known for its ease of use. Compared to other available data processing frameworks, it is user-friendly."
"The good performance. The nice graphical management console. The long list of ML algorithms."
"The features we find most valuable are the machine learning, data learning, and Spark Analytics."
"The product's initial setup phase was easy."
"With Hadoop-related technologies, we can distribute the workload with multiple commodity hardware."
"The processing time is very much improved over the data warehouse solution that we were using."
"Now, when we're tackling sentiment analysis using NLP technologies, we deal with unstructured data—customer chats, feedback on promotions or demos, and even media like images, audio, and video files. For processing such data, we rely on PySpark. Beneath the surface, Spark functions as a compute engine with in-memory processing capabilities, enhancing performance through features like broadcasting and caching. It's become a crucial tool, widely adopted by 90% of companies for a decade or more."
"There's a lot of functionality."
"You can spin up anything instantly without any investment."
"It enables the launch of thousands of instances simultaneously,"
"The most valuable feature of AWS Lambda, from a conceptual point, is its functions. For example, it's mathematical templates into which you can write, and create your solution. You write small pieces of a solution under given parameters."
"We moved our users into the Amazon Cognito pool, so it helps us to standardize our security practices, approaches, etc. We can customize Lambda for authentication to integrate it with API Gateway and other services."
"The cool thing about AWS Lambda is that AWS does all the management. For compression, it is all about making the data small and then making it regular size again. We have an encode function and a decode function. AWS Lambda schedules each of those for us. It has a load balancer and all the fancy stuff, depending on the demand. The most valuable part of AWS Lambda is that I only need to write the software. I need to write two functions, and my cloud developer turns them into two AWS Lambda instances. That's it."
"It's a fairly easy solution to learn."
"It is serverless and scalable. It can scale infinitely. You don't have to worry about the size of the servers that you're pre-allocating. You don't have to build server scale-out models. Auto scale and other similar features are just inherent in Lambda. So, for atomic and fairly non-persistent transactional units of work, Lambda works very well."
"The serverless computing feature eliminates the need to manage servers, provision, or scale."
 

Cons

"Apache Spark lacks geospatial data."
"The logging for the observability platform could be better."
"Apache Spark is very difficult to use. It would require a data engineer. It is not available for every engineer today because they need to understand the different concepts of Spark, which is very, very difficult and it is not easy to learn."
"The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive."
"The migration of data between different versions could be improved."
"If you have a Spark session in the background, sometimes it's very hard to kill these sessions because of D allocation."
"The setup I worked on was really complex."
"Apache Spark should add some resource management improvements to the algorithms."
"There's room for improvement in the testing setup."
"The metrics and reporting for this solution could be improved."
"AWS Lambda is a bit difficult to set up if someone doesn't know how to code."
"It would be ideal if we could use the solution across different platforms."
"We need to better understand Lambda for different scenarios. We need some joint effort between Amazon and the users to have the users identify how they can really leverage Lambda. It's not about Lambda itself; it's about the practice, the guidance. There needs to be very good documentation. From the user perspective, what exists now is not always enough."
"The feature to attach external storage, such as an S3 or elastic storage, must be added to AWS Lambda. This is its area for improvement."
"The first time Lambda is started up, it takes some time to spin up an instance for serving the consumer requests. AWS has been trying to solve this in a variety of ways but have not yet managed to do so."
"It could be cheaper."
 

Pricing and Cost Advice

"On the cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
"Since we are using the Apache Spark version, not the data bricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"It is an open-source solution, it is free of charge."
"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"Apache Spark is an expensive solution."
"The pricing varies based on the specific solution you're implementing, and in comparison to the value it provides, the overall cost is reasonable."
"AWS is slightly more expensive than Azure."
"AWS Lambda is not expensive for micro testing but is expensive if used for long deployment or long services."
"We only need to pay for the compute time our code consumes."
"AWS Lambda is a very inexpensive solution. They charge for the number of times we run it. If you run AWS Lambda for one time, they charge around 50 cents or 25 cents for the use. I don't know the exact price, but it's less than a dollar."
"It costs maybe less than $10 per month in my use case."
"You're not paying for a server if you're not using it, which is another reason I like it. So, you're not paying if you're not using it. It scales, and you're charged based on usage. It all depends on the use case. Some can be extremely inexpensive if you have very low volume transaction rates. That way, you don't have to fire up and absorb the cost of the servers just sitting there waiting for a transaction to come through. You're only paying when you use it. So, depending upon the use model, Lambda could be highly efficient relative to an EC2 solution. You don't have to have things reallocated."
"The price is expensive and is based on usage. The more users you have the higher the cost."
837,501 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
Comms Service Provider: 5%

Educational Organization: 67%
Financial Services Firm: 8%
Computer Software Company: 4%
Manufacturing Company: 3%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is no setup process to deal with, as the entire solution is in the cloud. If you use...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
AWS Lambda is cheaper compared to running an instance continuously. You only pay for what you use, making it cost-effective.
 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Lambda: Netflix
Find out what your peers are saying about AWS Lambda vs. Apache Spark and other solutions. Updated: January 2025.