AWS Lambda vs Apache Spark comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)

AWS Lambda
Ranking in Compute Service: 1st
Average Rating: 8.4
Reviews Sentiment: 7.5
Number of Reviews: 83
Ranking in other categories: None
 

Mindshare comparison

As of March 2025, in the Compute Service category, the mindshare of Apache Spark is 11.3%, up from 9.2% the previous year. The mindshare of AWS Lambda is 21.0%, down from 24.9% the previous year. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions, and there is a lot of functionality. Each hour, our pipeline copies changes from MongoDB to a specific file. Previously, the pipeline copied all of the data on every run, including tables that had not changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
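The saving described in this review comes from copying only changed records on each hourly run rather than every table. The following is a minimal sketch of that idea in plain Python, not the reviewer's pipeline: it assumes each record carries a hypothetical `updated_at` timestamp and that the pipeline stores a watermark from the previous run. It does not use the MongoDB or Spark APIs.

```python
from datetime import datetime, timezone

def select_changed(records, last_run):
    """Return only the records modified after the previous pipeline run."""
    return [r for r in records if r["updated_at"] > last_run]

def run_incremental_copy(records, last_run):
    """Select the changed records and compute the watermark for the next run."""
    changed = select_changed(records, last_run)
    # The new watermark is the newest change seen, or the old one if nothing changed.
    new_watermark = max((r["updated_at"] for r in changed), default=last_run)
    return changed, new_watermark

# Hypothetical sample data standing in for a MongoDB collection.
records = [
    {"_id": 1, "updated_at": datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)},
    {"_id": 2, "updated_at": datetime(2025, 3, 1, 10, 30, tzinfo=timezone.utc)},
    {"_id": 3, "updated_at": datetime(2025, 3, 1, 11, 15, tzinfo=timezone.utc)},
]
last_run = datetime(2025, 3, 1, 10, 0, tzinfo=timezone.utc)
changed, watermark = run_incremental_copy(records, last_run)
```

Running the copy again with the returned watermark selects nothing until new changes arrive, which is where the reduction in processed data would come from.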
Wai L Lin O - PeerSpot reviewer
A serverless solution with easy integration features
We use AWS Lambda because it provides a solution for our needs without requiring us to manage our infrastructure. With the tool, we only pay for the resources we use. Additionally, it is straightforward to implement and integrates with other services like API Gateway. The tool's serverless nature has had the most significant impact on our workflow. I find it particularly attractive because it eliminates the need for managing servers. In my previous experience, managing upgrades and updates was quite challenging. The solution's integration process with other AWS services was relatively easy. We primarily use AWS services such as EventBridge for scheduling processes and log management.
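To make the API Gateway integration mentioned in this review more concrete, here is a minimal sketch of a Python handler that answers an API Gateway proxy request. The greeting logic and the `name` query parameter are hypothetical; only the `(event, context)` handler signature and the `statusCode`/`headers`/`body` response shape follow the proxy integration convention.

```python
import json

def lambda_handler(event, context):
    """Handle an API Gateway proxy event and return a JSON greeting."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Local invocation with a hand-built event; Lambda supplies a real context object.
response = lambda_handler({"queryStringParameters": {"name": "PeerSpot"}}, None)
```

Because the handler is an ordinary function, it can be exercised locally with a dictionary before it is deployed.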

Quotes from Members


Pros

"Apache Spark provides a very high-quality implementation of distributed data processing."
"Apache Spark is known for its ease of use. Compared to other available data processing frameworks, it is user-friendly."
"With Hadoop-related technologies, we can distribute the workload with multiple commodity hardware."
"Provides a lot of good documentation compared to other solutions."
"The good performance. The nice graphical management console. The long list of ML algorithms."
"The distribution of tasks, like the seamless map-reduce functionality, is quite impressive."
"One of the key features is that Apache Spark is a distributed computing framework. You can have multiple slaves and distribute the workload between them."
"The solution has been very stable."
"Because AWS Lambda is serverless, server configuration is not required, and we can run it directly anywhere."
"I think the most valuable feature is the agility of the solution."
"AWS Lambda has improved our productivity and functionality."
"The ease and speed of developing the services using any available language is the most valuable feature."
"The solution integrates well with API gateways and S3 events via its AWS ecosystem."
"I like the pay-for-what-you-use feature. This is the main reason why we use AWS Lambda. I don't have to manage servers; I just have to configure Lambda and expose it to an API gateway."
"The main features of this solution are the ability to integrate multiple AWS applications or external applications very quickly and organize all of them. Additionally, it is easy to use and you can run various programming languages, such as Python, Go, and Java."
"Lambda has improved our organization by making it possible to transform data."
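Several of the Spark quotes above praise its map-reduce style distribution of tasks. As a rough illustration of that pattern only, the sketch below counts words in plain Python: each partition is mapped to partial counts, and the partial counts are then merged. It runs in a single process and does not use the Spark API; in Spark, the partitions would be processed on separate workers. The sample sentences are made up.

```python
from collections import Counter
from functools import reduce

def map_partition(lines):
    """Map step: count words within one partition of the data."""
    return Counter(word for line in lines for word in line.lower().split())

def merge_counts(left, right):
    """Reduce step: combine the partial counts from two partitions."""
    return left + right

# Hypothetical input, already split into two partitions.
partitions = [
    ["spark distributes work", "across many workers"],
    ["each worker counts words", "spark merges the counts"],
]
partial_counts = [map_partition(p) for p in partitions]
total = reduce(merge_counts, partial_counts, Counter())
```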
 

Cons

"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
"For improvement, I think the tool could make things easier for people who aren't very technical. There's a significant learning curve, and I've seen organizations give up because of it. Making it quicker or easier for non-technical people would be beneficial."
"Needs to provide an internal schedule to schedule spark jobs with monitoring capability."
"There were some problems related to the product's compatibility with a few Python libraries."
"When you are working with large, complex tasks, the garbage collection process is slow and affects performance."
"The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive."
"Apache Spark can improve the use case scenarios from the website. There is not any information on how you can use the solution across the relational databases toward multiple databases."
"Spark could be improved by adding support for other open-source storage layers than Delta Lake."
"Having a better preview would be helpful."
"The user-friendliness of the solution could be improved."
"What could be improved in AWS Lambda is a tricky question because I base the area for improvement on a specific matrix, for example, latency, so I'm still determining if I can be the judge on that. However, room for improvement could be when you're using AWS Lambda as a backend, it can be challenging to use it for monitoring. Monitoring is critical in development, and I don't have much expertise in the area, but you can use other services such as Xray. I found that monitoring on AWS Lambda is a challenge. The tool needs better monitoring. Another area for improvement in AWS Lambda is the cold start, where it takes some time to invoke a function the first time, but after that, invoking it becomes swift. Still, there's room for improvement in that AWS Lambda process. In the next release of AWS Lambda, I'd like AWS to improve monitoring so that I can monitor codes better."
"We face some problems with the event-driven execution model."
"Memory limitation is one of the weaknesses of AWS Lambda and as a result, we have to use several Lambda, instead of just one. Recently, I met with an Amazon employee, who is responsible for Lambda as a product. It appears Amazon has some plans with Lambda, so I don’t have to add something to the additional features."
"It could be cheaper."
"AWS Lambda is a bit difficult to set up if someone doesn't know how to code."
"AWS Lambda needs to improve its stability."
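One reviewer above mentions the cold start: the first invocation is slower, and later ones are fast. A common way to limit the cost is to do expensive setup at module level so that warm invocations reuse it. The sketch below illustrates this under stated assumptions: `_CONFIG` is a hypothetical stand-in for real setup such as creating clients, and the invocation counter exists only to make the reuse visible.

```python
# Module-level code runs once per execution environment, during the cold start.
_CONFIG = {"table": "example-table"}  # hypothetical stand-in for expensive setup
_INVOCATIONS = 0

def lambda_handler(event, context):
    """Reuse the module-level state on every warm invocation."""
    global _INVOCATIONS
    _INVOCATIONS += 1
    return {
        "cold_start": _INVOCATIONS == 1,  # True only for the first call
        "table": _CONFIG["table"],
    }

# Two local calls in the same process mimic a cold and then a warm invocation.
first = lambda_handler({}, None)
second = lambda_handler({}, None)
```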
 

Pricing and Cost Advice

"We are using the free version of the solution."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"Since we are using the Apache Spark version, not the Databricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
"Apache Spark is an open-source tool."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"Spark is an open-source solution, so there are no licensing costs."
"AWS Lambda is a very inexpensive solution. They charge for the number of times we run it. If you run AWS Lambda for one time, they charge around 50 cents or 25 cents for the use. I don't know the exact price, but it's less than a dollar."
"Lambda is an affordable solution. They offer free requests every month and charge per the compute time. If you are working in a big organization, usually AWS offer a savings plan where you get approximately 70% discount on pricing."
"The fees are volume-based."
"AWS Lambda is cheap."
"Its pricing is on the higher side."
"AWS Lambda cost is pretty decent."
"I would rate the tool’s pricing a nine out of ten. The solution’s pricing works on a pay-as-you-go basis."
"It computes by the cycle, and it's very cheap."
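The Lambda quotes above describe a model with a monthly free allowance, a charge per request, and a charge for compute time. The sketch below shows how such a bill could be estimated. All rates and free-tier figures are assumptions supplied as default parameters for illustration; actual prices vary by region and architecture and should be checked against current AWS pricing.

```python
def estimate_monthly_cost(
    requests,
    avg_duration_ms,
    memory_mb,
    price_per_million_requests=0.20,   # assumed rate, USD
    price_per_gb_second=0.0000166667,  # assumed rate, USD
    free_requests=1_000_000,           # assumed monthly free allowance
    free_gb_seconds=400_000,           # assumed monthly free allowance
):
    """Estimate a monthly bill from request count, duration, and memory size."""
    # Compute usage is measured as memory (GB) multiplied by run time (seconds).
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    billable_requests = max(requests - free_requests, 0)
    billable_gb_seconds = max(gb_seconds - free_gb_seconds, 0)
    request_cost = billable_requests / 1_000_000 * price_per_million_requests
    compute_cost = billable_gb_seconds * price_per_gb_second
    return round(request_cost + compute_cost, 2)

# Hypothetical workloads: a small service and a much busier one.
small = estimate_monthly_cost(requests=500_000, avg_duration_ms=200, memory_mb=128)
large = estimate_monthly_cost(requests=10_000_000, avg_duration_ms=500, memory_mb=1024)
```

Under these assumed rates, a workload that stays inside the free allowance costs nothing, which is consistent with reviewers calling the service cheap, while a heavy workload can explain why others find the pricing high.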
841,004 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
Comms Service Provider: 5%

Educational Organization: 68%
Financial Services Firm: 8%
Computer Software Company: 4%
Manufacturing Company: 2%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is no setup process to deal with, as the entire solution is in the cloud. If you use...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
AWS Lambda is cheaper compared to running an instance continuously. You only pay for what you use, making it cost-effective.
 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Lambda: Netflix
Find out what your peers are saying about AWS Lambda vs. Apache Spark and other solutions. Updated: January 2025.