AWS Lambda vs Apache Spark comparison

 

Comparison Buyer's Guide


Categories and Ranking

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)

AWS Lambda
Ranking in Compute Service: 1st
Average Rating: 8.4
Reviews Sentiment: 7.5
Number of Reviews: 83
Ranking in other categories: None
 

Mindshare comparison

As of April 2025, Apache Spark's mindshare in the Compute Service category is 11.2%, up from 9.7% a year earlier. AWS Lambda's mindshare is 21.0%, down from 23.2% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies changes from MongoDB to a specific file. Previously, every run copied all of the data from all of the tables, whether or not anything had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
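
The saving described in this review comes from copying only changed records instead of every table on every run. A minimal sketch of that idea in plain Python follows. It uses a simple timestamp watermark rather than any MongoDB-specific feature, and the record layout, the `updated_at` field, and the function name `select_changed` are hypothetical, not taken from the reviewer's pipeline.

```python
from datetime import datetime, timezone

def select_changed(records, last_run):
    """Return only records modified after the previous pipeline run.

    Assumes each record is a dict with a hypothetical `updated_at` datetime.
    """
    return [r for r in records if r["updated_at"] > last_run]

# Hypothetical sample data standing in for one source table.
records = [
    {"_id": 1, "updated_at": datetime(2025, 4, 1, 9, 0, tzinfo=timezone.utc)},
    {"_id": 2, "updated_at": datetime(2025, 4, 1, 10, 30, tzinfo=timezone.utc)},
    {"_id": 3, "updated_at": datetime(2025, 4, 1, 10, 45, tzinfo=timezone.utc)},
]

# Watermark: the time the previous hourly run finished.
last_run = datetime(2025, 4, 1, 10, 0, tzinfo=timezone.utc)

changed = select_changed(records, last_run)
print([r["_id"] for r in changed])
```

With a watermark like this, the work per run scales with the number of changes rather than with total table size, which is where the cluster savings would come from.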
Wai L Lin O - PeerSpot reviewer
A serverless solution with easy integration features
We use AWS Lambda because it provides a solution for our needs without requiring us to manage our infrastructure. With the tool, we only pay for the resources we use. Additionally, it is straightforward to implement and integrates with other services like API Gateway. The tool's serverless nature has had the most significant impact on our workflow. I find it particularly attractive because it eliminates the need for managing servers. In my previous experience, managing upgrades and updates was quite challenging. The solution's integration process with other AWS services was relatively easy. We primarily use AWS services such as EventBridge for scheduling processes and log management.
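
To make the integration described in this review concrete, here is a minimal sketch of a Python Lambda handler that could sit behind API Gateway (proxy integration) and also be triggered by an EventBridge schedule. The greeting logic, the `name` query parameter, and the return value for scheduled runs are illustrative assumptions, not the reviewer's code. The handler is invoked locally with hand-written sample events, so no AWS account is needed to try it.

```python
import json

def lambda_handler(event, context):
    """Handle either a scheduled EventBridge event or an API Gateway request."""
    # EventBridge scheduled rules deliver events whose source is "aws.events".
    if event.get("source") == "aws.events":
        return {"status": "scheduled job ran"}  # hypothetical result payload

    # Otherwise treat the event as an API Gateway proxy request.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical query parameter
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# Local invocation with simplified sample events.
api_response = lambda_handler({"queryStringParameters": {"name": "Spark"}}, None)
scheduled_response = lambda_handler(
    {"source": "aws.events", "detail-type": "Scheduled Event"}, None
)
print(api_response["statusCode"], scheduled_response)
```

There is no server or process lifecycle code here. The function only maps an event to a response, which is the part of the serverless model the reviewer highlights.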

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

Apache Spark
"The product's deployment phase is easy."
"Apache Spark provides a very high-quality implementation of distributed data processing."
"The fault tolerant feature is provided."
"Provides a lot of good documentation compared to other solutions."
"The solution has been very stable."
"It's easy to prepare parallelism in Spark, run the solution with specific parameters, and get good performance."
"With Hadoop-related technologies, we can distribute the workload with multiple commodity hardware."
"ETL and streaming capabilities."

AWS Lambda
"The support from AWS Lambda is very good, they are responsive."
"Thanks to this solution, we do not need to worry about hardware or resource utilization. It saves us time."
"I like that it's easy to use and maintain. Lambda is good and supports different platforms, so you don't need to worry about language or maintenance."
"I like the pay-for-what-you-use feature. This is the main reason why we use AWS Lambda. I don't have to manage servers; I just have to configure Lambda and expose it to an API gateway."
"The ability to scale up and down very quickly helps because we can maintain our system performance and business at a low cost."
"Provides a good, easy path from when you're using an AWS cluster."
"Some of the most valuable features are that it's easy to install and use. The performance is also good."
"The programming language and the integration with other AWS services are the most valuable features."
 

Cons

Apache Spark
"Stream processing needs to be developed more in Spark. I have used Flink previously. Flink is better than Spark at stream processing."
"The solution needs to optimize shuffling between workers."
"In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."
"The graphical user interface (UI) could be a bit more clear. It's very hard to figure out the execution logs and understand how long it takes to send everything. If an execution is lost, it's not so easy to understand why or where it went. I have to manually drill down on the data processes which takes a lot of time. Maybe there could be like a metrics monitor, or maybe the whole log analysis could be improved to make it easier to understand and navigate."
"The main concern is the overhead of Java when distributed processing is not necessary."
"Technical expertise from an engineer is required to deploy and run high-tech tools, like Informatica, on Apache Spark, making it an area where improvements are required to make the process easier for users."
"It's not easy to install."
"It should support more programming languages."

AWS Lambda
"Its price should be improved. Its pricing is on the higher side. I am not sure if it currently supports the Go language. If it doesn't support the Go language, they can introduce it."
"The price in general could always be better."
"The tool changes its UI every month which is very frustrating for me. I don’t know why AWS keeps changing the UI. They can’t stick to a specific one."
"We need to better understand Lambda for different scenarios. We need some joint effort between Amazon and the users to have the users identify how they can really leverage Lambda. It's not about Lambda itself; it's about the practice, the guidance. There needs to be very good documentation. From the user perspective, what exists now is not always enough."
"It currently requires manual user maintenance to upgrade and evaluate, and an automated provision for this would be beneficial."
"Amazon doesn't have enough local support based in our country."
"AWS Lambda has some size limitations in the code line, you can't do a couple of functions to do the job."
"AWS Lambda can improve its file system-based sharing capabilities and restrictions."
 

Pricing and Cost Advice

Apache Spark
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"It is an open-source platform. We do not pay for its subscription."
"It is an open-source solution, it is free of charge."
"The product is expensive, considering the setup."
"Apache Spark is an expensive solution."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"Since we are using the Apache Spark version, not the Databricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
"Apache Spark is an open-source tool."

AWS Lambda
"The solution is free of cost for the first year, and after that, it becomes expensive."
"The price is expensive and is based on usage. The more users you have the higher the cost."
"The solution follows a pay-as-you-go licensing model, which results in cost savings."
"For licensing, we pay a yearly subscription."
"The solution's price is average."
"AWS Lambda is cost-effective, with a minimal maintenance cost."
"AWS Lambda is a very inexpensive solution. They charge for the number of times we run it. If you run AWS Lambda for one time, they charge around 50 cents or 25 cents for the use. I don't know the exact price, but it's less than a dollar."
"We only need to pay for the compute time our code consumes."
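
Several of the AWS Lambda comments above describe usage-based billing without exact figures. The sketch below shows how such a bill is typically composed: a per-request charge plus a charge for compute time measured in GB-seconds. The default rates are assumptions for illustration only (they vary by region and architecture and change over time), the free tier is ignored, and the function name `estimate_lambda_cost` is hypothetical.

```python
def estimate_lambda_cost(invocations, avg_duration_ms, memory_mb,
                         price_per_million_requests=0.20,
                         price_per_gb_second=0.0000166667):
    """Estimate usage cost in USD, ignoring any free tier.

    The default rates are illustrative assumptions; check current AWS pricing.
    """
    # Charge for the number of invocations.
    request_cost = invocations / 1_000_000 * price_per_million_requests
    # Compute usage: seconds of execution scaled by configured memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost

# One million invocations of a 100 ms function configured with 128 MB.
monthly = estimate_lambda_cost(invocations=1_000_000, avg_duration_ms=100, memory_mb=128)
print(f"Estimated cost: ${monthly:.2f}")
```

Under these assumed rates, a single short invocation costs a small fraction of a cent, so the "50 cents or 25 cents" per run mentioned in one comment should be read as the reviewer's rough guess rather than a published price.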
845,040 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Apache Spark
Financial Services Firm: 28%
Computer Software Company: 13%
Manufacturing Company: 8%
Comms Service Provider: 5%

AWS Lambda
Educational Organization: 69%
Financial Services Firm: 7%
Computer Software Company: 4%
Manufacturing Company: 3%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is no setup process to deal with, as the entire solution is in the cloud. If you use...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
AWS Lambda is cheaper compared to running an instance continuously. You only pay for what you use, making it cost-effective.
 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Lambda: Netflix
Find out what your peers are saying about AWS Lambda vs. Apache Spark and other solutions. Updated: March 2025.