
AWS Lambda vs Apache Spark comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)

AWS Lambda
Ranking in Compute Service: 1st
Average Rating: 8.4
Reviews Sentiment: 7.5
Number of Reviews: 83
Ranking in other categories: None
 

Mindshare comparison

As of April 2025, in the Compute Service category, the mindshare of Apache Spark is 11.2%, up from 9.7% the previous year. The mindshare of AWS Lambda is 21.0%, down from 23.2% the previous year. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on each run, whether or not anything had changed, and the tables hold a lot of data. The latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
Wai L Lin O - PeerSpot reviewer
A serverless solution with easy integration features
We use AWS Lambda because it provides a solution for our needs without requiring us to manage our infrastructure. With the tool, we only pay for the resources we use. Additionally, it is straightforward to implement and integrates with other services like API Gateway. The tool's serverless nature has had the most significant impact on our workflow. I find it particularly attractive because it eliminates the need for managing servers. In my previous experience, managing upgrades and updates was quite challenging. The solution's integration process with other AWS services was relatively easy. We primarily use AWS services such as EventBridge for scheduling processes and log management.
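
To make the integration described in this review concrete, here is a minimal sketch of a single Python handler that serves both an API Gateway request and an EventBridge scheduled event. It is an illustration only, not the reviewer's code: the `name` query parameter and the response fields are hypothetical, and it assumes API Gateway is configured with proxy integration.

```python
import json


def lambda_handler(event, context):
    """Handle an EventBridge scheduled event or an API Gateway proxy request."""
    # EventBridge scheduled rules deliver events whose "source" is "aws.events".
    if event.get("source") == "aws.events":
        # Hypothetical scheduled job: report that the run happened.
        return {"status": "scheduled-run-complete", "time": event.get("time")}

    # Otherwise assume an API Gateway proxy request (an assumption of this sketch).
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # "name" is a hypothetical parameter

    # Proxy integration expects statusCode, headers, and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```

Because the handler is a plain function, it can be exercised locally by passing a dictionary shaped like either event type, with no servers to manage on either side.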

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"AI libraries are the most valuable. They provide extensibility and usability. Spark has a lot of connectors, which is a very important and useful feature for AI. You need to connect a lot of points for AI, and you have to get data from those systems. Connectors are very wide in Spark. With a Spark cluster, you can get fast results, especially for AI."
"Spark helps us reduce startup time for our customers and gives a very high ROI in the medium term."
"The most valuable feature of Apache Spark is its ease of use."
"Spark is used for transformations from large volumes of data, and it is usefully distributed."
"I like Apache Spark's flexibility the most. Before, we had one server that would choke up. With the solution, we can easily add more nodes when needed. The machine learning models are also really helpful. We use them to predict energy theft and find infrastructure problems."
"The most valuable feature of Apache Spark is its flexibility."
"We use Spark to process data from different data sources."
"With Spark, we parallelize our operations, efficiently accessing both historical and real-time data."
"The utilization of containers is particularly beneficial in overcoming the size limitations imposed on Lambda functions which not only allows us to work around these constraints but also contributes to the improvement and maintenance of our code."
"It's a fairly easy solution to learn."
"By using Lambda, we can use Python code and the Boto3 solution."
"You can spin up anything instantly without any investment."
"The main features of this solution are the ability to integrate multiple AWS applications or external applications very quickly and organize all of them. Additionally, it is easy to use and you can run various programming languages, such as Python, Go, and Java."
"The most valuable feature of this solution is the API Gateway."
"AWS Lambda's best features are log analysis and event triggering and actioning."
"It enables the launch of thousands of instances simultaneously."
 

Cons

"At times during the deployment process, the tool goes down, making it look less robust. To take care of the issues in the deployment process, users need to do manual interventions occasionally."
"When using Spark, users may need to write their own parallelization logic, which requires additional effort and expertise."
"In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."
"When you want to extract data from your HDFS and other sources then it is kind of tricky because you have to connect with those sources."
"There could be enhancements in optimization techniques, as there are some limitations in this area that could be addressed to further refine Spark's performance."
"It's not easy to install."
"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
"Stream processing needs to be developed more in Spark. I have used Flink previously. Flink is better than Spark at stream processing."
"We need to better understand Lambda for different scenarios. We need some joint effort between Amazon and the users to have the users identify how they can really leverage Lambda. It's not about Lambda itself; it's about the practice, the guidance. There needs to be very good documentation. From the user perspective, what exists now is not always enough."
"I would like the layers to have a bigger volume. I would like to be able to add more. I don't want to be limited by the layer."
"The product could make the process of integration easier."
"There's room for improvement in the testing setup."
"The metrics and reporting for this solution could be improved."
"Lambda functions have cold-starts that can cause some delay."
"The runtime for the solution can be improved."
"The first time Lambda is started up, it takes some time to spin up an instance for serving the consumer requests. AWS has been trying to solve this in a variety of ways but has not yet managed to do so."
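
The cold-start complaints above can be illustrated with a small Python sketch. Code at module level runs once when a new execution environment starts, while the handler runs on every invocation, so expensive setup is commonly placed at module level and reused by later "warm" calls. This does not remove the first-call delay; it only avoids repeating it. The `_EXPENSIVE_CLIENT` object is a hypothetical stand-in for something like an SDK client or database connection.

```python
# Module-level code runs once per execution environment (the cold start).
# _EXPENSIVE_CLIENT is a hypothetical stand-in for a slow-to-create resource.
_EXPENSIVE_CLIENT = {"connected": True}
_invocations = 0


def lambda_handler(event, context):
    """Report whether this call is the first one in this environment."""
    global _invocations
    _invocations += 1
    return {
        # Only the first invocation in an environment pays the setup cost.
        "cold_start": _invocations == 1,
        # Later invocations reuse the object created at import time.
        "client_reused": _EXPENSIVE_CLIENT["connected"],
    }
```

Calling the handler twice in the same process shows the pattern: the first call reports a cold start and the second does not.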
 

Pricing and Cost Advice

"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"Spark is an open-source solution, so there are no licensing costs."
"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"Licensing costs can vary. For instance, when purchasing a virtual machine, you're asked if you want to take advantage of the hybrid benefit or if you prefer the license costs to be included upfront by the cloud service provider, such as Azure. If you choose the hybrid benefit, it indicates you already possess a license for the operating system and wish to avoid additional charges for that specific VM in Azure. This approach allows for a reduction in licensing costs, charging only for the service and associated resources."
"Apache Spark is an expensive solution."
"We are using the free version of the solution."
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"It computes by the cycle, and it's very cheap."
"AWS is slightly more expensive than Azure."
"AWS Lambda cost is pretty decent."
"The price is expensive and is based on usage. The more users you have the higher the cost."
"This is a product that is pay-per-use, as opposed to a licensing fee."
"I think the price is okay. However, if they add more functionality, they can have better prices. In fact, they should have better and more flexible packages for clients who have greater consumption of Lambda."
"AWS Lambda is inexpensive."
"Price-wise, AWS Lambda is very cheap. It's not free, but it's not that expensive."
844,944 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 28%
Computer Software Company: 13%
Manufacturing Company: 8%
Comms Service Provider: 5%

Educational Organization: 69%
Financial Services Firm: 7%
Computer Software Company: 4%
Manufacturing Company: 3%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is no setup process to deal with, as the entire solution is in the cloud. If you use...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
AWS Lambda is cheaper compared to running an instance continuously. You only pay for what you use, making it cost-effective.
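
The pay-per-use claim in the last answer can be checked with simple arithmetic. The sketch below assumes a billing model with a charge per GB-second of compute plus a fee per million requests, and compares it against a flat monthly cost for an always-on instance. All function names are hypothetical and no real prices are built in; substitute current figures from the provider's pricing page.

```python
def monthly_lambda_cost(requests, avg_seconds, memory_gb,
                        price_per_gb_second, price_per_million_requests):
    """Estimate a monthly bill under a pay-per-use model (prices are inputs)."""
    compute = requests * avg_seconds * memory_gb * price_per_gb_second
    request_fee = requests / 1_000_000 * price_per_million_requests
    return compute + request_fee


def break_even_requests(instance_monthly_cost, avg_seconds, memory_gb,
                        price_per_gb_second, price_per_million_requests):
    """Monthly request volume at which pay-per-use equals a flat instance cost."""
    per_request = (avg_seconds * memory_gb * price_per_gb_second
                   + price_per_million_requests / 1_000_000)
    return instance_monthly_cost / per_request
```

Below the break-even volume, pay-per-use is the cheaper option; above it, a continuously running instance may cost less, which is consistent with the mixed pricing opinions quoted earlier.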
 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Lambda: Netflix
Find out what your peers are saying about AWS Lambda vs. Apache Spark and other solutions. Updated: March 2025.