
AWS Lambda vs Apache Spark comparison

 

Comparison Buyer's Guide


Categories and Ranking

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 64
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)

AWS Lambda
Ranking in Compute Service: 1st
Average Rating: 8.4
Reviews Sentiment: 7.5
Number of Reviews: 83
Ranking in other categories: None
 

Mindshare comparison

As of January 2025, Apache Spark holds 11.4% mindshare in the Compute Service category, up from 8.2% a year earlier. AWS Lambda holds 20.4%, down from 27.0% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on each run, whether or not anything had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
Wai L Lin O - PeerSpot reviewer
A serverless solution with easy integration features
We use AWS Lambda because it provides a solution for our needs without requiring us to manage our infrastructure. With the tool, we only pay for the resources we use. Additionally, it is straightforward to implement and integrates with other services like API Gateway. The tool's serverless nature has had the most significant impact on our workflow. I find it particularly attractive because it eliminates the need for managing servers. In my previous experience, managing upgrades and updates was quite challenging. The solution's integration process with other AWS services was relatively easy. We primarily use AWS services such as EventBridge for scheduling processes and log management.
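
As a rough illustration of the serverless model this review describes, a Lambda function is essentially a handler that receives an event and returns a response. The sketch below assumes a Python runtime and an API Gateway proxy-style event; the function name `handler` and the `name` query parameter are hypothetical and are not taken from the reviewer's setup.

```python
import json


def handler(event, context):
    """Hypothetical handler for an API Gateway proxy-style event.

    `event` is assumed to carry an optional "queryStringParameters" dict;
    `context` is unused in this sketch.
    """
    # API Gateway may send null when no query string is present.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")

    # Proxy-style integrations expect a status code and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```

Because the handler is a plain function, it can be called locally with a hand-built event dictionary before it is deployed.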

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"One of the key features is that Apache Spark is a distributed computing framework. You can help multiple slaves and distribute the workload between them."
"It provides a scalable machine learning library."
"DataFrame: Spark SQL gives the leverage to create applications more easily and with less coding effort."
"The product’s most valuable features are lazy evaluation and workload distribution."
"I feel the streaming is its best feature."
"One of Apache Spark's most valuable features is that it supports in-memory processing, the execution of jobs compared to traditional tools is very fast."
"I found the solution stable. We haven't had any problems with it."
"The solution has been very stable."
"They have the built-in IDE, so everything happens without integration issues."
"The most valuable feature is that it scans the cloud system and if they are any security anomalies it triggers an email."
"We have no issues with the technical support."
"The most valuable feature of AWS Lambda, from a conceptual point, is its functions. For example, it's mathematical templates into which you can write, and create your solution. You write small pieces of a solution under given parameters."
"You can spin up anything instantly without any investment."
"The solution works for small applications. It is a serverless tool that is quick to spin up. We needn’t consider anything in the bag."
"The basic feature that I like is that there is no server installation. It also has good support for various languages, such as Java, .NET, C#, and Python."
"The solution is scalable."
 

Cons

"We've had problems using a Python process to try to access something in a large volume of data. It crashes if somebody gives me the wrong code because it cannot handle a large volume of data."
"Its UI can be better. Maintaining the history server is a little cumbersome, and it should be improved. I had issues while looking at the historical tags, which sometimes created problems. You have to separately create a history server and run it. Such things can be made easier. Instead of separately installing the history server, it can be made a part of the whole setup so that whenever you set it up, it becomes available."
"They could improve the issues related to programming language for the platform."
"We are building our own queries on Spark, and it can be improved in terms of query handling."
"I would like to see integration with data science platforms to optimize the processing capability for these tasks."
"In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."
"When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data."
"There were some problems related to the product's compatibility with a few Python libraries."
"There's room for improvement in the testing setup."
"The first time Lambda is started up, it takes some time to spin up an instance for serving the consumer requests. AWS has been trying to solve this in a variety of ways but have not yet managed to do so."
"The tool changes its UI every month which is very frustrating for me. I don’t know why AWS keeps changing the UI. They can’t stick to a specific one"
"AWS Lambda could be improved with better stability."
"I have seen some drawbacks with certain integrations."
"There are other similar solutions, such as Google Cloud Platform or Microsoft Azure. They might be better for small tasks."
"We can write anything as code, but the solution will not give proper error information."
"AWS Lambda can improve its file system-based sharing capabilities and restrictions."
 

Pricing and Cost Advice

"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"They provide an open-source license for the on-premise version."
"Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
"Apache Spark is an expensive solution."
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"The price of the solution is reasonable and it is a pay-per-use model. It is very good for cost optimization."
"AWS Lambda cost is pretty decent."
"The solution's price is average."
"Its pricing is on the higher side."
"You're not paying for a server if you're not using it, which is another reason I like it. So, you're not paying if you're not using it. It scales, and you're charged based on usage. It all depends on the use case. Some can be extremely inexpensive if you have very low volume transaction rates. That way, you don't have to fire up and absorb the cost of the servers just sitting there waiting for a transaction to come through. You're only paying when you use it. So, depending upon the use model, Lambda could be highly efficient relative to an EC2 solution. You don't have to have things reallocated."
"AWS Lambda is inexpensive."
"The solution follows a pay-as-you-go licensing model, which results in cost savings."
"The pricing is on-demand and based on runs or times that are billed out monthly."
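
Several of the quotes above contrast pay-per-use billing with paying for a server that sits idle. A back-of-the-envelope sketch of that comparison follows. Every rate passed to these functions is a placeholder assumption for illustration, not a published AWS price, and the function names are hypothetical.

```python
def pay_per_use_cost(requests, avg_seconds, memory_gb,
                     price_per_million_requests, price_per_gb_second):
    """Estimate a monthly bill for a usage-billed function.

    Cost = per-request charge + (duration x memory) compute charge.
    All prices are caller-supplied assumptions.
    """
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * avg_seconds * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def always_on_cost(hours, price_per_hour):
    """Estimate the bill for a server that runs whether or not it is used."""
    return hours * price_per_hour


if __name__ == "__main__":
    # Placeholder figures only: 1M requests, 0.5 s each, 1 GB of memory.
    usage = pay_per_use_cost(1_000_000, 0.5, 1.0,
                             price_per_million_requests=0.25,
                             price_per_gb_second=0.00002)
    server = always_on_cost(hours=720, price_per_hour=0.05)
    print(f"pay-per-use: {usage:.2f}, always-on: {server:.2f}")
```

With low request volumes the usage-billed estimate stays near zero while the always-on estimate does not change, which is the trade-off the reviewers describe. At high sustained volume the comparison can reverse.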
831,158 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
University: 5%

Educational Organization: 64%
Financial Services Firm: 9%
Computer Software Company: 5%
Manufacturing Company: 3%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The main concern is the overhead of Java when distributed processing is not necessary. In such cases, operations can often be done on one node, making Spark's distributed mode unnecessary. Conseque...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is no setup process to deal with, as the entire solution is in the cloud. If you use...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
AWS Lambda is cheaper compared to running an instance continuously. You only pay for what you use, making it cost-effective.
 


Sample Customers

NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Netflix
Find out what your peers are saying about AWS Lambda vs. Apache Spark and other solutions. Updated: January 2025.