Altiscale vs Amazon EMR vs Apache Spark comparison

SAP and Amazon Web Services (AWS) both offer solutions in the Hadoop category. SAP is ranked #21, while Amazon Web Services (AWS) is ranked #3 with an average rating of 8.0. SAP holds a 0.1% mindshare in Hadoop, compared to Amazon Web Services (AWS)’s 13.3%. Additionally, 86% of Amazon Web Services (AWS) users are willing to recommend the solution.

Altiscale: 26 views, 5 comparison views

Amazon EMR: 23 reviews, 1,035 views, 872 comparison views, 86% willing to recommend

Apache Spark: 66 reviews, 1,404 views, 1,090 comparison views, 90% willing to recommend

Executive Summary

We performed a comparison between Altiscale, Amazon EMR, and Apache Spark based on real PeerSpot user reviews.

Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop.

To learn more, read our detailed Hadoop Report (Updated: March 2025).

Helped 849,335 peers since 2012

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Mindshare comparison

As of April 2025, in the Hadoop category, the mindshare of Altiscale is 0.1%, listed as up from 0.1% the previous year. The mindshare of Amazon EMR is 13.3%, down from 17.1% the previous year. The mindshare of Apache Spark is 17.5%, down from 21.4% the previous year. Mindshare is calculated from PeerSpot user engagement data.
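The year-over-year movement in these figures can be read two ways: as a change in percentage points, or as a change relative to the previous value. The short sketch below works through both using only the numbers quoted above. The function names are illustrative and are not part of any PeerSpot tooling.

```python
# Mindshare figures quoted above: (previous year %, April 2025 %).
mindshare = {
    "Altiscale": (0.1, 0.1),
    "Amazon EMR": (17.1, 13.3),
    "Apache Spark": (21.4, 17.5),
}


def point_change(previous, current):
    """Change in percentage points, rounded to one decimal place."""
    return round(current - previous, 1)


def relative_change(previous, current):
    """Change relative to the previous value, as a percentage."""
    return round((current - previous) / previous * 100, 1)


for name, (prev, cur) in mindshare.items():
    points = point_change(prev, cur)
    relative = relative_change(prev, cur)
    print(f"{name}: {points:+.1f} points, {relative:+.1f}% relative")
```

Because the published figures are rounded to one decimal place, small movements such as Altiscale's do not show up in either calculation.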

Featured Reviews

Prashant Singh

Vice President - Product Management at Paytm

Seamless data integration enhances reporting efficiency and an easy setup

Amazon EMR has multiple connectors that can connect to various data sources. The service charges are based on processing only, depending on the resources used, which can help save money. It is easy to integrate with other services for storage, allowing data to be shifted to cheaper storage based on usage.

Ilya Afanasyev

Senior Software Development Engineer at Yahoo!

Reliable, able to expand, and handle large amounts of data well

We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on each run, even when nothing had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
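The saving described in this review comes from copying only changed records instead of every record on each run. The sketch below illustrates that general idea in plain Python, using a timestamp watermark. It is not the reviewer's pipeline and does not use MongoDB's API. The `updated_at` field, the sample records, and both function names are assumptions made for illustration.

```python
from datetime import datetime, timezone


def full_copy(records):
    """Copy every record on every run, changed or not."""
    return list(records)


def incremental_copy(records, last_run):
    """Copy only records modified after the previous run."""
    return [r for r in records if r["updated_at"] > last_run]


# Hypothetical table contents; `updated_at` is an assumed field.
records = [
    {"_id": 1, "updated_at": datetime(2025, 4, 1, 9, 15, tzinfo=timezone.utc)},
    {"_id": 2, "updated_at": datetime(2025, 4, 1, 10, 5, tzinfo=timezone.utc)},
    {"_id": 3, "updated_at": datetime(2025, 4, 1, 10, 40, tzinfo=timezone.utc)},
]

# Hypothetical time of the previous hourly run.
last_run = datetime(2025, 4, 1, 10, 0, tzinfo=timezone.utc)

changed = incremental_copy(records, last_run)
print(len(full_copy(records)), "records in a full copy")
print(len(changed), "records in an incremental copy")
```

With large tables and few changes per hour, the incremental run moves far less data, which is the kind of reduction in cluster cost the reviewer describes.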

Top Industries

By visitors reading reviews

Altiscale: No data available

Amazon EMR:
Financial Services Firm: 26%
Computer Software Company: 14%
Educational Organization
Manufacturing Company

Apache Spark:
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company
Comms Service Provider

Company Size

By reviewers

Large Enterprise

Midsize Enterprise

Small Business

No data available

Questions from the Community

What do you like most about Amazon EMR?

Amazon EMR is a good solution that can be used to manage big data.

What is your experience regarding pricing and costs for Amazon EMR?

Compared to others, Amazon seems efficient and is considered good for Big Data workloads. Costs are involved based on...

What needs improvement with Amazon EMR?

There is room for improvement with respect to retries, handling the volume of data on S3 ( /products/amazon-s3-review...

What do you like most about Apache Spark?

We use Spark to process data from different data sources.

What is your experience regarding pricing and costs for Apache Spark?

Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requir...

What needs improvement with Apache Spark?

The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential ...

Comparisons

No data available

Snowflake vs Amazon EMR

Compared 36% of the time

Cloudera Distribution for Hadoop vs Amazon EMR

Compared 28% of the time

Amazon Redshift vs Amazon EMR

Compared 8% of the time

AWS Lake Formation vs Amazon EMR

Compared 5% of the time

Azure Data Factory vs Amazon EMR

Compared 5% of the time

Spring Boot vs Apache Spark

Compared 27% of the time

SAP HANA vs Apache Spark

Compared 12% of the time

AWS Batch vs Apache Spark

Compared 11% of the time

Cloudera Distribution for Hadoop vs Apache Spark

Compared 7% of the time

Jakarta EE vs Apache Spark

Compared 3% of the time

Also Known As

Altiscale: No data available

Amazon EMR: Amazon Elastic MapReduce

Apache Spark: No data available

Overview

Liberate your data, by making it possible for any data analyst in your company to access and rapidly analyze data in the Hadoop data lake, without burdening your IT team. The Altiscale Insight Cloud handles both real-time and batch analytics, and connects easily to your favorite BI solutions.

SAP

Amazon Elastic MapReduce (Amazon EMR) is a web service that makes it easy to quickly and cost-effectively process vast amounts of data. Amazon EMR simplifies big data processing, providing a managed Hadoop framework that makes it easy, fast, and cost-effective for you to distribute and process vast amounts of your data across dynamically scalable Amazon EC2 instances.

Amazon Web Services (AWS)

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.

Apache
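The contrast drawn in the Spark overview, between a linear read-map-reduce-store dataflow and a reusable working set, can be sketched in a few lines of single-process Python. This is only an illustration of the dataflow shape. It uses neither Hadoop's nor Spark's API, and nothing here is distributed or fault-tolerant. The `mapreduce_pass` helper and the file names are hypothetical.

```python
import os
import tempfile
from functools import reduce


def mapreduce_pass(in_path, out_path, map_fn, reduce_fn, initial):
    """One MapReduce-style pass: read from disk, map, reduce, store on disk."""
    with open(in_path) as f:
        values = [int(line) for line in f]
    mapped = map(map_fn, values)
    result = reduce(reduce_fn, mapped, initial)
    with open(out_path, "w") as f:
        f.write(f"{result}\n")
    return result


with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "input.txt")
    with open(in_path, "w") as f:
        f.write("\n".join(str(n) for n in range(1, 6)) + "\n")

    # Linear dataflow: each computation re-reads its input from disk.
    sum_of_squares = mapreduce_pass(
        in_path, os.path.join(tmp, "squares.txt"),
        lambda x: x * x, lambda a, b: a + b, 0)
    sum_of_doubles = mapreduce_pass(
        in_path, os.path.join(tmp, "doubles.txt"),
        lambda x: 2 * x, lambda a, b: a + b, 0)

    # Working-set style: load once, keep in memory, reuse across computations.
    # The immutable tuple stands in for an RDD's read-only items.
    with open(in_path) as f:
        working_set = tuple(int(line) for line in f)
    ws_squares = sum(x * x for x in working_set)
    ws_doubles = sum(2 * x for x in working_set)

print(sum_of_squares, sum_of_doubles)
print(ws_squares, ws_doubles)
```

Both styles produce the same results. The difference is that the first goes back to disk for every computation, while the second reads the data once and reuses it.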

Sample Customers

Glu Mobile, Airpush, Devicescape, Visible Measures

Yelp

NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
