Amazon EMR Reviews

Name: Amazon EMR
Brand: Amazon Web Services (AWS)
Rating: 3.9 (23 reviews)

86% willing to recommend

1,124 followers

What is Amazon EMR?

Amazon Elastic MapReduce (Amazon EMR) is a web service for processing vast amounts of data quickly and cost-effectively. It provides a managed Hadoop framework that distributes and processes your data across dynamically scalable Amazon EC2 instances.

Amazon EMR is the #3 ranked solution in top Hadoop solutions and the #12 ranked solution in top Cloud Data Warehouse solutions. PeerSpot users give Amazon EMR an average rating of 7.8 out of 10. Amazon EMR is most commonly compared to Apache Spark. It is popular among large enterprises, which account for 73% of users researching this solution on PeerSpot. The top industry researching this solution is financial services, accounting for 26% of all views.

Featured Amazon EMR reviews

Prashant Singh

Vice President - Product Management at Paytm

Amazon EMR has multiple connectors that can connect to various data sources. The service charges are based on processing only, depending on the resources used, which can help save money. It is easy to integrate with other services for storage, allowing data to be shifted to cheaper storage based on usage.

Ilya Afanasyev

Senior Software Development Engineer at Yahoo!

The problem for us is that it starts very slowly. They need to improve the start time. If we use a long-running EMR cluster, it costs a lot of money. If a job runs for one hour, a start time of about ten minutes is acceptable. However, if we want to run a job every five minutes, a ten-minute start is a problem. It's a little bit weird that you cannot use the service for short periods. The support could be better.
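
The trade-off described in this review can be put into rough numbers. The following is only a minimal sketch in Python: the ten-minute start time, the one-hour job, and the five-minute schedule come from the review, while the function names and the one-minute job duration in the last line are hypothetical values chosen for illustration, not measurements of Amazon EMR.

```python
def startup_overhead(startup_min: float, job_min: float) -> float:
    """Fraction of a transient cluster's wall-clock time spent starting up."""
    return startup_min / (startup_min + job_min)


def fits_schedule(startup_min: float, job_min: float, interval_min: float) -> bool:
    """True if a fresh cluster can start and finish the job before the next run is due."""
    return startup_min + job_min <= interval_min


STARTUP_MIN = 10  # approximate start time quoted in the review

# One-hour job on a cluster started just for that job.
hourly = startup_overhead(STARTUP_MIN, job_min=60)
print(f"one-hour job: {hourly:.0%} of wall-clock time is startup")

# Job scheduled every five minutes; the one-minute job length is an assumption.
feasible = fits_schedule(STARTUP_MIN, job_min=1, interval_min=5)
print("five-minute schedule feasible:", feasible)
```

Under these assumptions, startup is roughly one seventh of the total time for the hourly job, while the five-minute schedule cannot be met by starting a new cluster for each run, which matches the reviewer's complaint.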

Quan Vu

Data Governance Manager at VPBFC

We need to have a data pipeline tool to ensure consistent data processing for the initial setup. We create a framework, read the code, and execute it in a data catalog. The size of the maintenance team depends on the project and the use cases. Usually, one backup team of four or five DevOps executives takes care of the backend and database. We need to separate our environments into production and development. We use GitHub for source control, Jenkins for the deployment pipeline, and a standard CI/CD tool to deploy code changes into production. We need to develop a deployment framework so developers only need to provide the code for their projects. The underlying engine then deploys the code, reads it, addresses the EMR filter, executes it, and completes the data processing.

Amazon EMR mindshare

Product category: Hadoop

As of April 2025, the mindshare of Amazon EMR in the Hadoop category stands at 13.3%, down from 17.1% a year earlier, according to calculations based on PeerSpot user engagement data.

PeerAnalyst reports based on Amazon EMR reviews

Type	Title	Date
Category	Hadoop	Apr 16, 2025
Product	Reviews, tips, and advice from real users	Apr 16, 2025
Comparison	Amazon EMR vs Apache Spark	Apr 16, 2025
Comparison	Amazon EMR vs Cloudera Distribution for Hadoop	Apr 16, 2025
Comparison	Amazon EMR vs HPE Ezmeral Data Fabric	Apr 16, 2025

Title	Rating	Mindshare	Recommending	Interviews
Apache Spark	4.2	17.5%	90%	65
Cloudera Distribution for Hadoop	4.0	25.0%	92%	50

Valuable Features

Amazon EMR is highly scalable, reliable, and cost-effective, utilizing Amazon EC2 and S3 for cloud storage. It supports auto-scaling and easy integration with Hadoop, HDFS, and various tools like Spark, Hive, and Flink. The platform offers managed services, reducing hardware management. Users benefit from its processing speed, data storage capacity, and security features. It supports frameworks for managing structured and unstructured data, facilitating data lakes, stores, and marts integration, with flexible real-time and batch processing capabilities.

"I rate Amazon EMR as ten out of ten."
"Amazon EMR has multiple connectors that can connect to various data sources."
"The security of the managed workflow and the managed services are the best features for us. Since we inherited their security model and it's all managed services, those are the key benefits for our clients."

Room for Improvement

Amazon EMR requires improvement in areas like user interface, web support, and cluster configuration. Users find steep learning curves and encounter version issues affecting stability and compatibility. There's a need for better cost optimization, faster start times, and enhanced monitoring and debugging features. Expanding platform integrations and automating provisioning and scaling could enhance efficiency. Security, pricing, and support services require enhancements, and adding newer technologies would increase flexibility and user control.

"There is room for improvement with respect to retries, handling the volume of data on S3 buckets, cluster provisioning, scaling, termination, security, and integration between services like S3, Glue, Lake Formation, and DynamoDB."
"Spark jobs take longer on Amazon EMR compared to previous experiences."
"The solution can become expensive if you are not careful."

ROI

Amazon EMR's ROI varies across vendors and use cases. Some businesses report significant savings and high returns, particularly those transitioning from on-premise systems, often seeing returns of at least two to one. Feedback suggests cost savings of up to 20%. Many have not calculated ROI specifics, but the general sentiment is positive, with savings reported by companies using Amazon EMR in their operations.

Pricing

Amazon EMR pricing experiences vary, emphasizing usage-based costs without fixed licensing fees. Users note potential high expenses driven by EC2 fees, infrastructure resources, and varying cluster usage. Despite being costly, some describe it as moderately priced and efficient for Big Data workloads. Cost optimization is achievable through strategic resource management and auto-scaling, but careful monitoring is crucial to avoid unexpected charges. Enterprise budgets range from $40,000 to over $1 million annually, influenced by service levels and support packages.

"I rate the tool's pricing a five out of ten. It can be expensive since it's a managed service, and if you are not careful, you can run into unexpected charges. You can make a mistake that costs you tens of thousands of dollars. That's happened to us twice, so I'm sensitive to it. We're still trying to work on that. Our smallest client probably spends a hundred thousand dollars yearly on licensing, while our largest is well over a million."
"The product is not cheap, but it is not expensive."
"Amazon EMR is not very expensive."

Popular Use Cases

Amazon EMR is utilized to run Spark scripts and build data lakes by accessing various data sources. Users deploy it for tech processing, big data frameworks, managing data pipelines, and enabling serverless architectures. It supports AI projects with on-the-fly algorithm execution and handles millions of roles swiftly. EMR integrates with AWS services like SageMaker and Airflow for tasks such as predictive analysis, developing workflows, resource management, and analytics reporting.

Service and Support

Amazon EMR's customer service and support show mixed feedback. Some users report inconsistency, with responses ranging from excellent to poor. Others emphasize satisfaction, noting responsiveness, knowledgeable staff, and quick resolutions. AWS Premier Partners enjoy enhanced support. Some users face challenges with integration or slower response times. Despite these issues, many express happiness with support, rating it high. Technical support is available 24/7, assisting with technical and operational queries through a proactive approach.

Deployment

Amazon EMR's initial setup is generally easy and quick, though some users find it more complex. While some set up in about 30 minutes using the AWS Console or Terraform, others experience longer timelines due to project requirements. Multi-factor authentication can pose challenges. Knowledge of big data simplifies the process, and documentation helps those unfamiliar with it. Larger teams handle setup efficiently, while scripts streamline deployment for production environments.

Scalability

Amazon EMR demonstrates strong scalability, accommodating various business needs, from small teams to enterprises with thousands of users. Users can choose clusters based on requirements and apply auto-scaling for efficiency. Instances can be adjusted for memory, storage, and GPU, though some experience delays during resource allocation. It is widely adopted for processing large datasets. Monitoring and configuring clusters is key for optimal performance. Many leverage EMR for analytics and report generation, supporting vast data volumes seamlessly.

Stability

Many users find Amazon EMR stable and reliable. They note no significant issues but mention the occasional need for reconfiguration due to data changes. Some users highlight its high availability and fault tolerance. Features like monitoring, updates, and disaster recovery contribute to its stability. While some express satisfaction, a few suggest improvements, rating its stability between 8.5 and 9 out of 10. Stability is further supported by availability zones and failover capabilities.

These insights are based on the in-depth reviews provided by peers to help you make a better buying decision.
