Amazon EMR vs Cloudera Distribution for Hadoop comparison

 

Comparison Buyer's Guide

Executive Summary
 

Categories and Ranking

Amazon EMR
Ranking in Hadoop: 3rd
Average Rating: 7.8
Number of Reviews: 21
Ranking in other categories: Cloud Data Warehouse (11th)

Cloudera Distribution for H...
Ranking in Hadoop: 2nd
Average Rating: 8.0
Reviews Sentiment: 6.4
Number of Reviews: 49
Ranking in other categories: NoSQL Databases (7th)
 

Mindshare comparison

As of November 2024, Amazon EMR holds 14.4% mindshare in the Hadoop category, down from 18.9% a year earlier. Cloudera Distribution for Hadoop holds 27.1%, up from 22.7% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
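
The year-over-year movement can be read two ways: as a change in percentage points, or as a change relative to last year's share. The short Python sketch below computes both from the four figures quoted above. The helper name `mindshare_change` is hypothetical and used only for illustration; it is not part of any PeerSpot tooling.

```python
# Mindshare figures quoted above (percent of the Hadoop category).
mindshare = {
    "Amazon EMR": {"previous": 18.9, "current": 14.4},
    "Cloudera Distribution for Hadoop": {"previous": 22.7, "current": 27.1},
}


def mindshare_change(previous, current):
    """Return (percentage-point change, relative change in percent).

    Hypothetical helper for illustration only.
    """
    point_change = current - previous
    relative_change = point_change / previous * 100
    return round(point_change, 1), round(relative_change, 1)


for product, share in mindshare.items():
    points, relative = mindshare_change(share["previous"], share["current"])
    print(f"{product}: {points:+.1f} points ({relative:+.1f}% relative)")
```

On these figures, Amazon EMR lost 4.5 percentage points and Cloudera Distribution for Hadoop gained 4.4.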

Featured Reviews

Quan Vu - PeerSpot reviewer
Provides efficient data processing features and has good scalability
We need to have a data pipeline tool to ensure consistent data processing for the initial setup. We create a framework, read the code, and execute it in a data catalog. The size of the maintenance team depends on the project and the use cases. Usually, one backup team of four or five DevOps executives takes care of the backend and database. We need to separate our environments into production and development. We use GitHub for source control, Jenkins for the deployment pipeline, and a standard CI/CD tool to deploy code changes into production. We need to develop a deployment framework so developers only need to provide the code for their projects. The underlying engine then deploys the code, reads it, addresses the EMR filter, executes it, and completes the data processing.
Shahan Rehman - PeerSpot reviewer
Can host multiple technologies and help businesses with their AI initiatives
The ease or difficulty of setting up the product depends on the customer environment where the tool is deployed. For a banking, industrial, or retail sector firm, the deployment process can range from simple to very complex, depending on the size of the database, the applications to be hosted, and the architecture. For Cloudera Distribution for Hadoop, one has to go through the usual deployment process, as with any software product. You need different environments before going into production: test, dev, and pre-production. You install and configure all the components in the test environment and then test them in the pre-production environment. Once UAT is done, you move them to the production environment. In general, it's a critical product deployed in a company.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The project management is very streamlined."
"The initial setup is straightforward."
"The security of the managed workflow and the managed services are the best features for us. Since we inherited their security model and it's all managed services, those are the key benefits for our clients."
"Amazon EMR's most valuable features are processing speed and data storage capacity."
"The initial setup is pretty straightforward."
"The ability to resize the cluster is what really makes it stand out over other Hadoop and big data solutions."
"When we grade big jobs from on-prem to the cloud, we do it in EMR with Spark."
"Amazon EMR is a good solution that can be used to manage big data."
"Cloudera is a very manageable solution with good support."
"The product as a whole is good."
"The most valuable feature is Impala, the querying engine, which is very fast."
"With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
"I don't see any performance issues."
"The data science aspect of the solution is valuable."
"The most valuable feature is that I can use CDH for almost all use cases across all industries, including the financial sector, public sector, private retailers, and so on."
"In terms of scalability, if you have enough hardware you can scale out. Scalability doesn't have any issues."
 

Cons

"There is no need to pay extra for third-party software."
"The dashboard management could be better. Right now, it's lacking a bit."
"We don't have much control. If we have multiple users, if they want to scale up, the cost will go and increase and we don't know how we can restrict that price part."
"The product must add some of the latest technologies to provide more flexibility to the users."
"Amazon EMR can improve by adding some features, such as megastore services and HiveServer2. Additionally, the user interface could be better, similar to what Apache service provides, cross-platform services."
"The solution can become expensive if you are not careful."
"As people are shifting from legacy solutions to other technologies, Amazon EMR needs to add more features that give more flexibility in managing user data."
"Amazon EMR is continuously improving, but maybe something like CI/CD out-of-the-box or integration with Prometheus Grafana."
"The procedure for operations could be simplified."
"The competitors provide better functionalities."
"There are better solutions out there that have more features than this one."
"The tool's ability to be deployed on a cloud model is an area of concern where improvements are required."
"The initial setup of Cloudera is difficult."
"We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there is a lot of things that need to improve."
"The price of this solution could be lowered."
"There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon."
 

Pricing and Cost Advice

"You don't need to pay for licensing on a yearly or monthly basis, you only pay for what you use, in terms of underlying instances."
"Amazon EMR's price is reasonable."
"The product is not cheap, but it is not expensive."
"There is a small fee for the EMR system, but major cost components are the underlying infrastructure resources which we actually use."
"The price of the solution is expensive."
"There is no need to pay extra for third-party software."
"I rate the tool's pricing a five out of ten. It can be expensive since it's a managed service, and if you are not careful, you can run into unexpected charges. You can make a mistake that costs you tens of thousands of dollars. That's happened to us twice, so I'm sensitive to it. We're still trying to work on that. Our smallest client probably spends a hundred thousand dollars yearly on licensing, while our largest is well over a million."
"Amazon EMR is not very expensive."
"I believe we pay for a three-year license."
"The solution is fairly expensive."
"Cloudera Distribution for Hadoop is expensive, with support costs involved."
"Cloudera requires a license to use."
"The price could be better for the product."
"The tool is expensive...For the SMB market or customers whose environments are not that complex and do not have multiple systems running, Cloudera might not be a good option."
"The tool is not expensive."
"The product’s price depends from project to project."
816,406 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Financial Services Firm: 25%
Computer Software Company: 13%
Manufacturing Company: 9%
Educational Organization: 7%

Financial Services Firm: 23%
Computer Software Company: 15%
Educational Organization: 10%
Manufacturing Company: 8%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Amazon EMR?
Amazon EMR is a good solution that can be used to manage big data.
What is your experience regarding pricing and costs for Amazon EMR?
I rate the tool's pricing a five out of ten. It can be expensive since it's a managed service, and if you are not careful, you can run into unexpected charges. You can make a mistake that costs you...
What needs improvement with Amazon EMR?
The solution can become expensive if you are not careful.
What do you like most about Cloudera Distribution for Hadoop?
The tool can be deployed using different container technologies, which makes it very scalable.
What is your experience regarding pricing and costs for Cloudera Distribution for Hadoop?
The tool is expensive. Overall, it's not a cheap software tool, and that is why only large enterprises who are mature enough and have an architecture that is complex enough opt for Cloudera, as its...
What needs improvement with Cloudera Distribution for Hadoop?
The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. ...
 

Also Known As

Amazon Elastic MapReduce
No data available
 

Overview

 

Sample Customers

Yelp
37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
Find out what your peers are saying about Amazon EMR vs. Cloudera Distribution for Hadoop and other solutions. Updated: October 2024.