Cloudera Distribution for Hadoop Reviews

Name: Cloudera Distribution for Hadoop
Brand: Cloudera
Rating: 4.0 out of 5 (50 reviews)
92% willing to recommend

What is Cloudera Distribution for Hadoop?

Cloudera Distribution for Hadoop is the world's most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. More enterprises have downloaded CDH than all other such distributions combined.

Cloudera Distribution for Hadoop is the #2 ranked solution in top Hadoop solutions and the #8 ranked solution in top NoSQL Databases. PeerSpot users give Cloudera Distribution for Hadoop an average rating of 8.0 out of 10. It is most commonly compared to MongoDB. Cloudera Distribution for Hadoop is popular among the large enterprise segment, which accounts for 73% of users researching this solution on PeerSpot. The top industry researching this solution is financial services, accounting for 22% of all views.

Featured Cloudera Distribution for Hadoop reviews

Rok Dolinsek

Manager, Business Development & Co-Owner at Troia d.o.o.

This is the only solution that is possible to install on-premise. Cloudera provides a hybrid solution that combines compute on cloud or on-premises. It includes all machine learning algorithms in the Spark machine learning library. All functionalities needed for a big data platform and ETL are on the platform, eliminating the need for other tools. It is scalable, ready for vertical scaling, and very powerful, offering numerous functionalities and configurations for generative AI.

Mohamed-Saied

Senior Data Architect at Teradata Corporation

The tool's most interesting features are the distributed file system and unstructured data processing capability. Because we have a lot of unstructured data, like XML and social media logs, these features make it more valuable than the usual data warehousing solutions. Data warehouse solutions mainly use structured, regular, and formatted data, but Cloudera Distribution for Hadoop can handle unstructured data. This is the most interesting part. Also, the huge amount of data can be tuned in HDFS rather than relational databases. Cloudera Distribution for Hadoop can be a promising solution for distributed file systems, real-time processing, batch mode processing, AI, and machine learning use cases. We are using several security features in the solution. These include Linux's security implementations and its built-in firewall. We also rely on single sign-on and encryption, both at rest and in transit, for sensitive data. It has access controls, ensuring that not everyone can use every service; for example, some users can access Hive, others Impala, and others HBase, depending on their privileges. We also use LDAP to track who registers or logs into the cluster. Additionally, we use key nodes to manage firewalls between Cloudera Manager or the Cloudera cluster and other data sources.

Miodrag Milojevic

Senior Data Architect at Yettel

Cloudera Distribution for Hadoop is not always completely stable, which can be a concern for big data solutions. Sometimes there are problems with the network, and there can be communication issues with Active Directory or similar systems due to authorization scheduling, resulting in occasional problems. The implementation process is quite complex because of the schedules.

Cloudera Distribution for Hadoop mindshare

As of July 2025, the mindshare of Cloudera Distribution for Hadoop in the Hadoop category stands at 24.2%, down from 24.3% the previous year, according to calculations based on PeerSpot user engagement data.

PeerResearch reports based on Cloudera Distribution for Hadoop reviews

Type	Title	Date
Category	Hadoop	Jul 15, 2025
Product	Reviews, tips, and advice from real users	Jul 15, 2025
Comparison	Cloudera Distribution for Hadoop vs Apache Spark	Jul 15, 2025
Comparison	Cloudera Distribution for Hadoop vs Amazon EMR	Jul 15, 2025
Comparison	Cloudera Distribution for Hadoop vs HPE Ezmeral Data Fabric	Jul 15, 2025

Title	Rating	Mindshare	Recommending	Interviews
MongoDB	4.1	N/A	92%	81
Apache Spark	4.2	18.3%	90%	66

Valuable Features

Users find Cloudera Distribution for Hadoop valuable for its Cloudera Manager, which simplifies administration. Features like Impala, Sentry, and fast data processing are highlighted. The integration and ease of use, robust security, enterprise-level capabilities, and comprehensive support are frequently mentioned. Its ability to manage large data, compatibility with cloud and on-premise environments, and scalable resources also stand out. Various tools for AI and machine learning, along with solid community support, are significant advantages.

"Cloudera provides a hybrid solution that combines compute on cloud or on-premises."
"This is the only solution that is possible to install on-premise."
"Cloudera, as a whole, is designed to provide organizations with solutions for big data."

Room for Improvement

HBase stability and speed are problematic, affecting real-time processing. API limitations and complex licensing need addressing. Apache Kudu, Spark SQL integration, and documentation require enhancements. Better multi-language support, UI improvements, cost reduction, and simplified setup are essential. Training materials are outdated. Integration with cloud and security features like data privacy need improvement. Performance is lacking compared to competitors, and the support team struggles with new releases and complex installations.

"It is quite complicated to configure and install. Integrating the platform into an information system is always a challenge, especially when starting with on-premise implementation."
"The performance of some analytics engines provided by Cloudera is not that good."

ROI

Determining ROI from Cloudera Distribution for Hadoop is complex due to varied usage across divisions. Huawei notes that ROI is anticipated in the long term. There is difficulty in measuring direct ROI in analytics, with some departments unable to comment. A return on investment is reported by some users. Cloudera Distribution for Hadoop is challenging to evaluate, yet its value is significant with around 30 use cases recognized by organizations.

Pricing

Enterprise users find Cloudera Distribution for Hadoop generally expensive, with a steep licensing cost on a per-node basis. Costs include additional support fees, making it suitable for large enterprises that can afford them. Smaller businesses might find these costs prohibitive. Licensing can involve annual subscriptions and various levels depending on requirements. Some users appreciate its competitive pricing against specific alternatives but acknowledge the higher expense compared to open-source solutions. A free version and trial are available, with deployment flexibility as a notable feature.

"The tool is expensive...For the SMB market or customers whose environments are not that complex and do not have multiple systems running, Cloudera might not be a good option."
"The tool is not expensive."
"The solution is fairly expensive."

Popular Use Cases

Companies use Cloudera Distribution for Hadoop for various purposes such as big data analytics, external storage, data warehousing, data lakes, and advanced analytics. They collect, store, and process large datasets, utilize machine learning, and gain insights through dashboard reporting. Many integrate it with other tools like Spark, Hive, and HDFS. Common use cases include supporting analytical applications, managing real-time data, and enhancing infrastructure for analytics in domains like finance, telecom, and energy.

Service and Support

Cloudera Distribution for Hadoop receives generally positive feedback for customer service and technical support. Multiple reviewers describe the support as excellent or great, with active community participation and quick issue resolution. Support is praised for efficiency and responsiveness, often outperforming competitors. Some feedback highlights room for improvement, mentioning inexperienced support staff and escalating costs. Documentation is deemed sufficient, but there is some criticism regarding its reliability. With quick access to quality support, users are able to handle complex use cases effectively.

Deployment

Many users found Cloudera Distribution for Hadoop's initial setup easy, particularly with Cloudera Manager and cloud templates. Challenges arose with complex environments, security features, and user interfaces. Automation tools like Chef and preconfigured environments eased deployment. Installation complexity varied by customer environment, and expert knowledge was crucial. Some outsourced setup to vendors or faced issues with integration and upgrades. Proper planning and component configuration affected how difficult the setup was.

Scalability

Users find Cloudera Distribution for Hadoop easy to scale with distributed architecture, supporting large installations. Its scalability is considered excellent, with users able to expand clusters by adding nodes. While scalability on the cloud may face challenges, on-premises environments scale effectively. Hardware availability can affect scaling timelines. Despite occasional complexities in scaling, Cloudera remains reliable and highly regarded for its capacity to grow alongside user needs. The presence of sufficient infrastructure aids scaling efforts.

Stability

Some found Cloudera Distribution for Hadoop unstable with frequent issues, while others had no problems, even praising its stability. Bugs in Cloudera 5 clusters and Flume were highlighted. Compatibility issues occurred with specialized hardware. Some users rated the stability highly, suggesting it fits their needs. Problems often stemmed from infrastructure rather than software, with sensitivity to hardware configurations noted. Support availability and maturity contributed positively to perceptions of stability.

These insights are based on the in-depth reviews provided by peers to help you make a better buying decision.