
Cassandra vs Cloudera Distribution for Hadoop comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Jan 7, 2025

 

Categories and Ranking

Cassandra
Ranking in NoSQL Databases: 5th
Average Rating: 8.0
Reviews Sentiment: 6.2
Number of Reviews: 22
Ranking in other categories: Vector Databases (14th)

Cloudera Distribution for H...
Ranking in NoSQL Databases: 8th
Average Rating: 8.0
Reviews Sentiment: 6.4
Number of Reviews: 49
Ranking in other categories: Hadoop (2nd)
 

Mindshare comparison

As of January 2025, Cassandra holds a 13.1% mindshare in the NoSQL Databases category, up from 12.4% a year earlier. Cloudera Distribution for Hadoop holds 2.3%, down from 3.0% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Himanshu Amodwala - PeerSpot reviewer
Well-equipped to handle a massive influx of data and billions of requests
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-time updates is paramount. For instance, when a customer leaves comments or feedback on an image, they anticipate an immediate reflection of these changes on the portal. Similarly, sellers altering product attributes or updating images expect instant visibility of these modifications. Handling large data volumes with Cassandra has been an excellent experience. Despite challenges related to the influx, these were not attributed to Cassandra itself but rather to middle-layer issues. Generally, it demonstrated scalability with workloads, thanks to its horizontal scaling capabilities. We could easily add new nodes to the system as needed, ensuring the platform coped well with increasing loads. The tool's most beneficial feature for scalability is its entire architecture. The absence of a single point of failure or a leader within the ecosystem contributes to its robust scalability. This key aspect influenced our decision to opt for the Cassandra ecosystem. In terms of performance, it demonstrated the ability to handle approximately 1.6 billion requests per day. This was achieved on AWS using EC2 instances, and it was during a period about four to five years ago.
Miodrag-Stanic - PeerSpot reviewer
You can manage all services from one place in an integrated manner
We switched to Airflow because Cloudera is outdated. It's not widely used. It would be good if we had Spark 3.5. Spark is quite old. Cloudera is now offering an alternate solution as a replacement for AWS. AWS works badly with small files. The solution is not fit for on-premise distributions. It should be containerized so we can deploy it as containers within Kubernetes. We had one upgrade from CDH to CDP, which took a long time. I would expect a containerized deployment to upgrade much more quickly than what we experienced.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"I am getting much better performance than relational databases."
"We can add almost one million columns to the solution."
"The time series data was one of the best features along with auto publishing."
"Some of the valued features of this solution are it has good performance and failover."
"A consistent solution."
"The most valuable features are the counter features and the NoSQL schema. It also has good scalability. You can scale Cassandra to any finite level."
"Our primary use case for the solution is testing."
"The solution's database capabilities are very good."
"The search function is the most valuable aspect of the solution."
"The solution is reliable and stable, it fits our requirements."
"We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve. I believe they are working on that."
"The product as a whole is good."
"I don't see any performance issues."
"Very good end-to-end security features."
"The solution is stable."
"We had a data warehouse before all the data. We can process a lot more data structures."
 

Cons

"Interface is not user friendly."
"Doesn't support a solution that can give aggregation."
"Cassandra can improve by adding more built-in tools. For example, if you want to do some maintenance activities in the cluster, we have to depend on third-party tools. Having these tools built in would be a benefit."
"We experience configuration issues when accommodating the volumes we require, which often necessitates consultation with the Cassandra development team."
"The secondary index in Cassandra was a bit problematic and could be improved."
"The solution is not easy to use because it is a big database and you have to learn the interface. This is the case though in most of these solutions."
"The solution is limited to a linear performance."
"There were challenges with the query language and the development interface. The query language, in particular, could be improved for better optimization. These challenges were encountered while using the Java SDK."
"The solution is not fit for on-premise distributions."
"The initial setup of Cloudera is difficult."
"While the deployed product is generally functional, there are instances where it presents difficulties."
"It would be useful if Cloudera had more tools like SQL Engines that offer the traditional relational database. We have to do a lot of work preparing the data outside Cloudera before getting it into the platform."
"Cloudera Distribution for Hadoop is not always completely stable in some cases, which can be a concern for big data solutions."
"Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
"There are multiple bugs when we update."
"There are better solutions out there that have more features than this one."
 

Pricing and Cost Advice

"There are licensing fees that must be paid, but I'm not sure if they are paid monthly or yearly."
"We pay for a license."
"Cassandra is a free open source solution, but there is a commercial version available called DataStax Enterprise."
"I use the tool's open-source version."
"We are using the open-source version of Cassandra, the solution is free."
"I don't have the specific numbers on pricing, but it was fairly priced."
"I believe we pay for a three-year license."
"The pricing must be improved."
"I wouldn't recommend CDH to others because of its high cost."
"It is an expensive product."
"The price could be better for the product."
"The tool is not expensive."
"The price is very high. The solution is expensive."
"The solution is expensive."
831,997 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Cassandra
Financial Services Firm: 19%
Computer Software Company: 15%
Healthcare Company: 7%
University: 5%

Cloudera Distribution for Hadoop
Financial Services Firm: 23%
Computer Software Company: 15%
Educational Organization: 11%
Manufacturing Company: 9%
 

Company Size

[Chart: company size by reviewers, split into Large Enterprise, Midsize Enterprise, and Small Business]
 

Questions from the Community

What do you like most about Cassandra?
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-ti...
What is your experience regarding pricing and costs for Cassandra?
I am not familiar with the experience of pricing, setup cost, and licensing.
What needs improvement with Cassandra?
We experience configuration issues when accommodating the volumes we require, which often necessitates consultation with the Cassandra development team. This aspect is room for improvement. Additio...
What do you like most about Cloudera Distribution for Hadoop?
The tool can be deployed using different container technologies, which makes it very scalable.
What is your experience regarding pricing and costs for Cloudera Distribution for Hadoop?
The tool is expensive. Overall, it's not a cheap software tool, and that is why only large enterprises who are mature enough and have an architecture that is complex enough opt for Cloudera, as its...
What needs improvement with Cloudera Distribution for Hadoop?
The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. ...
 

Overview

 

Sample Customers

Cassandra: Apple, Netflix, Facebook, Instagram, Twitter, eBay, Spotify, Uber, Airbnb, Adobe, Cisco, IBM, Microsoft, Yahoo, Reddit, Pinterest, Salesforce, LinkedIn, Hulu, Walmart, Target, Sony, Intel, HP, Oracle, SAP, GE, Siemens, Volkswagen, Toyota
Cloudera Distribution for Hadoop: 37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
Find out what your peers are saying about Cassandra vs. Cloudera Distribution for Hadoop and other solutions. Updated: January 2025.