
Cassandra vs Cloudera Distribution for Hadoop comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Jan 7, 2025


Categories and Ranking

Cassandra
Ranking in NoSQL Databases: 4th
Average Rating: 8.0
Reviews Sentiment: 6.1
Number of Reviews: 23
Ranking in other categories: Vector Databases (14th)

Cloudera Distribution for H...
Ranking in NoSQL Databases: 8th
Average Rating: 8.0
Reviews Sentiment: 6.4
Number of Reviews: 50
Ranking in other categories: Hadoop (2nd)
 

Mindshare comparison

As of March 2025, Cassandra's mindshare in the NoSQL Databases category is 11.5%, down from 12.6% a year earlier. The mindshare of Cloudera Distribution for Hadoop is 1.9%, down from 3.0% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Himanshu Amodwala - PeerSpot reviewer
Well-equipped to handle a massive influx of data and billions of requests
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-time updates is paramount. For instance, when a customer leaves comments or feedback on an image, they anticipate an immediate reflection of these changes on the portal. Similarly, sellers altering product attributes or updating images expect instant visibility of these modifications. Handling large data volumes with Cassandra has been an excellent experience. The challenges we did face with the influx were attributable not to Cassandra itself but to middle-layer issues. Generally, it demonstrated scalability with workloads, thanks to its horizontal scaling capabilities. We could easily add new nodes to the system as needed, ensuring the platform coped well with increasing loads. The tool's most beneficial feature for scalability is its entire architecture. The absence of a single point of failure or a leader within the ecosystem contributes to its robust scalability. This key aspect influenced our decision to opt for the Cassandra ecosystem. In terms of performance, it demonstrated the ability to handle approximately 1.6 billion requests per day. This was achieved on AWS using EC2 instances, about four to five years ago.
Rok Dolinsek - PeerSpot reviewer
Enables on-premise implementation with powerful data processing capabilities
This is the only solution that is possible to install on-premise. Cloudera provides a hybrid solution that combines compute on cloud or on-premises. It includes all machine learning algorithms in the Spark machine learning library. All functionalities needed for a big data platform and ETL are on the platform, eliminating the need for other tools. It is scalable, ready for vertical scaling, and very powerful, offering numerous functionalities and configurations for generative AI.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"Since I haven't had years of experience with it, it's still new to me. One valuable feature is its distribution, so I can run it partly in the cloud and partly on-prem. That's a feature I'd like to use but haven't yet because we're trying to move to Azure. I don't know if or when that will happen. Ideally, we'd have it distributed over the cloud and on-prem simultaneously, so if something happens to our on-prem, we can keep going in the cloud, like a pay-as-you-go model with Azure."
"Some of the valued features of this solution are it has good performance and failover."
"The time series data was one of the best features along with auto publishing."
"I am satisfied with the performance."
"The most valuable features are the counter features and the NoSQL schema. It also has good scalability. You can scale Cassandra to any finite level."
"The most valuable feature of Cassandra is its fast retrieval. Additionally, the solution can handle large amounts of data. It is the quickest application we use."
"The most valuable features of this solution are its speed and distributed nature."
"The most valuable features of Cassandra are its scaling capabilities and its non-SQL nature capabilities."
"The solution is stable."
"The product as a whole is good."
"Cloudera, as a whole, is designed to provide organizations with solutions for big data."
"With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
"I don't see any performance issues."
"The feature I find most valuable is that the solution is easy to install and to work with. It starts with the installation, and from there on the management is very simple and centralized."
"The tool can be deployed using different container technologies, which makes it very scalable."
"Very good end-to-end security features."
 

Cons

"There were challenges with the query language and the development interface. The query language, in particular, could be improved for better optimization. These challenges were encountered while using the Java SDK."
"Doesn't support a solution that can give aggregation."
"Interface is not user friendly."
"It can be difficult to analyze what's going on inside of the database relative to other databases. It can also be difficult to troubleshoot sometimes."
"Cassandra is very complex to manage. Sometimes, I need to involve a senior DevOps engineer if we encounter a problem."
"Cassandra can improve by adding more built-in tools. For example, if we want to do some maintenance activities in the cluster, we have to depend on third-party tools. Having these tools built in would be a benefit."
"Fine-tuning was a bit of a challenge."
"There could be more integration, and it could be more user-friendly."
"It is quite complicated to configure and install. Integrating the platform into an information system is always a challenge, especially when starting with on-premise implementation."
"This is a very expensive solution."
"The pricing needs to improve."
"It is quite complicated to configure and install."
"The tool's ability to be deployed on a cloud model is an area of concern where improvements are required."
"Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. It's not an option actually for us to have a big data environment."
"Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
"Cloudera's support is extremely bad and cannot be relied on."
 

Pricing and Cost Advice

"We pay for a license."
"I don't have the specific numbers on pricing, but it was fairly priced."
"Cassandra is a free open source solution, but there is a commercial version available called DataStax Enterprise."
"I use the tool's open-source version."
"There are licensing fees that must be paid, but I'm not sure if they are paid monthly or yearly."
"We are using the open-source version of Cassandra, the solution is free."
"When comparing with Oracle Sybase and SQL, it's cheaper. It's not expensive."
"The pricing must be improved."
"I believe we pay for a three-year license."
"I wouldn't recommend CDH to others because of its high cost."
"The solution is expensive."
"The price could be better for the product."
"The tool is expensive...For the SMB market or customers whose environments are not that complex and do not have multiple systems running, Cloudera might not be a good option."
"The tool is not expensive."
839,422 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 20%
Computer Software Company: 14%
Healthcare Company: 8%
University: 5%

Financial Services Firm: 23%
Computer Software Company: 14%
Educational Organization: 12%
Manufacturing Company: 8%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Cassandra?
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-ti...
What is your experience regarding pricing and costs for Cassandra?
For us, the key component on the developer side is that it's very easy. We are doing so instead of managing servers, we have used our on-premise installations. The cost for us is negligible.
What needs improvement with Cassandra?
We found some issues with the batch inserts when the data volume is large. Batch insertion is needed when I want to insert a million records at a time. We have some settings for batch inserting com...
What do you like most about Cloudera Distribution for Hadoop?
The tool can be deployed using different container technologies, which makes it very scalable.
What is your experience regarding pricing and costs for Cloudera Distribution for Hadoop?
The price for Cloudera is average, yet it is very good compared to other solutions. It can be deployed on-premises, unlike competitors' cloud-only solutions.
What needs improvement with Cloudera Distribution for Hadoop?
It is quite complicated to configure and install. Integrating the platform into an information system is always a challenge, especially when starting with on-premise implementation. Integrating wit...
 


Sample Customers

Apple, Netflix, Facebook, Instagram, Twitter, eBay, Spotify, Uber, Airbnb, Adobe, Cisco, IBM, Microsoft, Yahoo, Reddit, Pinterest, Salesforce, LinkedIn, Hulu, Walmart, Target, Sony, Intel, HP, Oracle, SAP, GE, Siemens, Volkswagen, Toyota
37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
Find out what your peers are saying about Cassandra vs. Cloudera Distribution for Hadoop and other solutions. Updated: January 2025.