
Cassandra vs Cloudera Distribution for Hadoop comparison

 

Comparison Buyer's Guide

Categories and Ranking

Cassandra
Ranking in NoSQL Databases: 5th
Average Rating: 8.0
Reviews Sentiment: 4.4
Number of Reviews: 21
Ranking in other categories: Vector Databases (14th)

Cloudera Distribution for H...
Ranking in NoSQL Databases: 7th
Average Rating: 8.0
Reviews Sentiment: 6.4
Number of Reviews: 49
Ranking in other categories: Hadoop (2nd)
 

Mindshare comparison

As of November 2024, Cassandra's mindshare in the NoSQL Databases category is 13.6%, up from 12.0% a year earlier. The mindshare of Cloudera Distribution for Hadoop is 2.4%, down from 2.7% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
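
The year-over-year movement can be read either in percentage points or as a relative change. The short sketch below works both out from the four figures quoted above; the function name and the rounding to one decimal place are illustrative choices, not part of PeerSpot's methodology.

```python
# Mindshare figures quoted above, in percent: (previous year, November 2024).
mindshare = {
    "Cassandra": (12.0, 13.6),
    "Cloudera Distribution for Hadoop": (2.7, 2.4),
}


def mindshare_change(previous: float, current: float) -> tuple[float, float]:
    """Return (change in percentage points, relative change in percent)."""
    points = current - previous
    relative = points / previous * 100
    return round(points, 1), round(relative, 1)


changes = {
    name: mindshare_change(prev, cur) for name, (prev, cur) in mindshare.items()
}
for name, (points, relative) in changes.items():
    print(f"{name}: {points:+.1f} points ({relative:+.1f}% relative)")
```

Cassandra gains 1.6 percentage points, while Cloudera Distribution for Hadoop loses 0.3. Because Cloudera starts from a much smaller base, its small absolute drop is a relative decline of roughly 11%.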
 

Featured Reviews

Himanshu Amodwala - PeerSpot reviewer
Well-equipped to handle a massive influx of data and billions of requests
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-time updates is paramount. For instance, when a customer leaves comments or feedback on an image, they expect the change to appear on the portal immediately. Similarly, sellers altering product attributes or updating images expect instant visibility of these modifications. Handling large data volumes with Cassandra has been an excellent experience. We did face challenges related to the influx, but these were caused by middle-layer issues rather than by Cassandra itself. Generally, it scaled well with our workloads, thanks to its horizontal scaling capabilities. We could easily add new nodes to the system as needed, ensuring the platform coped well with increasing loads. The tool's most beneficial feature for scalability is its entire architecture. The absence of a single point of failure or a leader within the ecosystem contributes to its robust scalability. This key aspect influenced our decision to opt for the Cassandra ecosystem. In terms of performance, it handled approximately 1.6 billion requests per day. This was on AWS using EC2 instances, about four to five years ago.
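
To put the reviewer's figure of roughly 1.6 billion requests per day in perspective, the sketch below converts it to an average per-second rate. The review does not state the cluster size, so the node counts are purely hypothetical. The calculation also assumes a perfectly even spread of traffic, which real workloads with daily peaks will not match.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figure quoted in the review: approximately 1.6 billion requests per day.
requests_per_day = 1_600_000_000

average_rps = requests_per_day / SECONDS_PER_DAY


def per_node_rps(total_rps: float, node_count: int) -> float:
    """Average load per node if requests were spread perfectly evenly."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return total_rps / node_count


print(f"average: {average_rps:,.0f} requests/second")
for nodes in (6, 12, 24):  # hypothetical cluster sizes, not from the review
    rate = per_node_rps(average_rps, nodes)
    print(f"{nodes:>2} nodes: {rate:,.0f} requests/second per node")
```

The average works out to about 18,500 requests per second. Under the even-spread assumption, doubling the node count halves the per-node load, which is the practical meaning of the horizontal scaling the reviewer describes.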
Shahan Rehman - PeerSpot reviewer
Can host multiple technologies and help businesses with their AI initiatives
The ease or difficulty of setting up the product depends on the customer environment where the tool is deployed. For a banking, industrial, or retail sector firm, the deployment process can range from simple to very complex, depending on the size of the database, the applications to be hosted, and the architecture. For Cloudera Distribution for Hadoop, one has to go through the usual deployment process, as for any software product. You need separate environments before going into production, such as dev, test, and pre-production environments. You install and configure all the components in the test environment and then test them in the pre-production environment. Once UAT is done, you move them to the production environment. In general, it's a critical product deployed in a company.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The most valuable feature of Cassandra is its fast retrieval. Additionally, the solution can handle large amounts of data. It is the quickest application we use."
"Our primary use case for the solution is testing."
"Can achieve continuous data without a single downtime because of node to node ring architecture."
"The most valuable features are the counter features and the NoSQL schema. It also has good scalability. You can scale Cassandra to any finite level."
"The technical evaluation is very good."
"The solution's database capabilities are very good."
"Since I haven't had years of experience with it, it's still new to me. One valuable feature is its distribution, so I can run it partly in the cloud and part on-prem. That's a feature I'd like to use but haven't yet because we're trying to move to Azure. I don't know if or when that will happen. Ideally, we'd have it distributed over the cloud and on-prem simultaneously, so if something happens to our on-prem, we can keep going in the cloud, like a pay-as-you-go model with Azure."
"The most valuable features of this solution are its speed and distributed nature."
"Provides a viable open-source solution for enterprise implementations and reliable, intelligent data analysis."
"Very good end-to-end security features."
"With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
"The tool can be deployed using different container technologies, which makes it very scalable."
"The tool's most interesting features are the distributed file system and unstructured data processing capability. Because we have a lot of unstructured data, like XML and social media logs, these features make it more valuable than the usual data warehousing solutions."
"The file system is a valuable feature."
"The data science aspect of the solution is valuable."
"The search function is the most valuable aspect of the solution."
 

Cons

"Cassandra could be more user-friendly like MongoDB."
"It can be difficult to analyze what's going on inside of the database relative to other databases. It can also be difficult to troubleshoot sometimes."
"The solution doesn't have joins between tables so you need other tools for that."
"There were challenges with the query language and the development interface. The query language, in particular, could be improved for better optimization. These challenges were encountered while using the Java SDK."
"Depending upon our schema, we can't make ORDER BY or GROUP BY clauses in the product."
"I want Cassandra to update its open-source version more quickly. It's already feature-rich, but I'd appreciate better integration with other NoSQL databases like MariaDB or MongoDB. If I ever need to work with customers or vendors using different NoSQL databases, having native integration in Cassandra would make managing and interacting with their databases much easier."
"Doesn't support a solution that can give aggregation."
"Maybe they can improve their performance in data fetching from a high volume of data sets."
"The price of this solution could be lowered."
"The solution does not support multiple languages very well and this means users need to create work-arounds to implement some solutions."
"The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. The Cloudera Machine Learning aspect could be tuned and enhanced to enable us to host some predictive analytics machine learning and AI use cases."
"The governance aspect of the solution should be improved."
"Cloudera's support is extremely bad and cannot be relied on."
"I would like to see an improvement in how the solution helps me to handle the whole cluster."
"The competitors provide better functionalities."
"The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it."
 
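
Several of the Cassandra comments above mention missing joins, limited ORDER BY and GROUP BY, and a lack of aggregation, with one reviewer noting that other tools are needed for this. As a rough illustration of that kind of workaround, the sketch below joins, groups, and sorts two result sets in application code. The rows and field names are hypothetical, and no Cassandra driver is involved. In practice this work is often handed to a separate analytics tool.

```python
from collections import defaultdict

# Hypothetical rows, as two separate single-table queries might return them.
products = [
    {"product_id": 1, "name": "lamp"},
    {"product_id": 2, "name": "desk"},
]
reviews = [
    {"product_id": 1, "rating": 4},
    {"product_id": 1, "rating": 5},
    {"product_id": 2, "rating": 3},
]


def average_rating_by_product(products, reviews):
    """Join reviews to products on product_id, then average the ratings."""
    names = {p["product_id"]: p["name"] for p in products}
    ratings = defaultdict(list)
    for r in reviews:
        if r["product_id"] in names:  # inner join: skip orphaned reviews
            ratings[names[r["product_id"]]].append(r["rating"])
    # Sorting by name stands in for an ORDER BY done on the client side.
    return [(name, sum(v) / len(v)) for name, v in sorted(ratings.items())]


result = average_rating_by_product(products, reviews)
print(result)
```

This approach only works while both result sets fit in application memory. For larger data, the usual alternative is to model tables around the queries up front so that no join is needed.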

Pricing and Cost Advice

"There are licensing fees that must be paid, but I'm not sure if they are paid monthly or yearly."
"I use the tool's open-source version."
"I don't have the specific numbers on pricing, but it was fairly priced."
"We pay for a license."
"We are using the open-source version of Cassandra, the solution is free."
"Cassandra is a free open source solution, but there is a commercial version available called DataStax Enterprise."
"The price is very high. The solution is expensive."
"When comparing with Oracle Sybase and SQL, it's cheaper. It's not expensive."
"The pricing must be improved."
"The solution is expensive."
"I wouldn't recommend CDH to others because of its high cost."
"The product’s price depends from project to project."
"The price could be better for the product."
"The tool is not expensive."
816,636 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Financial Services Firm: 20%
Computer Software Company: 15%
Healthcare Company: 7%
Manufacturing Company: 5%

Financial Services Firm: 23%
Computer Software Company: 15%
Educational Organization: 11%
Manufacturing Company: 8%
 

Company Size

[Chart: company size by reviewers (Large Enterprise, Midsize Enterprise, Small Business)]
 

Questions from the Community

What do you like most about Cassandra?
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-ti...
What is your experience regarding pricing and costs for Cassandra?
I am not familiar with the experience of pricing, setup cost, and licensing.
What needs improvement with Cassandra?
We experience configuration issues when accommodating the volumes we require, which often necessitates consultation with the Cassandra development team. This aspect has room for improvement. Additio...
What do you like most about Cloudera Distribution for Hadoop?
The tool can be deployed using different container technologies, which makes it very scalable.
What is your experience regarding pricing and costs for Cloudera Distribution for Hadoop?
The tool is expensive. Overall, it's not a cheap software tool, and that is why only large enterprises who are mature enough and have an architecture that is complex enough opt for Cloudera, as its...
What needs improvement with Cloudera Distribution for Hadoop?
The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. ...
 

Sample Customers

Apple, Netflix, Facebook, Instagram, Twitter, eBay, Spotify, Uber, Airbnb, Adobe, Cisco, IBM, Microsoft, Yahoo, Reddit, Pinterest, Salesforce, LinkedIn, Hulu, Walmart, Target, Sony, Intel, HP, Oracle, SAP, GE, Siemens, Volkswagen, Toyota
37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
Find out what your peers are saying about Cassandra vs. Cloudera Distribution for Hadoop and other solutions. Updated: October 2024.