
Cloudera Distribution for Hadoop vs ScyllaDB comparison

 

Comparison Buyer's Guide

Executive Summary
 

Categories and Ranking

Cloudera Distribution for H...
Ranking in NoSQL Databases: 7th
Average Rating: 8.0
Reviews Sentiment: 6.4
Number of Reviews: 49
Ranking in other categories: Hadoop (2nd)
ScyllaDB
Ranking in NoSQL Databases: 2nd
Average Rating: 7.8
Number of Reviews: 12
Ranking in other categories: none
 

Mindshare comparison

As of November 2024, in the NoSQL Databases category, the mindshare of Cloudera Distribution for Hadoop is 2.4%, down from 2.7% the previous year. The mindshare of ScyllaDB is 11.0%, up from 7.8% the previous year. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Shahan Rehman - PeerSpot reviewer
Can host multiple technologies and help businesses with their AI initiatives
The ease or difficulty in setting up the product depends on the environment of the customer where the tool is deployed. If a banking, industrial, or retail sector firm is taken into consideration, depending on how big of a database is maintained, including the applications that are to be hosted, the deployment process can range from a simple to a very complex phase, depending on the architecture. For Cloudera Distribution for Hadoop, one has to go through the usual deployment process, like for any software product. You have to have different environments before going into production, like pre-production environments, test and dev environments. You install and configure all the components in the test environment and then test them on the pre-production environment. Once UAT is done, you move them to the production environment. In general, it's a critical product deployed in a company.
Uttam Giri - PeerSpot reviewer
Offers encryption and supports APIs, making it great for distributed systems
The best features of ScyllaDB are how it synchronizes data and its failover system. There's a unique formula to decide the number of nodes you need and the minimum required, which I find helpful. It also offers encryption and supports APIs, making it great for distributed systems and scaling databases across different regions. While it's easy to use, having prior experience helps configure it properly. There are many configurations; if you don't understand them, you might mess up the design. So, understanding your system's needs, like whether it requires more read or write operations, is crucial for setting up the correct configuration.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The search function is the most valuable aspect of the solution."
"The data science aspect of the solution is valuable."
"The solution's most valuable feature is the enterprise data platform."
"Provides a viable open-source solution for enterprise implementations and reliable, intelligent data analysis."
"CDH has a wide variety of proprietary tools that we use, like Impala. So from that perspective, it's quite useful as opposed to something open-source. We get a lot of value from Cloudera's proprietary tools."
"The file system is a valuable feature."
"We had a data warehouse before all the data. We can process a lot more data structures."
"The tool's most interesting features are the distributed file system and unstructured data processing capability. Because we have a lot of unstructured data, like XML and social media logs, these features make it more valuable than the usual data warehousing solutions."
"It is lightweight, and it requires less infrastructure."
"The performance aspects of Scylla are good, as always... A good point about Scylla is that it can be used extensively."
"The product's most valuable features are efficiency and reliability."
"The documentation is good. It integrates easily with our existing data infrastructure."
"The performance and scalability are good, and we hardly see any major issues with ScyllaDB."
"Firstly, if I update something, it's most likely to finish within milliseconds."
"I like how fast it is to query data from the ScyllaDB node!"
"ScyllaDB is fast and reliable. It has good performance."
 

Cons

"The price of this solution could be lowered."
"The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. The Cloudera Machine Learning aspect could be tuned and enhanced to enable us to host some predictive analytics machine learning and AI use cases."
"The user infrastructure and user interface needs to be improved, as well as the performance. The GUI needs to be better."
"Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
"There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon."
"The security of this solution could be improved. There should also be a way to basically have a blockchain enabled storage with the HDFS."
"There are multiple bugs when we update."
"It could be faster and more user-friendly."
"ScyllaDB needs to improve its handling of transactions."
"The documentation of Scylla is an area with shortcomings and needs to be improved."
"It seems we have better options available. So probably don't go for ScyllaDB. The reason is, first, it's very high. It's not as straightforward as, like, Postgres or ClickHouse to set up. It requires a complex setup."
"The product needs to add more features and improve the response time of the support team."
"We faced several challenges while integrating ScyllaDB into our AWS environment. One common issue was that a security port wasn’t opened on one node, preventing data synchronization across clusters. We noticed the data wasn’t syncing correctly when we saw different record counts in other regions. After investigating, we found that the port was closed in one AWS region. Once we opened the port, the data synchronization across all nodes resumed as expected."
"From a sales pitch standpoint, it needs to deliver on promises of better ROI and compaction."
"Some of the regular commands in NoSQL do not work."
"Data export, along with how we can purchase the data periodically, needs to be improved so that the storage is within control. Then, we could optimize it even better."
 
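
The AWS synchronization issue quoted above came down to a closed port on one node. As a rough illustration of how such a problem might be spotted early, the sketch below tries a TCP connection to each node on each port of interest and lists the pairs that fail. The function names are hypothetical, and the review does not say which port was closed. Ports 7000 (inter-node traffic) and 9042 (CQL clients) are ScyllaDB's documented defaults, but a given cluster may use others.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when the port cannot be reached from this machine.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_unreachable(nodes, ports, timeout: float = 2.0):
    """Return the (host, port) pairs that could not be reached."""
    return [
        (host, port)
        for host in nodes
        for port in ports
        if not port_is_open(host, port, timeout)
    ]
```

Calling `find_unreachable(node_addresses, [7000, 9042])` from a machine in each region returns an empty list when every node accepts connections on both ports. The check only reflects reachability from the machine that runs it, so a port blocked between two specific regions has to be tested from inside one of them.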

Pricing and Cost Advice

"The product’s price depends from project to project."
"The solution is expensive."
"I wouldn't recommend CDH to others because of its high cost."
"I haven't bought a license for this solution. I'm only using the Apache license version."
"The price could be better for the product."
"Cloudera requires a license to use."
"The tool is not expensive."
"Cloudera Distribution for Hadoop is expensive, with support costs involved."
"I believe that there is a yearly licensing cost and that it's expensive."
"It's free."
"The paid version of ScyllaDB is not that expensive. The main advantage of the paid version is direct support from the ScyllaDB team, which can resolve issues faster—typically within a day, compared to two to three days with the free version. The paid version also offers better guidance and support, while the free version has good documentation and is more high-level. I’d rate their support team nine out of ten because of the quick responses from their community."
"It's a bit expensive."
"It is an expensive tool compared to its competitor."
816,406 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 23%
Computer Software Company: 15%
Educational Organization: 10%
Manufacturing Company: 8%

Computer Software Company: 18%
Financial Services Firm: 14%
Media Company: 6%
Educational Organization: 6%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What do you like most about Cloudera Distribution for Hadoop?
The tool can be deployed using different container technologies, which makes it very scalable.
What is your experience regarding pricing and costs for Cloudera Distribution for Hadoop?
The tool is expensive. Overall, it's not a cheap software tool, and that is why only large enterprises who are mature enough and have an architecture that is complex enough opt for Cloudera, as its...
What needs improvement with Cloudera Distribution for Hadoop?
The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. ...
What do you like most about Scylla?
The performance aspects of Scylla are good, as always... A good point about Scylla is that it can be used extensively.
What is your experience regarding pricing and costs for Scylla?
The paid version of ScyllaDB is not that expensive. The main advantage of the paid version is direct support from the ScyllaDB team, which can resolve issues faster—typically within a day, compared...
What needs improvement with Scylla?
We faced several challenges while integrating ScyllaDB into our AWS environment. One common issue was that a security port wasn’t opened on one node, preventing data synchronization across clusters....
 

Learn More

 

Overview

 

Sample Customers

37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
IBM, Investing.com, mParticle, Comcast, GE, Fanatics, Ola, CERN, adgear, Samsung
Find out what your peers are saying about Cloudera Distribution for Hadoop vs. ScyllaDB and other solutions. Updated: October 2024.