
Cloudera DataFlow vs Databricks comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Dec 17, 2024
 

Categories and Ranking

Cloudera DataFlow
Ranking in Streaming Analytics: 14th
Average Rating: 7.2
Reviews Sentiment: 6.3
Number of Reviews: 4
Ranking in other categories: None

Databricks
Ranking in Streaming Analytics: 1st
Average Rating: 8.2
Reviews Sentiment: 7.0
Number of Reviews: 84
Ranking in other categories: Data Science Platforms (1st)
 

Mindshare comparison

As of December 2024, in the Streaming Analytics category, the mindshare of Cloudera DataFlow is 1.2%, down from 1.6% the previous year. The mindshare of Databricks is 14.3%, up from 9.9% the previous year. Mindshare is calculated based on PeerSpot user engagement data.
 

Featured Reviews

Júlio César Gomes Fonseca - PeerSpot reviewer
A stable solution that helps develop quality modules but needs to improve its programming language
The initial setup was not so difficult. The deployment took so long, at least one or two years, because the team has a project that aims to be exceptional in the future. It's good to say because the company is very good. It's a self-confirmation technical integration company. We have numerous reasons why reducing staff workload is beneficial. However, it is important to note that this does not directly apply to the application used. They will only do the service.
Dunstan Matekenya - PeerSpot reviewer
Process large-scale data sets and integrates with Apache Spark with notebook environment
Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of its strengths. Another strength is that the platform makes it very easy to manage resources. For example, setting up a cluster of five or fifteen nodes is straightforward with Databricks. The notebook environment is also excellent, making it easy to perform various tasks.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"This solution is very scalable and robust."
"DataFlow's performance is okay."
"The initial setup was not so difficult"
"The most effective features are data management and analytics."
"Databricks has improved my organization by allowing us to transform data from sources to a different format and feed that to the analytics, business intelligence, and reporting teams. This tool makes it easy to do those kinds of things."
"The initial setup is pretty easy."
"Can cut across the entire ecosystem of open source technology to give an extra level of getting the transformatory process of the data."
"The setup was straightforward."
"I like the ability to use workspaces with other colleagues because you can work together even without seeing the other team's job."
"The initial setup phase of Databricks was good."
"Ability to work collaboratively without having to worry about the infrastructure."
"Databricks is based on a Spark cluster and it is fast. Performance-wise, it is great."
 

Cons

"It's an outdated legacy product that doesn't meet the needs of modern data analysts and scientists."
"Although their workflow is pretty neat, it still requires a lot of transformation coding; especially when it comes to Python and other demanding programming languages."
"It is not easy to use the R language. Though I don't know if it's possible, I believe it is possible, but it is not the best language for machine learning."
"The tool should improve its integration with other products."
"While Databricks is generally a robust solution, I have noticed a limitation with debugging in the Delta Live Table, which could be improved."
"There should be better integration with other platforms."
"I would like to see more documentation in terms of how an end-user could use it, and users like me can easily try it and implement use cases."
"Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present."
"I'm not the guy that I'm working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one."
"The product should provide more advanced features in future releases."
"The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important before we had to write code manually and as solutions move to "No code AI" it is critical that the interface is very good."
 

Pricing and Cost Advice

"DataFlow isn't expensive, but its value for money isn't great."
"We find Databricks to be very expensive, although this improved when we found out how to shut it down at night."
"Databricks is a very expensive solution. Pricing is an area that could definitely be improved. They could provide a lower end compute and probably reduce the price."
"I do not exactly know the costs, but one of our clients pays between $100 USD and $200 USD monthly."
"The price is okay. It's competitive."
"The pricing depends on the usage itself."
"Price-wise, I would rate Databricks a three out of five."
"We have only incurred the cost of our AWS cloud services. This is because during this period, Databricks provided us with an extended evaluation period, and we have not spent much money yet. We are just starting to incur costs this month, I will know more later on the full cost perspective."
"The solution is a good value for batch processing and huge workloads."
824,053 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Computer Software Company: 18%
Financial Services Firm: 17%
University: 12%
Manufacturing Company: 7%

Financial Services Firm: 16%
Computer Software Company: 11%
Manufacturing Company: 9%
Healthcare Company: 6%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

What do you like most about Cloudera DataFlow?
The most effective features are data management and analytics.
What is your primary use case for Cloudera DataFlow?
We use Cloudera DataFlow for stream analytics.
Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or Python. It offers many different cluster choices and excellent integration with ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designed to accelerate innovation projects. It is based on Spark so it is very fast. It...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analytics teams that have to interpret data to further the business goals of their orga...
 

Also Known As

Cloudera DataFlow: CDF, Hortonworks DataFlow, HDF
Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
 

Sample Customers

Cloudera DataFlow: Clearsense
Databricks: Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
Find out what your peers are saying about Cloudera DataFlow vs. Databricks and other solutions. Updated: December 2024.