
Cloudera Data Science Workbench vs Databricks vs KNIME comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

 

Mindshare comparison

As of April 2025, in the Data Science Platforms category, the mindshare of Cloudera Data Science Workbench is 1.3%, down from 1.7% a year earlier. The mindshare of Databricks is 18.2%, down from 19.1% a year earlier. The mindshare of KNIME is 11.7%, up from 9.8% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
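To make the year-over-year movement easier to compare, the figures quoted above can be turned into percentage-point and relative changes. The following is a minimal sketch in Python; the dictionary and the `mindshare_change` helper are illustrative names, not part of any PeerSpot tooling, and the only inputs are the six percentages stated in the paragraph.

```python
# Mindshare figures quoted in the text: (previous year %, April 2025 %).
MINDSHARE = {
    "Cloudera Data Science Workbench": (1.7, 1.3),
    "Databricks": (19.1, 18.2),
    "KNIME": (9.8, 11.7),
}


def mindshare_change(previous, current):
    """Return (percentage-point change, relative change in percent)."""
    points = round(current - previous, 1)
    relative = round((current - previous) / previous * 100, 1)
    return points, relative


for product, (previous, current) in MINDSHARE.items():
    points, relative = mindshare_change(previous, current)
    print(f"{product}: {points:+.1f} points ({relative:+.1f}% relative)")
```

The relative figure shows that a small percentage-point drop for a low-mindshare product can still be a large proportional change.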
 

Featured Reviews

Ismail Peer - PeerSpot reviewer
Useful for data science modeling but improvement is needed in MLOps and pricing
If you don't configure CDSW well, it might not be useful for you. Deploying the tool can vary in complexity, but most of the time it's relatively simple and straightforward. Triggering a job from data to production is easy, as the platform automates the deployment process. However, ensuring optimal resource allocation is essential for smooth operations.
ShubhamSharma7 - PeerSpot reviewer
Capability to integrate diverse coding languages in a single notebook greatly enhances workflow
Databricks offers various courses that I can use, whether it's PySpark, Scala, or R. I can leverage all these courses in a single notebook, which is beneficial for clients as they can access various tools in one place whenever needed. This is quite significant. I usually work with PySpark based on client requirements. After coding, I feed the Databricks notebooks into the ADF pipeline for updates. Databricks' capability to process data in parallel enhances data processing speed. Furthermore, I can connect our Databricks notebook directly with Power BI and other visualization tools like Qlik. Once we develop code, it allows us to transform raw data into visualizations for clients using analysis diagrams, which is very helpful.
Laurence Moseley - PeerSpot reviewer
Has a drag-and-drop interface and AI capabilities
It's difficult to pinpoint one single feature because KNIME has so many. For starters, it's very easy to learn. You can get started with just a one-hour video. The drag-and-drop interface makes it user-friendly. For example, if you want to read an Excel file, drag the "read Excel file" node from the repository, configure it by specifying the file location, and run it. This gives you a table with all your data. Next, you can clean the data by handling missing values, selecting specific columns you want to analyze, and then proceeding with your analysis, such as regression or correlation. KNIME has over 4,500 nodes available for download. In addition, KNIME offers various extensions. For instance, the text processing extension allows you to process text data efficiently. It's more powerful than other tools like NVivo and Palantir. KNIME also has AI capabilities. If you're unsure about the next step, the AI assistant can suggest the most frequently used nodes based on your previous work. Another valuable feature is the integration with Python. If you need to perform a task that requires Python, you can simply add a Python node, write the necessary code,
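The sequence the reviewer describes (read a table, handle missing values, select columns, run a correlation) is the kind of step that could also be written by hand, for example inside a Python node. The sketch below is only an illustration of that sequence using the Python standard library; it does not use KNIME's own API, the sample data is made up, and the function names are hypothetical.

```python
import csv
import io
import math

# Illustrative data only; in the reviewer's workflow the table would come
# from an Excel reader node rather than an inline string.
SAMPLE = """region,hours,score,notes
north,2.0,55,
south,4.0,70,ok
east,,64,missing hours
west,6.0,82,ok
central,8.0,95,ok
"""


def read_table(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_missing(rows, columns):
    """Keep only rows where every listed column has a non-empty value."""
    return [row for row in rows if all((row[c] or "").strip() for c in columns)]


def select_columns(rows, columns):
    """Keep the listed columns and convert their values to floats."""
    return [{c: float(row[c]) for c in columns} for row in rows]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rows = read_table(SAMPLE)
clean = select_columns(drop_missing(rows, ["hours", "score"]), ["hours", "score"])
r = pearson([row["hours"] for row in clean], [row["score"] for row in clean])
print(f"{len(clean)} rows kept, correlation = {r:.3f}")
```

Each function corresponds roughly to one node in the drag-and-drop flow, which is why the reviewer's point about needing less training for the visual version is plausible.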

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The Cloudera Data Science Workbench is customizable and easy to use."
"I appreciate CDSW's ability to logically segregate environments, such as data, DR, and production, ensuring they don't interfere with each other. The deployment of machine learning is fast and easy to manage. Its API calls are also fast."
"The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production."
"A very valuable feature is the data processing, and the solution is specifically good at using the Spark ecosystem."
"The notebooks and the ability to share them with collaborators are valuable, as multiple developers can use a single cluster."
"The tool helps with data processing and analytics with large-scale data or big data since it is associated with managing data at a large scale."
"Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy."
"The most valuable feature is the ability to use SQL directly with Databricks."
"I like cloud scalability and data access for any type of user."
"It can send out large data amounts."
"KNIME is more intuitive and easier to use, which is the principal advantage."
"It's a very powerful and simple tool to use."
"KNIME is fast and the visualization provides a lot of clarity. It clarifies your thinking because you can see what's going on with your data."
"The product is very easy to understand even for non-analytical stakeholders. Sometimes we provide them with KNIME workflows and teach them how to run it on their own machine."
"Clear view of the data at every step of ETL process enables changing the flow as needed."
"It has allowed us to easily implement advanced analytics into various processes."
"We are able to automate several functions which were done manually. I can integrate several data sets quickly and easily, to support analytics."
"The ETL which helps me to collect, reformat, and load the data from multiple sources into one destination, a storage database."
 

Cons

"The tool's MLOps is not good. Its pricing also needs to improve."
"Running this solution requires a minimum of 12GB to 16GB of RAM."
"As a data engineer, I see cluster failure in our Databricks user databases as a major issue."
"The product could be improved regarding the delay when switching to higher-performing virtual machines compared to other platforms."
"Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics."
"The solution has some scalability and integration limitations when consolidating legacy systems."
"The product could be improved by offering an expansion of their visualization capabilities, which currently assists in development in their notebook environment."
"The API deployment and model deployment are not easy on the Databricks side."
"They release patches that sometimes break our code. These patches are supposed to fix issues, but sometimes they cause disruptions."
"A lot of people are required to manage this solution."
"There should be better documentation and the steps should be easier."
"The license is quite expensive for us."
"It's pretty straightforward to understand. So, if you understand what the pipeline is, you can use the drag-and-drop functionality without much training. Doing the same thing in Python requires so much more training. That's why I use KNIME."
"The documentation is lacking and it could be better."
"It needs more examples, use cases, and MOOC to learn, especially with respect to the algorithms and how to practically create a flow from end-to-end."
"Both RapidMiner and KNIME should be made easier to use in the field of deep learning."
"In my environment, I need to access a lot of servers with different characteristics and access methods. Some of my servers have to be accessed using proxy which is not supported by KNIME, so I still need to create the middleware to supply the source of my KNIME configurations."
"KNIME could improve when it comes to large data markets."
 

Pricing and Cost Advice

"The product is expensive."
"We pay as we go, so there isn't a fixed price. It's charged by the unit. I don't have any details about how they measure this, but it should be a mix between processing and quantity of data handled. We run a simulation based on our use cases, which gives us an estimate. We've been monitoring this, and the costs have met our expectations."
"The solution requires a subscription."
"We implement this solution on behalf of our customers who have their own Azure subscription and they pay for Databricks themselves. The pricing is more expensive if you have large volumes of data."
"Databricks are not costly when compared with other solutions' prices."
"The product pricing is moderate."
"We find Databricks to be very expensive, although this improved when we found out how to shut it down at night."
"The cost for Databricks depends on the use case. I work on it as a consultant, so I'm using the client's Databricks, so it depends on how big the client is."
"The billing of Databricks can be difficult and should improve."
"At this time, I am using the free version of Knime."
"We're using the free academic license just locally. I went for KNIME because they have a free academic license."
"I use the open-source version."
"KNIME is an open-source tool, so it's free to use."
"KNIME offers a free version"
"The price for Knime is okay."
"For beginners, the free desktop version is very attractive, but the full server version can be more expensive. I have only used the free version and it offers a fair pricing system. I have been promoting it to others without any compensation or request from the company, simply because I am enthusiastic about it. I am not aware of the pricing for the server version, but it seems to be widely used."
"KNIME is free and open source."
849,335 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Cloudera Data Science Workbench: Financial Services Firm 34%, Manufacturing Company 10%, Healthcare Company 8%, Computer Software Company 8%
Databricks: Financial Services Firm 18%, Computer Software Company 10%, Manufacturing Company 9%, Healthcare Company 6%
KNIME: Financial Services Firm 12%, Manufacturing Company 11%, Computer Software Company 9%, University 8%
 

Company Size

By reviewers
[Chart: reviewer company size, split into Large Enterprise, Midsize Enterprise, and Small Business; the chart also shows a "No data available" entry]
 

Questions from the Community

What do you like most about Cloudera Data Science Workbench?
I appreciate CDSW's ability to logically segregate environments, such as data, DR, and production, ensuring they don'...
What needs improvement with Cloudera Data Science Workbench?
The tool's MLOps is not good. Its pricing also needs to improve.
What is your primary use case for Cloudera Data Science Workbench?
We have different use cases. Our banking use case uses machine learning to identify customer life events and recommen...
Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designe...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analyti...
What do you like most about KNIME?
Since KNIME is a no-code platform, it is easy to work with.
What is your experience regarding pricing and costs for KNIME?
I rate the product’s pricing a seven out of ten, where one is cheap and ten is expensive.
What needs improvement with KNIME?
I have seen the potential to interact with Python, which is currently a bit limited. I am interested in the newer ver...
 

Also Known As

Cloudera Data Science Workbench: CDSW
Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
KNIME: KNIME Analytics Platform
 

Overview

 

Sample Customers

Cloudera Data Science Workbench: IQVIA, Rush University Medical Center, Western Union
Databricks: Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
KNIME: Infocom Corporation, Dymatrix Consulting Group, Soluzione Informatiche, MMI Agency, Estanislao Training and Solutions, Vialis AG
Find out what your peers are saying about Databricks, Knime, Amazon Web Services (AWS) and others in Data Science Platforms. Updated: March 2025.