
Cloudera Data Science Workbench vs Databricks comparison

 

Comparison Buyer's Guide

Executive Summary
 

Categories and Ranking

IBM SPSS Statistics (Sponsored)
Ranking in Data Science Platforms: 10th
Average Rating: 8.0
Number of Reviews: 37
Ranking in other categories: Data Mining (3rd)

Cloudera Data Science Workb...
Ranking in Data Science Platforms: 21st
Average Rating: 7.0
Number of Reviews: 2
Ranking in other categories: No ranking in other categories

Databricks
Ranking in Data Science Platforms: 1st
Average Rating: 8.2
Number of Reviews: 82
Ranking in other categories: Streaming Analytics (1st)
 

Mindshare comparison

As of November 2024, in the Data Science Platforms category, the mindshare of IBM SPSS Statistics is 2.8%, up from 2.6% the previous year. The mindshare of Cloudera Data Science Workbench is 1.5%, down from 1.8% the previous year. The mindshare of Databricks is 19.1%, up from 19.1% the previous year. Mindshare is calculated based on PeerSpot user engagement data.
 

Featured Reviews

AbakarAhmat - PeerSpot reviewer
Sep 21, 2023
Enhancing survey analysis that provides valued insightfulness
I use it to analyze questionnaire surveys related to a product, solution, or application, such as open data services, which I provide to consumers and end-users. These surveys contain evaluation assessments, and I use SPSS to analyze the responses. The most valuable feature is its robust…
Ismail Peer - PeerSpot reviewer
Feb 13, 2024
Useful for data science modeling but improvement is needed in MLOps and pricing
We have different use cases. Our banking use case uses machine learning to identify customer life events and recommend the best-suited card products. These machine-learning models are deployed in our environment, where they run on a scheduled basis. We rely on the platform for every data science…
Dunstan Matekenya - PeerSpot reviewer
Jul 10, 2024
Processes large-scale data sets and integrates with Apache Spark in a notebook environment
I primarily use Databricks to process large-scale data sets with Apache Spark. My main use case is processing large data sets, such as 600 GB or 800 GB. Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of…

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"It offers very good visualization."
"One feature I found very valuable was the analysis of variance (ANOVA)."
"The most valuable feature of IBM SPSS Statistics is all the functionality it provides. Additionally, it is simple to do the five-way analysis that you can into multidimensional setup space. It's the multidimensional space facility that is most useful."
"Capability analysis is one of the main and valuable functions. We also do some hypothesis testing in Minitab and summary stats. These are the functions that we find very useful."
"IBM SPSS Statistics depends on AI."
"It is perfectly adequate if all you need are the results and not the trail of evidence."
"In terms of the features I've found most valuable, I'd say the duration, the correlation, and of course the nonparametric statistics. I use it for reliability and survival analysis, time series, regression models in different solutions, and different types of solutions."
"The most valuable features are the solution is easy to use, training new users is not difficult, and our usage is comprehensive because the whole service is beneficial."
"The Cloudera Data Science Workbench is customizable and easy to use."
"I appreciate CDSW's ability to logically segregate environments, such as data, DR, and production, ensuring they don't interfere with each other. The deployment of machine learning is fast and easy to manage. Its API calls are also fast."
"The built-in optimization recommendations halved query times and allowed us to reach decision points and deliver insights very quickly."
"The solution is built from Spark and has integration with MLflow, which is important for our use case."
"The most valuable features of Databricks are the notebook, data factory, and ease of use."
"A very valuable feature is the data processing, and the solution is specifically good at using the Spark ecosystem."
"We can scale the product."
"Databricks is a robust solution for big data processing, offering flexibility and powerful features."
"The integration with Python and the notebooks really helps."
"The most valuable feature of Databricks is the integration with Microsoft Azure."
 

Cons

"SPSS is a tool that's been around since the late 60s, and it's the universal worldwide standard for quantitative social science data analysis. That said, it does seem a bit strange to me that the graphical output functions are so clunky after all these years. The output of charts and graphs that SPSS produces is hideous."
"Improvements are needed in the user interface, particularly in terms of user-friendliness."
"It could provide even more in the way of automation as there are many opportunities."
"Better documentation on how to use macros."
"In some cases, the product takes time to load a large dataset. They could improve this particular area."
"I know that SPSS is a statistical tool but it should also include a little bit of analytical behavior. You can call it augmented analysis or predictive analysis. The bottom line is it should have more graphical and analytical capabilities."
"IBM SPSS Statistics could improve the visual outputs where you are producing, for example, a graph for a company board of directors, or an advert."
"The technical support should be improved."
"Running this solution requires a minimum of 12GB to 16GB of RAM."
"The tool's MLOps is not good. Its pricing also needs to improve."
"I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement."
"I would love an integration in my desktop IDE. For now, I have to code on their webpage."
"Databricks' technical support takes a while to respond and could be improved."
"This solution only supports queries in SQL and Python, which is a bit limiting."
"There would also be benefits if more options were available for workers, or the clusters of the two points."
"There are no direct connectors — they are very limited."
"Implementation of Databricks is still very code heavy."
"Some of the error messages that we receive are too vague, saying things like 'unknown exception', and these should be improved to make it easier for developers to debug problems."
 

Pricing and Cost Advice

"The price of this solution is a little bit high, which was a problem for my company."
"The pricing of the modeler is high and can reduce the utility of the product for those who cannot afford to adopt it."
"We think that IBM SPSS is expensive for this function."
"It's quite expensive, but they do a special deal for universities."
"I rate the tool's pricing a five out of ten."
"While the pricing of the product may be higher, the accompanying service and features justify the investment."
"More affordable training for new staff members."
"Our licence is on a yearly renewal basis. While pricing is not the primary concern in our evaluation, as products are assessed by whether they can meet our user needs and expertise, the cost can be a limiting factor in the number of licences we procure."
"The product is expensive."
"I do not exactly know the costs, but one of our clients pays between $100 USD and $200 USD monthly."
"I would rate the tool’s pricing an eight out of ten."
"We pay as we go, so there isn't a fixed price. It's charged by the unit. I don't have any details about how they measure this, but it should be a mix between processing and quantity of data handled. We run a simulation based on our use cases, which gives us an estimate. We've been monitoring this, and the costs have met our expectations."
"The solution uses a pay-per-use model with an annual subscription fee or package. Typically this solution is used on a cloud platform, such as Azure or AWS, but more people are choosing Azure because the price is more reasonable."
"There are different versions."
"The cost is around $600,000 for 50 users."
"We only pay for the Azure compute behind the solution."
"Whenever we want to find the actual costing, we have to send an email to Databricks, so having the information available on the internet would be helpful."
814,649 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 16%
University: 10%
Computer Software Company: 9%
Manufacturing Company: 8%

Financial Services Firm: 35%
Manufacturing Company: 11%
Healthcare Company: 9%
Government: 7%

Financial Services Firm: 16%
Computer Software Company: 12%
Manufacturing Company: 9%
Healthcare Company: 6%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

What do you like most about IBM SPSS Statistics?
The software offers consistency across multiple research projects helping us with predictive analytics capabilities.
What is your experience regarding pricing and costs for IBM SPSS Statistics?
While the pricing of the product may be higher, the accompanying service and features justify the investment. However...
What needs improvement with IBM SPSS Statistics?
In some cases, the product takes time to load a large dataset. They could improve this particular area.
What do you like most about Cloudera Data Science Workbench?
I appreciate CDSW's ability to logically segregate environments, such as data, DR, and production, ensuring they don'...
What needs improvement with Cloudera Data Science Workbench?
The tool's MLOps is not good. Its pricing also needs to improve.
What is your primary use case for Cloudera Data Science Workbench?
We have different use cases. Our banking use case uses machine learning to identify customer life events and recommen...
Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designe...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analyti...
 

Also Known As

SPSS Statistics
CDSW
Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
 

 

Overview

 

Sample Customers

LDB Group, RightShip, Tennessee Highway Patrol, Capgemini Consulting, TEAC Corporation, Ironside, nViso SA, Razorsight, Si.mobil, University Hospitals of Leicester, CROOZ Inc., GFS Fundraising Solutions, Nedbank Ltd., IDS-TILDA
IQVIA, Rush University Medical Center, Western Union
Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
Find out what your peers are saying about Cloudera Data Science Workbench vs. Databricks and other solutions. Updated: October 2024.