Cloudera Data Science Workbench vs Dataiku comparison

Comparison Buyer's Guide

Executive Summary
 

Categories and Ranking

IBM SPSS Statistics (Sponsored)
Ranking in Data Science Platforms: 10th
Average Rating: 8.0
Number of Reviews: 37
Ranking in other categories: Data Mining (3rd)

Cloudera Data Science Workb...
Ranking in Data Science Platforms: 21st
Average Rating: 7.0
Number of Reviews: 2
Ranking in other categories: No ranking in other categories

Dataiku
Ranking in Data Science Platforms: 7th
Average Rating: 8.0
Number of Reviews: 8
Ranking in other categories: No ranking in other categories
 

Mindshare comparison

As of November 2024, in the Data Science Platforms category, the mindshare of IBM SPSS Statistics is 2.8%, up from 2.6% the previous year. The mindshare of Cloudera Data Science Workbench is 1.5%, down from 1.8% the previous year. The mindshare of Dataiku is 11.5%, up from 7.5% the previous year. Mindshare is calculated based on PeerSpot user engagement data.
 

Featured Reviews

AbakarAhmat - PeerSpot reviewer
Enhancing survey analysis that provides valued insightfulness
I used traditional tools where I would prepare data, click through menus, and use SQL Server for data visualization. We switched to IBM SPSS because it offers strong certification and aligns well with the standards we prioritize in our surveys. In terms of popularity, it stands out as the top choice in the market, especially in the research and university domains. Many different organizations and institutions use SPSS for statistical analytics. While there are other tools like MCLab and similar options available, SPSS is the most renowned and widely used among them.
Ismail Peer - PeerSpot reviewer
Useful for data science modeling but improvement is needed in MLOps and pricing
If you don't configure CDSW well, it might not be useful for you. Deploying the tool can vary in complexity, but most of the time, it's relatively simple and straightforward. Triggering a job from data to production is easy, as the platform automates the deployment process. However, ensuring optimal resource allocation is essential for smooth operations.
Sabrine Bendimerad - PeerSpot reviewer
Saves a lot of time because I can quickly handle all the data preparation tasks and concentrate on building my machine learning algorithms
One of the main challenges was collaboration. Developers typically use GitHub to push and manage code, but integrating GitHub with Dataiku was complicated. While it was theoretically possible to use GitHub with Dataiku, in practice, it was difficult to manage our code effectively and push it from Dataiku to GitHub. Another limitation was its ability to handle different types of data. While Dataiku is powerful for working with structured data, like regular or geospatial data, it struggled with more complex data types such as text and image. In addition to the challenges with GitHub integration, the limited support for diverse data types was another feature lacking at that time.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"You can find a complete algorithm in the solution and use it. You don't need to write your own algorithms for predictive analytics. That's the most valuable feature and the main one we use."
"SPSS can handle whatever you throw at it, whether your data set contains 10,000, 100,000, or a million objects. It's like the heavy artillery of analytical tools."
"The most valuable feature is the user interface because you don't need to write code."
"They have many existing algorithms that we can use and use effectively to analyze and understand how to put our data to work to improve what we do."
"I've found the descriptive statistics and cross-tabs valuable. The very simple correlations and regressions are as well."
"The software offers consistency across multiple research projects helping us with predictive analytics capabilities."
"The SPSS interface is very accessible and user-friendly. It's really easy to get information in it. I've shared it with experts and beginners, and everyone can navigate it."
"In terms of simplicity, I think the SPSS basic can handle it."
"The Cloudera Data Science Workbench is customizable and easy to use."
"I appreciate CDSW's ability to logically segregate environments, such as data, DR, and production, ensuring they don't interfere with each other. The deployment of machine learning is fast and easy to manage. Its API calls are also fast."
"The most valuable feature is the set of visual data preparation tools."
"Extremely easy to use with its GUI-based functionality and large compatibility with various data sources. Also, maintenance processes are much more automated than ever, with fewer errors."
"The solution is quite stable."
"The most valuable feature of this solution is that it is one tool that can do everything, and you have the ability to very easily push your design to prediction."
"I like the interface, which is probably my favorite part of the solution. It is really user-friendly for an IT person."
"Data Science Studio's data science model is very useful."
"Cloud-based process run helps in not keeping the systems on while processes are running."
"The advantage is that you can focus on machine learning while having access to what they call 'recipes.' These recipes allow me to preprocess and prepare data without writing any code."
 

Cons

"The solution needs to improve forecasting using time series analysis."
"It would be helpful if there was better documentation on how to properly use the solution. A beginner's guide on how to use the various programming functions within the product would be so useful to a lot of people. I found that everything was very confusing at first. Having clear documentation would help alleviate that."
"SPSS slows down the computer or the laptop if the data is huge; then you need a faster computer."
"There is a learning curve; it's not very steep, but there is one."
"In some cases, the product takes time to load a large dataset. They could improve this particular area."
"The solution needs more planning tools and capabilities."
"I know that SPSS is a statistical tool but it should also include a little bit of analytical behavior. You can call it augmented analysis or predictive analysis. The bottom line is it should have more graphical and analytical capabilities."
"One of the areas that should be similar to Minitabs is the use of blogs. The Minitabs blog helps users understand the tools and gives lots of practical examples. Following the SPSS manual is cumbersome. It's a good, exhaustive manual, but it's not practical to use. With Minitabs, you can go to the blogs and find specific articles written about various components and it's very helpful. Without blogs, we find SPSS more complicated."
"The tool's MLOps is not good. Its pricing also needs to improve."
"Running this solution requires a minimum of 12GB to 16GB of RAM."
"In the next release of this solution, I would like to see deep learning better integrated into the tool and not simply an extension or plugin."
"I think it would help if Data Science Studio added some more features and improved the data model."
"Dataiku still needs some coding, and that could be a difference where business data scientists would go for DataRobot more than Dataiku."
"Server up-time needs to be improved. Also, query engines like Spark and Hive need to be more stable."
"The interface for the web app can be a bit difficult. It needs to have better capabilities, at least for developers who like to code. This is due to the fact that everything is enabled in a single window with different tabs. For them to actually develop and do the concurrent testing that needs to be done, it takes a bit of time. That is one improvement that I would like to see - from a web app developer perspective."
"I find that it is a little slow during use. It takes more time than I would expect for operations to complete."
"The ability to have charts right from the explorer would be an improvement."
"One of the main challenges was collaboration. Developers typically use GitHub to push and manage code, but integrating GitHub with Dataiku was complicated."
 

Pricing and Cost Advice

"While the pricing of the product may be higher, the accompanying service and features justify the investment."
"Our licence is on a yearly renewal basis. While pricing is not the primary concern in our evaluation, as products are assessed by whether they can meet our user needs and expertise, the cost can be a limiting factor in the number of licences we procure."
"I rate the tool's pricing a five out of ten."
"The pricing of the modeler is high and can reduce the utility of the product for those who cannot afford to adopt it."
"We think that IBM SPSS is expensive for this function."
"It's quite expensive, but they do a special deal for universities."
"If it requires a lot of data processing, maybe switching to IBM SPSS Clementine would be better for the buyer."
"The price of IBM SPSS Statistics could improve."
"The product is expensive."
"The annual licensing fees are approximately €20 ($22 USD) per key for the basic version and €40 ($44 USD) per key for the version with everything."
"Pricing is pretty steep. Dataiku is also not that cheap."
816,933 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 17%
Computer Software Company: 9%
University: 9%
Manufacturing Company: 8%

Financial Services Firm: 35%
Manufacturing Company: 11%
Healthcare Company: 9%
Government: 7%

Financial Services Firm: 18%
Educational Organization: 16%
Manufacturing Company: 9%
Computer Software Company: 8%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

What do you like most about IBM SPSS Statistics?
The software offers consistency across multiple research projects helping us with predictive analytics capabilities.
What is your experience regarding pricing and costs for IBM SPSS Statistics?
The cost of IBM SPSS Statistics is managed by organizations, not individual researchers. It is a very expensive produ...
What needs improvement with IBM SPSS Statistics?
IBM SPSS Statistics does not keep you close to your data like KNIME. In KNIME, at every stage, you can see the result...
What do you like most about Cloudera Data Science Workbench?
I appreciate CDSW's ability to logically segregate environments, such as data, DR, and production, ensuring they don'...
What needs improvement with Cloudera Data Science Workbench?
The tool's MLOps is not good. Its pricing also needs to improve.
What is your primary use case for Cloudera Data Science Workbench?
We have different use cases. Our banking use case uses machine learning to identify customer life events and recommen...
What needs improvement with Dataiku Data Science Studio?
One of the main challenges was collaboration. Developers typically use GitHub to push and manage code, but integratin...
What is your primary use case for Dataiku Data Science Studio?
We use the solution for data science and machine learning.
 

Also Known As

SPSS Statistics
CDSW
Dataiku DSS
 

Sample Customers

LDB Group, RightShip, Tennessee Highway Patrol, Capgemini Consulting, TEAC Corporation, Ironside, nViso SA, Razorsight, Si.mobil, University Hospitals of Leicester, CROOZ Inc., GFS Fundraising Solutions, Nedbank Ltd., IDS-TILDA
IQVIA, Rush University Medical Center, Western Union
BGL BNP Paribas, Dentsu Aegis, Link Mobility Group, AramisAuto
Find out what your peers are saying about Cloudera Data Science Workbench vs. Dataiku and other solutions. Updated: October 2024.