Databricks vs H2O.ai comparison

Categories and Ranking

IBM SPSS Statistics (Sponsored)
Ranking in Data Science Platforms: 10th
Average Rating: 8.0
Number of Reviews: 37
Ranking in other categories: Data Mining (3rd)

Databricks
Ranking in Data Science Platforms: 1st
Average Rating: 8.2
Number of Reviews: 82
Ranking in other categories: Streaming Analytics (1st)

H2O.ai
Ranking in Data Science Platforms: 22nd
Average Rating: 7.6
Number of Reviews: 7
Ranking in other categories: Model Monitoring (8th)
 

Mindshare comparison

As of November 2024, in the Data Science Platforms category, the mindshare of IBM SPSS Statistics is 2.8%, up from 2.6% the previous year. The mindshare of Databricks is 19.1% and the mindshare of H2O.ai is 1.5%; both match the previous year's figures to one decimal place, so any increase is smaller than the rounding shown. Mindshare is calculated based on PeerSpot user engagement data.
 

Featured Reviews

AbakarAhmat - PeerSpot reviewer
Sep 21, 2023
Enhancing survey analysis that provides valued insightfulness
I use it to analyze questionnaire surveys related to a product, solution, or application, such as open data services, which I provide to consumers and end-users. These surveys contain evaluation assessments, and I use SPSS to analyze the responses. The most valuable feature is its robust…
Dunstan Matekenya - PeerSpot reviewer
Jul 10, 2024
Processes large-scale data sets and integrates with Apache Spark in a notebook environment
I primarily use Databricks to process large-scale data sets with Apache Spark. My main use case is processing large data sets, such as 600 GB or 800 GB. Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of…
RK
Dec 11, 2018
It is helpful, intuitive, and easy to use. The learning curve is not too steep.
One example, we are able to automate life insurance. We have to underwrite policies. When somebody applies for a policy, we take their blood, then assign them a risk: substandard, standard, preferred, etc. Depending on this, we price our products. Usually the process is that you take the blood, then it goes to a lab and we get the lab results back, then an underwriter takes a look at the lab results. This is usually done in a two week time frame to get a rating. We were able to build models to automate all of this, and now, it happens in real-time. Somebody can apply online and get issued a policy right away.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The software offers consistency across multiple research projects helping us with predictive analytics capabilities."
"The best part is that they have an algorithm handbook, so you can open it up and understand how it works, and if it is useful, this is very important."
"The features that I have found most valuable are the Bayesian statistics and descriptive statistics."
"Custom tables and macros: They allow us to create useful reports quickly for a broad audience."
"Since we are using the software as a statistical tool, I would say the best aspects of it are the regression and segmentation capabilities. That said, I've used it for all sorts of things."
"The learning curve to using this product is not steep. The program is appropriate for those who do not have a lot of background in programming, yet have to perform basic statistical analysis."
"It has helped our analyst unit deliver work with more transparency and confidence, given that we can always view the dataset in totality, after each step of data transformation."
"In terms of the features I've found most valuable, I'd say the duration, the correlation, and of course the nonparametric statistics. I use it for reliability and survival analysis, time series, regression models in different solutions, and different types of solutions."
"The main features of the solution are efficiency."
"It can send out large data amounts."
"Databricks is a scalable solution. It is the largest advantage of the solution."
"The simplicity of development is the most valuable feature."
"The fast data loading process and data storage capabilities are great."
"Databricks integrates well with other solutions."
"The solution offers a free community version."
"A very valuable feature is the data processing, and the solution is specifically good at using the Spark ecosystem."
"Fast training, memory-efficient DataFrame manipulation, well-documented, easy-to-use algorithms, ability to integrate with enterprise Java apps (through POJO/MOJO) are the main reasons why we switched from Spark to H2O."
"AutoML helps in hands-free initial evaluations of efficiency/accuracy of ML algorithms."
"The most valuable features are the machine learning tools, the support for Jupyter Notebooks, and the collaboration that allows you to share it across people."
"It is helpful, intuitive, and easy to use. The learning curve is not too steep."
"One of the most interesting features of the product is their driverless component. The driverless component allows you to test several different algorithms along with navigating you through choosing the best algorithm."
"The ease of use in connecting to our cluster machines."
 

Cons

"Improvements are needed in the user interface, particularly in terms of user-friendliness."
"Better documentation on how to use macros."
"I think the visualization and charting should be changed and made easier and more effective."
"IBM SPSS Statistics does not keep you close to your data like KNIME."
"SPSS is a tool that's been around since the late 60s, and it's the universal worldwide standard for quantitative social science data analysis. That said, it does seem a bit strange to me that the graphical output functions are so clunky after all these years. The output of charts and graphs that SPSS produces is hideous."
"The solution needs to improve forecasting using time series analysis."
"It could allow adding color to data models to make them easier to interpret."
"If there is any self-generation data collection plan (DCP), it would be helpful in gathering data. It would also be useful if there is a function to scale it up to, let's say, UiPath and have it consolidate and integrate into a UiPath solution."
"In the future, I would like to see Data Lake support. That is something that I'm looking forward to."
"It would be very helpful if Databricks could integrate with platforms in addition to Azure."
"The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "No code AI" it is critical that the interface is very good."
"Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics."
"CI/CD needs additional leverage and support."
"There would also be benefits if more options were available for workers, or the clusters of the two points."
"Databricks could improve in some of its functionality."
"The tool should improve its integration with other products."
"It lacks the data manipulation capabilities of R and Pandas DataFrames. We would kill for dplyr offloading H2O."
"On the topic of model training and model governance, this solution cannot handle ten or twelve models running at the same time."
"Referring to bullet-3 as well, H2O DataFrame manipulation capabilities are too primitive."
"The interpretability module has room for improvement. Also, it needs to improve its ability to integrate with other systems, like SageMaker, and the overall integration capability."
"I would like to see more features related to deployment."
"It needs a drag and drop GUI like KNIME, for easy access to and visibility of workflows."
"The model management features could be improved."
 

Pricing and Cost Advice

"I rate the tool's pricing a five out of ten."
"The pricing of the modeler is high and can reduce the utility of the product for those who can not afford to adopt it."
"SPSS is an expensive piece of software because it's incredibly complex and has been refined over decades, but I would say it's fairly priced."
"The price of IBM SPSS Statistics could improve."
"It's quite expensive, but they do a special deal for universities."
"If it requires a lot of data processing, maybe switching to IBM SPSS Clementine would be better for the buyer."
"The price of this solution is a little bit high, which was a problem for my company."
"While the pricing of the product may be higher, the accompanying service and features justify the investment."
"Databricks is not costly when compared with other solutions' prices."
"The price of Databricks is reasonable compared to other solutions."
"I would rate the tool’s pricing an eight out of ten."
"The basic version of this solution is now open-source, so there are no license costs involved. However, there is a charge for any advanced functionality and this can be quite expensive."
"Databricks uses a price-per-use model, where you can use as much compute as you need."
"I'm not involved in the financing, but I can say that the solution seemed reasonably priced compared to the competitors. Similar products are usually in the same price range. With five being affordable and one being expensive, I would rate Databricks a four out of five."
"I rate the price of Databricks as eight out of ten."
"Whenever we want to find the actual costing, we have to send an email to Databricks, so having the information available on the internet would be helpful."
"We have seen significant ROI where we were able to use the product in certain key projects and could automate a lot of processes. We were even able to reduce staff."
814,649 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

IBM SPSS Statistics
Financial Services Firm: 16%
University: 10%
Computer Software Company: 9%
Manufacturing Company: 8%

Databricks
Financial Services Firm: 16%
Computer Software Company: 12%
Manufacturing Company: 9%
Healthcare Company: 6%

H2O.ai
Financial Services Firm: 19%
Computer Software Company: 11%
Manufacturing Company: 9%
Insurance Company: 6%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What do you like most about IBM SPSS Statistics?
The software offers consistency across multiple research projects helping us with predictive analytics capabilities.
What is your experience regarding pricing and costs for IBM SPSS Statistics?
While the pricing of the product may be higher, the accompanying service and features justify the investment. However...
What needs improvement with IBM SPSS Statistics?
In some cases, the product takes time to load a large dataset. They could improve this particular area.
Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designe...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analyti...
 

Also Known As

IBM SPSS Statistics: SPSS Statistics
Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
H2O.ai: No data available
 

Sample Customers

IBM SPSS Statistics: LDB Group, RightShip, Tennessee Highway Patrol, Capgemini Consulting, TEAC Corporation, Ironside, nViso SA, Razorsight, Si.mobil, University Hospitals of Leicester, CROOZ Inc., GFS Fundraising Solutions, Nedbank Ltd., IDS-TILDA
Databricks: Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
H2O.ai: poder.io, Stanley Black & Decker, G5, PWC, Comcast, Cisco
Find out what your peers are saying about Databricks vs. H2O.ai and other solutions. Updated: October 2024.