Databricks vs H2O.ai comparison

 

Comparison Buyer's Guide

Executive Summary

Updated on Dec 5, 2024

Categories and Ranking

Databricks
Ranking in Data Science Platforms: 1st
Average Rating: 8.2
Reviews Sentiment: 7.0
Number of Reviews: 85
Ranking in other categories: Streaming Analytics (1st)

H2O.ai
Ranking in Data Science Platforms: 20th
Average Rating: 7.6
Reviews Sentiment: 7.2
Number of Reviews: 8
Ranking in other categories: Model Monitoring (6th)
 

Mindshare comparison

As of January 2025, in the Data Science Platforms category, the mindshare of Databricks is 19.1%, up from 18.5% a year earlier. The mindshare of H2O.ai is 1.5%, down from 1.6% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
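To put the two mindshare movements on a comparable footing, the short sketch below computes both the percentage-point change and the relative change from the four figures quoted above. The function name `mindshare_change` is hypothetical and is not part of any PeerSpot tooling; this is only one reasonable way to read the numbers.

```python
def mindshare_change(current, previous):
    """Return (percentage-point change, relative change in percent).

    Both inputs are mindshare values expressed in percent.
    """
    point_change = current - previous
    relative_change = point_change / previous * 100
    # Round for display; the source figures have one decimal place.
    return round(point_change, 1), round(relative_change, 2)


# Figures taken from the mindshare paragraph above.
for product, current, previous in [
    ("Databricks", 19.1, 18.5),
    ("H2O.ai", 1.5, 1.6),
]:
    points, relative = mindshare_change(current, previous)
    print(f"{product}: {points:+} points, {relative:+}% relative")
```

The relative figure matters here because the two products start from very different bases: a shift of a tenth of a point is small in absolute terms but a larger share of H2O.ai's starting mindshare.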
 

Featured Reviews

Parag Bhosale - PeerSpot reviewer
Integrating engineering and learning, but cost challenges arise with cluster management
We often use a single cluster to ingest Databricks, which Databricks doesn't recommend. They suggest using a no-cluster solution like job clusters. This can be overwhelming for us because we started smaller. We prefer using a small to mid-sized cluster for many jobs to keep costs low, but this sometimes doesn't support our operations properly. We need to stay in sync with the DBR versions, and migrations can pose challenges. For example, issues arose when we moved a cluster from a previous version to the latest one. We could use their job clusters; however, that increases costs, which is challenging for us as a startup. Maintaining this infrastructure can be a headache.
Kashif Yaseen - PeerSpot reviewer
Plug-and-play convenience enhances productivity but needs better multimodal support
We mostly used the solution in the domain that I'm working in. Most of our use cases were around chatbots and conversational BI. The solution was plug-and-play, meaning most of the components were handled by the solution itself rather than having to be built from scratch. This was useful for our banking…

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The solution offers a free community version."
"The most valuable features of the solution are the hardware and the resources it quickly provides without much hassle."
"A very valuable feature is the data processing, and the solution is specifically good at using the Spark ecosystem."
"Databricks gives you the flexibility of using several programming languages independently or in combination to build models."
"I like the ability to use workspaces with other colleagues because you can work together even without seeing the other team's job."
"We can scale the product."
"Databricks' Lakehouse architecture has been most useful for us. The data governance has been absolutely efficient in between other kinds of solutions."
"The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type and it was implemented and invented by Databricks."
"Fast training, memory-efficient DataFrame manipulation, well-documented, easy-to-use algorithms, ability to integrate with enterprise Java apps (through POJO/MOJO) are the main reasons why we switched from Spark to H2O."
"It is helpful, intuitive, and easy to use. The learning curve is not too steep."
"AutoML helps in hands-free initial evaluations of efficiency/accuracy of ML algorithms."
"One of the most interesting features of the product is their driverless component. The driverless component allows you to test several different algorithms along with navigating you through choosing the best algorithm."
"The most valuable feature of H2O.ai is that it is plug-and-play."
"The ease of use in connecting to our cluster machines."
"The most valuable features are the machine learning tools, the support for Jupyter Notebooks, and the collaboration that allows you to share it across people."
 

Cons

"Databricks is not geared towards the end-user, but rather it is for data engineers or data scientists."
"There should be better integration with other platforms."
"I would like more integration with SQL for using data in different workspaces."
"Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks."
"The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps."
"They release patches that sometimes break our code. These patches are supposed to fix issues, but sometimes they cause disruptions."
"Some of the error messages that we receive are too vague, saying things like 'unknown exception', and these should be improved to make it easier for developers to debug problems."
"Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity."
"It lacks the data manipulation capabilities of R and Pandas DataFrames. We would kill for dplyr offloading H2O."
"I would like to see more features related to deployment."
"H2O.ai can improve in areas like multimodal support and prompt engineering."
"The model management features could be improved."
"The interpretability module has room for improvement. Also, it needs to improve its ability to integrate with other systems, like SageMaker, and the overall integration capability."
"On the topic of model training and model governance, this solution cannot handle ten or twelve models running at the same time."
"Referring to bullet-3 as well, H2O DataFrame manipulation capabilities are too primitive."
"It needs a drag and drop GUI like KNIME, for easy access to and visibility of workflows."
 

Pricing and Cost Advice

"The price of Databricks is reasonable compared to other solutions."
"Databricks uses a tiered licensing regime, so the licensing costs are flexible."
"I do not exactly know the costs, but one of our clients pays between $100 and $200 USD monthly."
"We only pay for the Azure compute behind the solution."
"We have only incurred the cost of our AWS cloud services. This is because Databricks provided us with an extended evaluation period, and we have not spent much money yet. We are just starting to incur costs this month; I will know more about the full cost later on."
"We're charged on what the data throughput is and also what the compute time is."
"We find Databricks to be very expensive, although this improved when we found out how to shut it down at night."
"The solution requires a subscription."
"We have seen significant ROI where we were able to use the product in certain key projects and could automate a lot of processes. We were even able to reduce staff."
831,158 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Databricks
Financial Services Firm: 17%
Computer Software Company: 11%
Manufacturing Company: 9%
Healthcare Company: 6%

H2O.ai
Financial Services Firm: 21%
Computer Software Company: 11%
Manufacturing Company: 10%
Energy/Utilities Company: 6%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or Python. It offers many different cluster choices and excellent integration with ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designed to accelerate innovation projects. It is based on Spark so it is very fast. It...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analytics teams that have to interpret data to further the business goals of their orga...
What needs improvement with H2O.ai?
H2O.ai can improve in areas like multimodal support and prompt engineering. They are already working on updates and changes. Although I haven't explored all the new products they've added to their ...
What is your primary use case for H2O.ai?
We mostly used the solution in the domain that I'm working in. Most of our use cases were around chatbots and conversational BI.
What advice do you have for others considering H2O.ai?
It is important to address data privacy concerns and ensure you're choosing the right vendor that meets your use case demands.
 

Also Known As

Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
H2O.ai: No data available
 

Sample Customers

Databricks: Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
H2O.ai: poder.io, Stanley Black & Decker, G5, PWC, Comcast, Cisco
Find out what your peers are saying about Databricks vs. H2O.ai and other solutions. Updated: January 2025.