Head of Business Integration and Architecture at Jakala
Highly scalable data platform that offers exceptional performance and valuable data types unique to this solution
Pros and Cons
- "The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type and it was invented and implemented by Databricks."
- "The data visualization for this solution could be improved. They have started to roll out a data visualization tool inside Databricks but it is in the early stages. It's not comparable to a solution like Power BI, Luca, or Tableau."
What is our primary use case?
We use this solution for our Customer Data Platform (CDP). My company works in the MarTech space, and we usually implement custom CDPs.
What is most valuable?
The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type and it was invented and implemented by Databricks. It is the most important element of the solution. Databricks also offers exceptional performance and scalability.
What needs improvement?
The data visualization for this solution could be improved. They have started to roll out a data visualization tool inside Databricks but it is in the early stages. It's not comparable to a solution like Power BI, Luca, or Tableau.
In a future release, we would like to have a better ETL designer tool to assist in the way we move data from one place to another.
For how long have I used the solution?
We have been using this solution for four years.
Buyer's Guide
Databricks
February 2025

Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: February 2025.
838,713 professionals have used our research since 2012.
What do I think about the stability of the solution?
This is a stable solution.
What do I think about the scalability of the solution?
This is a scalable solution.
How was the initial setup?
The initial setup is very easy. It is a managed solution inside Azure so you just need to search for Databricks. There are a couple of pages to follow in the setup wizard and Databricks is up and running.
What's my experience with pricing, setup cost, and licensing?
We implement this solution on behalf of our customers who have their own Azure subscription and they pay for Databricks themselves. The pricing is more expensive if you have large volumes of data.
Which other solutions did I evaluate?
When we first started using Databricks in 2018, there were not many comparable solutions to consider. Right now there are many, including Snowflake, Azure Synapse, Redshift, and BigQuery.
Databricks continues to be our solution of choice, but Snowflake has a better user interface and makes it easier to work with data pipelines.
What other advice do I have?
I would advise others to first define a strong data strategy and then choose which data platform suits your needs.
I would rate this solution a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.

STI Data Leader at grupo gtd
Easy to use with a free community version and helpful documentation
Pros and Cons
- "The solution offers a free community version."
- "We'd like a more visual dashboard for analysis. It needs a better UI."
What is most valuable?
I like the simplicity and ease of use.
You can deploy the solution to many clouds easily.
The initial setup is straightforward.
The solution offers a free community version.
What needs improvement?
The auto models can be improved.
We would like to create auto models the way we can in Microsoft Azure Machine Learning, which offers these features both with and without code. Databricks needs this.
We need more connectors between on-premises and the cloud.
We'd like a more visual dashboard for analysis. It needs a better UI.
For how long have I used the solution?
I've used the solution for one and a half months.
What do I think about the stability of the solution?
The solution is very stable. There are no bugs or glitches. It doesn't crash or freeze.
What do I think about the scalability of the solution?
Scalability is no problem. We created a cluster at the beginning, and if we need more performance in the future, for example to accelerate training, we can change the cluster. It's quite straightforward.
We have five people using the solution.
In one or two years, we'd like to promote the solution to clients and increase usage. Right now, the way it is used is limited. I know that some banks and aeronautics companies use it.
How are customer service and support?
In terms of technical support, for now, we use the community.
Which solution did I use previously and why did I switch?
We are also aware of KNIME, Azure Machine Learning, and Anaconda. In Anaconda, we use many frameworks, for example.
We started with other platforms, like Azure Machine Learning, because AutoML makes them easy to use. However, now that we have more skills, we need other tools or platforms like Databricks. It's a good platform for our employees to develop and deploy machine learning.
How was the initial setup?
The implementation is quite easy. It's not complex or difficult. The first time, I did it using a tutorial which was quite helpful. Later, I took a course. I know it quite well.
The deployment only takes a few days.
You only need to deploy or maintain the solution.
What about the implementation team?
We did not need any outside assistance in terms of setting up the solution.
What's my experience with pricing, setup cost, and licensing?
For us, this product is free. We use the community version.
I am interested in using the enterprise version, however. Whether we use it or not depends on the projects and customers we get.
What other advice do I have?
I work with a solution provider. We are a Databricks customer.
We are not partners of Databricks. We are partnered only with Microsoft Azure and Amazon AWS.
We are using the latest version of the solution. However, I do not know the exact version number.
I still need time with the solution before providing advice to others. I need to prepare the capacity internally. So far, it's been great.
I'd rate the solution eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Strategic Alliances & Ecosystems Manager at an outsourcing company with 501-1,000 employees
Helps to have a good data presence but needs to incorporate learning aspects
Pros and Cons
- "Databricks has helped us have a good presence in data."
- "The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with."
What is our primary use case?
The product has helped in data fabrication.
How has it helped my organization?
Databricks has helped us have a good presence in data.
What needs improvement?
The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with.
For how long have I used the solution?
I have been using the product for more than six months.
What do I think about the stability of the solution?
I rate Databricks' stability an eight out of ten.
What do I think about the scalability of the solution?
I rate the tool's scalability an eight out of ten.
How was the initial setup?
The transition to Databricks was smooth.
What's my experience with pricing, setup cost, and licensing?
Databricks' price is high.
What other advice do I have?
I rate the solution a nine out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer:
Senior Software Engineer at a computer software company with 201-500 employees
Valuable data analysis and engineering features with an easy setup
Pros and Cons
- "The setup is quite easy."
- "Can be improved by including drag-and-drop features."
What is our primary use case?
Our primary use case for the solution is data analysis. Databricks provides a Spark cluster environment with a driver, so we can analyze huge amounts of data, gigabytes of it, and create Notebooks in Databricks. We can write SQL commands, Python code, Scala, or Spark with Python. With Databricks, we get a cluster hosted in the public cloud and we adjust it based on how much we use it.
What is most valuable?
The most valuable features are data engineering and data science because we can create Notebooks on them. We can use any Python library to build data science models, or we can use libraries like Seaborn or Matplotlib to create charts based on data for data analysis. It is a really valuable capability.
What needs improvement?
Microsoft Azure has its own learning environment on the Microsoft website, where we can complete certifications. The Databricks certification is more expensive than Microsoft's: it costs between $2,000 and $2,500, and the knowledge is linked. Users are also charged even if they don't want to analyze large amounts of data. Hence, we would like free capacity for student users so that people can learn and build their professional skills.
For how long have I used the solution?
We have been using the solution for approximately one year.
What do I think about the stability of the solution?
The solution is stable. Microsoft offers a public service, and we can get it from the Databricks website. Additionally, many companies use it to analyze their data or create a Spark cluster to run Python or SQL scripts based on their data. I rate the stability a nine out of ten.
How was the initial setup?
The setup is quite easy, and Databricks has also partnered with Microsoft, so we get this service on Microsoft Azure.
What was our ROI?
We have seen a return on investment.
What's my experience with pricing, setup cost, and licensing?
We have a pay-as-you-go subscription and pay for it based on our usage.
Which other solutions did I evaluate?
We chose this solution because my company uses Microsoft Azure for a project, and my role as a data engineer primarily focuses on data-related services. For storing data, we use Data Lake; similarly, for the data processing engine, we use Spark, which Databricks provides.
What other advice do I have?
I rate the solution an eight out of ten. The solution is good but can be improved by including drag-and-drop features because it can be helpful for users who are unfamiliar with coding. I advise new users to have prior experience with Python or SQL before utilizing this solution if they use it for data science or model building.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Helps users with data processing and analytics
Pros and Cons
- "The tool helps with data processing and analytics with large-scale data or big data since it is associated with managing data at a large scale."
- "The biggest problem associated with the product is that it is quite pricey."
What is our primary use case?
I use Databricks to manage the setting up of data lakes for SaaS.
What needs improvement?
The biggest problem associated with the product is that it is quite pricey. We cannot find a better solution than Databricks in the market currently.
For how long have I used the solution?
I have been using Databricks for a year.
What's my experience with pricing, setup cost, and licensing?
It is an expensive tool. The licensing model is a pay-as-you-go one.
What other advice do I have?
The tool helps with data processing and analytics with large-scale data or big data since it is associated with managing data at a large scale.
For my general use cases, I would say that I am not a technical person, so I cannot explain to you how the tool helps with the area of data engineering tasks.
There is another team in my company that is involved in the use of machine learning and AI features in Databricks. My team is mostly into operations. The tool is used in a multi-country project.
For example, in my company, purchasing decisions about solutions are based on which product the whole company has chosen.
I rate the tool an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Jul 17, 2024
Associate Principal - Data Engineering at a tech services company with 10,001+ employees
It's a unified platform that lets you do streaming and batch processing in the same place
Pros and Cons
- "I like that Databricks is a unified platform that lets you do streaming and batch processing in the same place. You can do analytics, too. They have added something called Databricks SQL Analytics, allowing users to connect to the data lake to perform analytics. Databricks also enables you to share your data securely. It integrates with your reporting system as well."
- "Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity."
What is our primary use case?
We build data solutions for the banking industry. Previously, we worked with AWS, but now we are on Azure. My role is to assess the current legacy applications and provide cloud alternatives based on the customers' requirements and expectations.
Databricks is a unified platform that provides features like streaming and batch processing. All the data scientists, analysts, and engineers can collaborate on a single platform. It has all the features you need, so you don't need to go for any other tool.
What is most valuable?
I like that Databricks is a unified platform that lets you do streaming and batch processing in the same place. You can do analytics, too. They have added something called Databricks SQL Analytics, allowing users to connect to the data lake to perform analytics. Databricks also enables you to share your data securely. It integrates with your reporting system as well.
Unity Catalog provides data lineage and metadata capabilities. These are some of the unique features that fulfill all the requirements of the banking domain.
What needs improvement?
Every tool has room for improvement. Normally, a solution will claim it can do ETL and everything else, but you encounter some limitations when you actually start. Then you keep interacting with the vendor, and they continue to upgrade it. For example, we haven't fully implemented Databricks Unity Catalog, a newly introduced feature. We need to check how it works, and it may turn out to need improvements as well.
Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity.
For how long have I used the solution?
I have been using Databricks for a year.
What do I think about the scalability of the solution?
Databricks relies on scalability and performance. Every cloud vendor prioritizes scalability, high availability, performance, and security. These are the most important reasons to move to the cloud.
How was the initial setup?
Deploying Databricks on the cloud is straightforward. It's not like an on-premise solution, where you must create a cluster and all those other prerequisites for big data.
I don't think it's challenging to maintain, but you need an expert programmer because Databricks isn't GUI-based. With GUI-based tools, building ETLs is drag-and-drop. Databricks relies entirely on coding, so you need skilled programmers to build your code, ETLs, etc.
What's my experience with pricing, setup cost, and licensing?
The price of Databricks is based on the computing volume. You also need to pay storage costs for the cloud where you're hosting Databricks, whether it is AWS, Azure, or Google.
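The two cost components described above (compute usage plus cloud storage) can be sketched as simple arithmetic. This is only an illustration: the `MonthlyUsage` type, the `estimate_monthly_cost` function, and every rate and quantity below are hypothetical and are not real Databricks or cloud-provider prices.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    compute_units: float  # compute consumed by clusters (hypothetical unit)
    stored_gb: float      # data kept in the hosting cloud's storage, in GB


def estimate_monthly_cost(usage, price_per_unit, storage_price_per_gb):
    """Return (compute_cost, storage_cost, total) for one month."""
    compute_cost = usage.compute_units * price_per_unit
    storage_cost = usage.stored_gb * storage_price_per_gb
    return compute_cost, storage_cost, compute_cost + storage_cost


# All numbers here are made up for illustration; they are not real prices.
usage = MonthlyUsage(compute_units=1000, stored_gb=500)
compute, storage, total = estimate_monthly_cost(
    usage, price_per_unit=0.5, storage_price_per_gb=0.02
)
print(f"compute={compute:.2f} storage={storage:.2f} total={total:.2f}")
```

With these made-up inputs the compute term dominates the total, which matches the point that the price follows computing volume while storage is billed separately by the cloud provider.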
What other advice do I have?
I rate Databricks nine out of 10. Databricks is one of the best tools on the market.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
Team Lead at a tech services company with 1,001-5,000 employees
Gives us the ability to write analytics code in multiple languages
Pros and Cons
- "Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user."
- "Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics."
What is our primary use case?
We use Databricks for batch data processing and stream data processing.
How has it helped my organization?
Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user.
What is most valuable?
The flexibility of Databricks is the most valuable feature. It gives us the ability to write analytics code in multiple languages.
There is a single workspace for different data roles like data engineers, machine learning engineers, and the end user, who can connect to the same system.
Databricks separates compute from storage, so you are not coupled to the underlying data sets, allowing multiple processes and multiple programs to work on the same data.
What needs improvement?
I would like to see improvement with the UI. It is functional and useful, but it's a bit clunky at times. It should be more user-friendly.
In future releases, Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics.
For how long have I used the solution?
I have been using Databricks for eight months.
What do I think about the stability of the solution?
Databricks is very stable.
What do I think about the scalability of the solution?
The scalability of this solution is good. In our organization, users include analysts, data engineers, and data scientists.
How are customer service and support?
I would give Databricks' service and support a four and a half out of five overall.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
Prior to using Databricks, we used Azure Stream Analytics. We made the switch because of the scalability and integrated platform.
How was the initial setup?
The initial setup of Databricks is more complex. I would rate it a four out of five on the complexity of the setup. It took two days to deploy the solution.
What about the implementation team?
We used a third party for some of the implementations of Databricks. The number of staff required to deploy and maintain this solution depends on the number of processes you have. Due to the cloud nature of the technology, it is easy to deploy and maintain.
What's my experience with pricing, setup cost, and licensing?
The licensing of Databricks is a tiered licensing regime, so it is flexible. I feel their pricing is a five out of five.
What other advice do I have?
Databricks is a one-stop shop for everything data related, and it can scale with you.
I would rate this solution a 9.5 out of 10 overall.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Head CEO at bizmetric
A user-friendly and customizable solution that offers excellent integration
Pros and Cons
- "The solution is built on Spark and has integration with MLflow, which is important for our use case."
- "The ability to customize our own pipelines would enhance the product, similar to what's possible using YAML files in Microsoft Azure DevOps."
What is our primary use case?
Our use case is confidential, but I can say we use it for a deep learning model for machine learning.
What is most valuable?
The solution is built on Spark and has integration with MLflow, which is important for our use case.
Databricks is also user-friendly, providing customizable codes and models that allow people to experiment quickly.
Integration of Delta Lake is another useful feature.
What needs improvement?
Writing pandas-profiling reports could be easier.
The ability to customize our own pipelines would enhance the product, similar to what's possible using YAML files in Microsoft Azure DevOps.
For how long have I used the solution?
I have been using this product for one and a half years.
What do I think about the stability of the solution?
For now the solution seems stable.
What do I think about the scalability of the solution?
The solution is easy to scale horizontally and it has a useful auto-scaling feature. For vertical scaling, you need to bring the system down and make some adjustments.
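As a rough illustration of the horizontal auto-scaling mentioned above, here is a minimal sketch of a cluster definition with a worker range. The field names (`cluster_name`, `spark_version`, `node_type_id`, `autoscale`, `min_workers`, `max_workers`) follow the Databricks Clusters API as I understand it and should be checked against the current documentation. The helper `autoscaling_cluster_spec`, the cluster name, and the placeholder values are hypothetical.

```python
import json


def autoscaling_cluster_spec(name, min_workers, max_workers):
    """Build a cluster definition that lets the platform add or remove workers."""
    if not 0 < min_workers <= max_workers:
        raise ValueError("expected 0 < min_workers <= max_workers")
    return {
        "cluster_name": name,
        # Placeholders: valid runtime versions and node types depend on
        # the workspace and the cloud provider.
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type>",
        # Horizontal scaling: the worker count floats within this range.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
    }


spec = autoscaling_cluster_spec("etl-cluster", min_workers=2, max_workers=8)
print(json.dumps(spec, indent=2))
```

In this sketch, vertical scaling would correspond to changing `node_type_id` to a larger machine type, which is consistent with the need to bring the cluster down first.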
On my current project I have a team of 30 members under me, including data engineers and data science people. Our data science, engineering, and MLOps projects are expanding, so we are planning to do some vertical scaling to increase the team size to over 100 members. In our company, we are trying to certify more and more people in Databricks because it's cloud-agnostic.
How are customer service and support?
We have never needed to contact customer support; online resources have been sufficient to solve our problems.
How was the initial setup?
The initial setup of the solution is straightforward. Once you understand the UI, it is easy to implement. I would rate Databricks a four out of five for ease of setup.
One migration project took two to three months, including writing all the code and implementing end-to-end pipelines.
We are planning to deploy the solution in stages over the next 15 months to completely implement MLOps for our organization.
What's my experience with pricing, setup cost, and licensing?
I'm not involved in the financing, but I can say that the solution seemed reasonably priced compared to the competitors. Similar products are usually in the same price range. With five being affordable and one being expensive, I would rate Databricks a four out of five.
I find that deployed systems work out cheaper than having to operate manually, which appeals to our customers.
What other advice do I have?
I would rate this solution an eight out of ten.
There is an issue where clusters are automatically deleted after termination or after 100 days of non-usage. This could be more user-friendly, and they could include an enabler to pin the clusters you want to keep, instead of having to go and research why clusters got deleted after implementing the product. That documentation needs to be right in front of the user to avoid issues.
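For readers who hit the same problem: to my knowledge, the Databricks Clusters REST API exposes a pin operation (`POST /api/2.0/clusters/pin` with a `cluster_id` in the body) that keeps a cluster's configuration from being removed. The sketch below only builds such a request and does not send it. The endpoint path and payload should be verified against the current API reference, and the helper `build_pin_request`, the workspace URL, the token, and the cluster ID are all placeholders.

```python
import json
import urllib.request


def build_pin_request(workspace_url, token, cluster_id):
    """Prepare (but do not send) a request that pins one cluster."""
    body = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/clusters/pin",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder workspace URL, token, and cluster ID; nothing is sent here.
request = build_pin_request(
    "https://example.cloud.databricks.invalid", "<token>", "<cluster-id>"
)
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen(); this needs a
# real workspace, a valid token, and sufficient permissions.
```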
I definitely recommend this product to other users.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
