Data Engineer at a tech vendor with 1,001-5,000 employees
Real User
Top 20
May 28, 2025
Experiencing smooth performance and cost advantages over previous tools
Pros and Cons
  • "Databricks is definitely a very stable product and reliable."
  • "My experience with the pricing and licensing model is that it remains relatively expensive. Though it's less expensive than AWS, we still need a more cost-effective solution."

What is our primary use case?

Our use case for Databricks is large-scale big data processing on its clusters.

What is most valuable?

I think it is difficult to determine which feature of Databricks I enjoy the most since there are many valuable features.

What's valuable about Databricks to my organization is that it is more cost-effective and performs better than the tools and services AWS currently offers.

What needs improvement?

I am uncertain about specific improvements for Databricks.

It would be beneficial to make Databricks even more cost-effective.

For how long have I used the solution?

I have been using Databricks for two years.

Buyer's Guide
Databricks
May 2026
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: May 2026.
896,942 professionals have used our research since 2012.

What do I think about the stability of the solution?

My experience with Databricks has been smooth, and I haven't encountered any issues.

Databricks is definitely a very stable product and reliable.

How are customer service and support?

I have not used Databricks customer service or support.

Which solution did I use previously and why did I switch?

Before Databricks, I used Batch processing, Fargate, and possibly Kubernetes.

I switched from my previous solutions because they were either too expensive or too difficult to configure.

Which other solutions did I evaluate?

I have considered other solutions besides Databricks, such as Snowflake, but we haven't explored it extensively yet.

We are still early in our Snowflake experience, so we don't know the pros and cons compared to Databricks.

What other advice do I have?

I can't say much about our deployment model for Databricks, as I'm not a heavy user.

I am not the person who purchased Databricks, but it was possibly acquired through the AWS Marketplace.

I may not have utilized Databricks machine learning capabilities.

My experience with the pricing and licensing model is that it remains relatively expensive. Though it's less expensive than AWS, we still need a more cost-effective solution.

I would rate Databricks overall a nine out of ten.

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Data Engineer at CRAFT Tech
Real User
Top 20
May 15, 2025
Unifying data for analytical insights with smooth AI and machine learning integration
Pros and Cons
  • "I think Databricks is very good at facilitating AI and machine learning projects; they implement AI and machine learning models very well, and clients can run their models on Databricks."
  • "In my opinion, areas of Databricks that have room for improvement involve the dashboards. Until recently, everyone used third-party systems such as Power BI to connect to Databricks for dashboards and reports, but they're now coming up with their AI/BI dashboard, and I think they're on the right track to improve that even further."

What is our primary use case?

A typical use case for the solution is to build the data lakehouse for the client because they have a variety of source systems, and they want to unify that data into the lakehouse platform, where they want to use the data for analytical purposes and insights.

What is most valuable?

The most valuable features of Databricks are Delta Lake and Unity Catalog. Unity Catalog handles data governance, and Delta Lake is used to build the lakehouse. Currently, they're coming up with workflow jobs, along with other supporting elements to create an end-to-end solution.

What needs improvement?

In my opinion, areas of Databricks that have room for improvement involve the dashboards. Until recently, everyone used third-party systems such as Power BI to connect to Databricks for dashboards and reports, but they're now coming up with their AI/BI dashboard, and I think they're on the right track to improve that even further.

For how long have I used the solution?

I have approximately four years of experience working with Databricks.

What do I think about the stability of the solution?

I would rate the stability of Databricks as highly stable, around nine out of ten.

What do I think about the scalability of the solution?

I would rate the scalability of this solution as very high, about nine out of ten.

How are customer service and support?

I rate the technical support as fine because they have levels of technical support available, especially partners who get really good support from Databricks on new features. For us, it's so far so good with no problems, and I would rate the support quality as eight out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup of the Databricks solution is reasonably straightforward. It doesn't give any trouble to implement, and I think it's fairly easy to set up and work on Databricks.

What was our ROI?

I can't say whether we've seen an ROI from the solution because I do not have exposure in that area, although I think the people who decided to implement Databricks might have done all this analysis and POCs.

What other advice do I have?

My relationship with the vendor is that I'm not a partner of Databricks; I work for a client where we use the Databricks software for implementing the solutions.

My clients are usually enterprise-level organizations, but the implementation here is at a medium scale, although it might grow to enterprise scale in the future.

Regarding the price of Databricks, I don't involve myself in those decisions.

I think Databricks is very good at facilitating AI and machine learning projects; they implement AI and machine learning models very well, and clients can run their models on Databricks. I believe they are in a better place compared to competitors such as Snowflake, and they are tying up with important companies such as SAP and Palantir.

Based on my experience, I would recommend Databricks to other people. Overall, I would rate this solution as one of the best, about eight out of ten, although I might not know some of the pitfalls; it's based on use case to use case, but for us, it's working well.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: My company has a business relationship with this vendor other than being a customer. MSP
PeerSpot user
Dunstan Matekenya - PeerSpot reviewer
Data Scientist at a financial services firm with 10,001+ employees
Real User
Top 5 Leaderboard
Jul 30, 2024
Processes large-scale data sets and integrates with Apache Spark in a notebook environment

What is our primary use case?

I primarily use Databricks to process large-scale data sets with Apache Spark. My main use case is processing large data sets, such as 600 GB or 800 GB.

What is most valuable?

Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of its strengths. Another strength is that the platform makes it very easy to manage resources. For example, setting up a cluster of five or fifteen nodes is straightforward with Databricks. The notebook environment is also excellent, making it easy to perform various tasks.

What needs improvement?

While Databricks allows you to upload your packages, we encountered some limitations with its capabilities, particularly with Apache Spark, which also affected Databricks. We had issues working with spatial data. You had to go through many steps to find libraries that could process spatial data in a distributed fashion.

For how long have I used the solution?

I have been using Databricks since 2018.

What do I think about the scalability of the solution?

I might have a project that runs for one or two months, and perhaps I won't use it for six months. Self-service is one of its strengths. I can shut down everything and easily spin up resources when I need to use them again. We have a dedicated group of fifty people who consistently use Databricks for analytics.

How was the initial setup?

The initial setup was very easy and took around 10-15 people. We have a data science infrastructure team helping with this.

What was our ROI?

Databricks stands out among most data platforms mainly because of its ease of use. The learning curve is not as steep, making it accessible for anyone to handle large-scale data processing on Databricks. This ease of use contributes positively to our return on investment. However, in our line of work, converting this efficiency into direct monetary gains can be challenging, given our nonprofit nature. 

What's my experience with pricing, setup cost, and licensing?

We purchased high-performance laptops to reduce our reliance on the cloud. The main issue was the cost. Internally, if I used Databricks, that cost would be charged back to my team. There was a time when my monthly cost was around ten thousand dollars, which was quite high. Due to these costs, several teams, including ours, moved away from using Databricks and other cloud providers. It became prohibitive, so we invested in our own high-performance computers instead.

What other advice do I have?

Databricks provides ease of use for me, particularly due to its seamless integration with Apache Spark. This integration simplifies the process of conducting machine learning on large-scale datasets.

I recommend this solution 100%. Overall, I rate the solution an eight out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Monalisha Nayak - PeerSpot reviewer
Senior Data Engineer at Shell
Real User
Top 5
Nov 17, 2024
Transformative data analytics with enhanced AI functionalities and good value for money
Pros and Cons
  • "It offers AI functionalities that assist with code management and machine learning processes."
  • "While Databricks is generally a robust solution, I have noticed a limitation with debugging in Delta Live Tables, which could be improved."

What is our primary use case?

Databricks is used for transformations and streaming data processing. We utilize it primarily for data analytics, including the use of Delta Lake and Delta Live Tables for ETL processes, dashboards for analysis, and Unity Catalog for role management.

How has it helped my organization?

Databricks improves our data analysis tasks with its powerful functionality, offering real-time analytics and machine learning features that help improve model accuracy. It is easy to use, which helps in saving time and, ultimately, costs.

What is most valuable?

The most valuable features of Databricks include Delta Lake, a user-friendly interface, Delta Live Tables for ETL, dashboard features for analysis, and Unity Catalog for role management. It also offers AI functionalities that assist with code management and machine learning processes.

What needs improvement?

While Databricks is generally a robust solution, I have noticed a limitation with debugging in Delta Live Tables, which could be improved. The issue with Delta type tables not loading into multiple places in a single pipeline has been fixed recently.

For how long have I used the solution?

I have been working with Databricks for four years.

How are customer service and support?

We regularly contact Databricks support and are satisfied with their service. I would rate them eight out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup was straightforward after the first week. Deployment processes became quick and efficient using Git.

What's my experience with pricing, setup cost, and licensing?

In terms of cost-effectiveness, Databricks is worth the money.

What other advice do I have?

I'd rate the solution nine out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Heba Ismail - PeerSpot reviewer
Senior Data Engineer at a computer software company with 1,001-5,000 employees
Real User
Top 5
Nov 6, 2024
Enhancing data integration and processing across cloud services with seamless transformations
Pros and Cons
  • "It helps integrate data science and machine learning capabilities."
  • "Performance could be improved."

What is our primary use case?

I work in a project where I build data pipelines using Azure Data Factory. I ingest data from on-premises to Azure Data Lake. After that, I perform transformations using Databricks notebooks and Spark, building the Databricks bronze, silver, and gold layers. We export reports from the gold layer.
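The bronze, silver, and gold layering described above can be sketched in a few lines. This is a plain-Python stand-in, not PySpark or a Databricks API: the records, field names, and the `to_bronze`, `to_silver`, and `to_gold` helpers are all hypothetical, chosen only to show what each layer is responsible for.

```python
from collections import defaultdict

# Hypothetical raw records, standing in for files landed in the data lake.
raw_rows = [
    {"order_id": "1", "region": "EU", "amount": "10.50"},
    {"order_id": "1", "region": "EU", "amount": "10.50"},  # duplicate
    {"order_id": "2", "region": "US", "amount": "n/a"},    # unparseable amount
    {"order_id": "3", "region": "US", "amount": "4.00"},
    {"order_id": "4", "region": "EU", "amount": "2.25"},
]


def to_bronze(rows):
    """Bronze: keep the data exactly as ingested, with no cleaning."""
    return [dict(r) for r in rows]


def to_silver(bronze):
    """Silver: drop duplicates and rows whose amount cannot be parsed."""
    seen, silver = set(), []
    for r in bronze:
        if r["order_id"] in seen:
            continue
        try:
            amount = float(r["amount"])
        except ValueError:
            continue
        seen.add(r["order_id"])
        silver.append(
            {"order_id": int(r["order_id"]), "region": r["region"], "amount": amount}
        )
    return silver


def to_gold(silver):
    """Gold: aggregate into a report-ready shape (total amount per region)."""
    totals = defaultdict(float)
    for r in silver:
        totals[r["region"]] += r["amount"]
    return dict(totals)


bronze = to_bronze(raw_rows)
silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

In a real pipeline each layer would typically be a table written by a Spark job rather than an in-memory list; the split of responsibilities is the part this sketch is meant to illustrate.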

How has it helped my organization?

Recently, we started using Databricks in our organization. It helps integrate data science and machine learning capabilities.

What is most valuable?

The most valuable features are Unity Catalog, which provides central governance for all data across workspaces, and Databricks' integration with cloud services like Azure Event Hubs and Azure Data Factory. It is user-friendly for data processing, and Spark is a strong engine for big data processing.

What needs improvement?

Performance could be improved. To get good performance, it is crucial to review your code, configure Spark correctly, implement caching, and monitor performance metrics.

For how long have I used the solution?

I have used Databricks for over two years.

What do I think about the stability of the solution?

I would rate stability as eight out of ten. It is quite stable.

What do I think about the scalability of the solution?

Databricks is perfect for scalability. It is easy to scale clusters.

How are customer service and support?

I haven't faced any issues requiring customer support, so I don't have experience with their customer support.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used Informatica before, which is perfect for data management solutions. We started using Databricks for its capabilities in data science and machine learning.

How was the initial setup?

I would rate the initial setup as nine out of ten. It is quite easy for someone experienced with Spark.

What's my experience with pricing, setup cost, and licensing?

For my company, it's okay to upgrade to Databricks because it's comparable in price to Informatica. It is not considered expensive for the company.

Which other solutions did I evaluate?

For machine learning, I used Python and its libraries manually. Prior to Databricks, there was no special tool used for these purposes.

What other advice do I have?

If a company focuses on data science and machine learning, I recommend using Databricks. It's a great solution in this field. For data management needs, Informatica is advantageous due to its comprehensive tools.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Karan Sharma - PeerSpot reviewer
Data Analyst at Allianz
Real User
Aug 21, 2023
An easy-to-set-up tool that provides its users with insight into the metadata of the data they process
Pros and Cons
  • "The initial setup phase of Databricks was good."
  • "Scalability is an area with certain shortcomings. The solution's scalability needs improvement."

What is our primary use case?

My company uses Databricks to process real-time and batch data with its streaming analytics part. We use Databricks' Unified Data Analytics Platform, for which we have Azure as a solution to bring the unified architecture on top of that to handle the streaming load for our platform.

What is most valuable?

The most valuable feature of the solution is its speed, especially in computation and in the atomicity of reading data from any storage. We have a storage account, and we can read the data on the go. We also now have Unity Catalog in Databricks, which is quite good for giving you insight into the metadata of the data you're going to process. There are a lot of things that are quite nice with Databricks.

What needs improvement?

Scalability is an area with certain shortcomings. The solution's scalability needs improvement.

For how long have I used the solution?

I have been using Databricks for a few years. I use the solution's latest version. Though currently my company is a user of the solution, we are planning to enter into a partnership with Databricks.

What do I think about the stability of the solution?

It is a stable solution. Stability-wise, I rate the solution an eight to nine out of ten.

What do I think about the scalability of the solution?

It is a scalable solution. Scalability-wise, I rate the solution an eight to nine out of ten.

My company has a team of 50 to 60 people who use the solution.

How are customer service and support?

Sometimes, my company does need support from the technical team of Databricks. The technical team of Databricks has been good and helpful. I rate the technical support an eight out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup phase of Databricks was good. You can spin up clusters and integrate those with DevOps as well. Databricks is quite nice owing to its user-friendly UI, DPP, and workspaces.

The solution is deployed on the cloud.

The time taken for the deployment depends on the workload.

What's my experience with pricing, setup cost, and licensing?

I cannot judge whether the product is expensive or cheap since I am unaware of the prices of competing products. The licensing costs of Databricks depend on how many licenses we need, and Databricks provides substantial discounts depending on that number.

What other advice do I have?

It is a state-of-the-art product revolutionizing data analytics and machine learning workspaces. Databricks is a complete solution when it comes to working with data.

I rate the overall product an eight out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Solution Architect at an insurance company with 10,001+ employees
Real User
Jan 4, 2023
A nice interface with good features for turning off clusters to save on computing
Pros and Cons
  • "There are good features for turning off clusters."
  • "It would be nice to have more guidance on integrations with ETLs and other data quality tools."

What is our primary use case?

Our company uses the solution for big data and as an interface for analytics. 

We also create custom APIs to get data and provide SQL endpoints so users can access it over traditional tools like JDBC or ODBC.

We use the solution on AWS and Azure. The data lake is wide open for departmental use. We have ten departments and two or three people from each department access the solution. 

How has it helped my organization?

The platform as a service allows us to ramp up a new database pretty fast. We deploy some of the infrastructure as code. End users can access data immediately and connect with Power BI for reporting.

What is most valuable?

There are good features for turning off clusters. Basically, if we aren't using it, then it is turned off. When a user starts accessing, it starts up so we save on computing. 
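As a rough illustration of the idle shutdown behaviour described above, the sketch below builds a cluster definition with an auto-termination timeout. The field names (`autoscale`, `autotermination_minutes`, and so on) follow the Databricks Clusters REST API as I understand it and should be checked against the current documentation. The `build_cluster_spec` helper, the cluster name, and the placeholder runtime and node-type values are hypothetical.

```python
import json


def build_cluster_spec(name, min_workers, max_workers, idle_minutes):
    """Build a cluster definition that shuts itself down after being idle."""
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": name,
        "spark_version": "<runtime-version>",  # placeholder, not a real version
        "node_type_id": "<node-type>",         # placeholder, cloud-specific
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Assumed field: minutes of inactivity before the cluster terminates.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("departmental-analytics", 1, 4, 30)
payload = json.dumps(spec, indent=2)
print(payload)
```

A payload like this would be sent to the cluster-creation endpoint or pasted into the cluster's JSON settings. The shorter the idle timeout, the less compute is billed between sessions, at the cost of a startup delay for the next user.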

Our data lake team likes the interface very much because it is straightforward. Of course, you need to understand the different clusters when they are started.

There are nice features for machine learning and analytics.

The security features allow us to integrate with Active Directory and assign different people to different databases.

The solution has a good interface with Python.

There is good integration with Azure so we can access the solution over the standard Azure interface and use the storage pro measure. 

What needs improvement?

It would be nice to have more guidance on integrations with ETLs and other data quality tools. The solution is not really a product for ETL or data quality so we use other DBT tools. 

For how long have I used the solution?

I have been using the solution for four months but my company has been using it for one year. 

What do I think about the stability of the solution?

The solution is very stable with no issues so I rate stability a ten out of ten. 

What do I think about the scalability of the solution?

The solution is scalable to the cluster size and Azure storage. 

Scalability is rated an eight out of ten. 

How are customer service and support?

I have not used technical support. 

The company has regular calls with Databricks and they are pretty good but are more on the technical presale side. 

Which solution did I use previously and why did I switch?

We previously used Azure's data lake product and possibly some Hortonworks. 

How was the initial setup?

The setup is not easy, but it is not too complicated either. The infrastructure needs to be set up first. We use Azure storage or SQL S3 and create private endpoints.

This is maybe a little more complex or a bit different than other databases in the cloud. For a traditional setup, you need to also think about file systems and disks. Here, you just transform it into the storage and private endpoint.

The first setup might be a bit of a struggle until you learn and understand what is necessary. 

What about the implementation team?

We implemented the solution in-house with support from Databricks. Two team members were involved in the implementation. 

Three team members handle ongoing development and maintenance. 

What's my experience with pricing, setup cost, and licensing?

The solution is affordable. 

What other advice do I have?

The solution is pretty good because it uses Azure's data lake storage. It is basically the tool on top that provides the SQL interface and APIs for Python. I like the solution because it enables people to work with it.

I rate the solution a nine out of ten. 

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Nabil Fegaiere1 - PeerSpot reviewer
Chief Executive Officer at dotFIT, LLC
Real User
Sep 7, 2023
A powerful solution that is easily integrated into a variety of platforms
Pros and Cons
  • "It's very simple to use Databricks Apache Spark."
  • "I would like more integration with SQL for using data in different workspaces."

What is our primary use case?

I am a Databricks service partner, and my customers use Azure Databricks and Data Factory.

What is most valuable?

It's very simple to use Databricks Apache Spark. It's really good for parallel execution to scale up the workload. In this context, the usage is more about virtual machines.

Using meta-stores like Hive was optional, and the solution is good for data science use cases. With the Authenticator Log, Databricks is good for data transformation and BI usage. We have a platform.

What needs improvement?

I would like more integration with SQL for using data in different workspaces. We use the user interface for some functionalities, while for others, we have to use SQL to create data sets and grant permissions. For example, when creating a cluster, we have to create it with some API or user interface. Creating a cluster with some properties using SQL grants the possibility of using SQL syntax. Integration with SQL will make Databricks easier to use by people who have experience with databases like Lakehouse, and they would be able to use the data lake and BI. More integration will help have one point of view for everyone using SQL syntax.

Integration with Kubernetes could also be good for minimizing the price because you can use Kubernetes instead of virtual machines. But that won't be easy.

For how long have I used the solution?

I have worked with the solution for four or five years, with some experience since 2016.

What do I think about the stability of the solution?

The solution is stable. The only problem with stability would be that people are not using it efficiently.

What do I think about the scalability of the solution?

The solution is good for scalability.

How was the initial setup?

When we have administration experience, the solution is not difficult to deploy. Technically, however, it's difficult because governance is more complex. For example, I have two warehouses on Databricks, which are clusters in this workspace, and we have to switch from workspace to workspace to have all this information. There is a system table that has all this, but I don't know if everyone can use these tables.

What's my experience with pricing, setup cost, and licensing?

Databricks is not costly when compared with other solutions' prices.

Which other solutions did I evaluate?

Databricks's functionalities are as good as those of solutions like Snowflake, BigQuery, and Redshift.

What other advice do I have?

People sometimes do not use the solution efficiently. They misunderstand databases, the usage of tables, and the performance. Many data engineers are very junior and don't have skills in that. Stability is more a customer problem than a problem with the product itself. One possible problem with the product is that there's no method to pause the usage of something. For example, we have to use the meta server or the data catalog in Synapse. But in Databricks, we have a choice to use a catalog or not, or Hive, which is always integrated, but we have to choose whether to use it or not. Many customers directly use the paths on Databricks, which causes performance and governance problems.

I can offer a lot of advice on Databricks, and one is to use meta stores like Unity Catalog or Hive Metastore. For incoming use cases, it's better to use Unity Catalog.

I rate Databricks a nine out of ten.

Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
PeerSpot user