We use this solution for the Customer Data Platform (CDP). My company works in the MarTech space, and we usually implement custom CDPs.
Head of Business Integration and Architecture at Jakala
Highly scalable data platform that offers exceptional performance and value data types unique to this solution
Pros and Cons
- "The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type, and it was invented and implemented by Databricks."
- "The data visualization for this solution could be improved. They have started to roll out a data visualization tool inside Databricks but it is in the early stages. It's not comparable to a solution like Power BI, Luca, or Tableau."
What is our primary use case?
What is most valuable?
The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type, and it was invented and implemented by Databricks. It is the most important element of the solution. Databricks also offers exceptional performance and scalability.
What needs improvement?
The data visualization for this solution could be improved. They have started to roll out a data visualization tool inside Databricks but it is in the early stages. It's not comparable to a solution like Power BI, Luca, or Tableau.
In a future release, we would like to have a better ETL designer tool to assist in the way we move data from one place to another.
For how long have I used the solution?
We have been using this solution for four years.
Buyer's Guide
Databricks
December 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
824,067 professionals have used our research since 2012.
What do I think about the stability of the solution?
This is a stable solution.
What do I think about the scalability of the solution?
This is a scalable solution.
How was the initial setup?
The initial setup is very easy. It is a managed solution inside Azure so you just need to search for Databricks. There are a couple of pages to follow in the setup wizard and Databricks is up and running.
What's my experience with pricing, setup cost, and licensing?
We implement this solution on behalf of our customers who have their own Azure subscription and they pay for Databricks themselves. The pricing is more expensive if you have large volumes of data.
Which other solutions did I evaluate?
When we first started using Databricks in 2018, there were not many comparable solutions to consider. Right now there are many, including Snowflake, Azure Synapse, Redshift, and BigQuery.
Databricks continues to be our solution of choice, but Snowflake has a better user interface and makes it easier to work with data pipelines.
What other advice do I have?
I would advise others to first define a strong data strategy and then choose which data platform suits your needs.
I would rate this solution a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Analyst at Eviny
Fast and does what it needs to but customer service should be improved upon
Pros and Cons
- "It is fast, it's scalable, and it does the job it needs to do."
- "I would like to see the integration between Databricks and MLflow improved. It is quite hard to train multiple models in parallel in a distributed fashion. You hit rate limits on the clients very fast."
What needs improvement?
I would like to see the integration between Databricks and MLflow improved. It is quite hard to train multiple models in parallel in a distributed fashion. You hit rate limits on the clients very fast.
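One common way to soften client-side rate limits when many training runs log in parallel is to retry with exponential backoff and jitter. The sketch below is a generic illustration, not the reviewer's code and not an MLflow or Databricks API: `RateLimitError`, `call_with_backoff`, and `make_flaky_logger` are hypothetical names, and the logging call is simulated so the example runs on its own.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a tracking client raises when rate limited."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay doubles on each attempt and is capped at max_delay;
            # random jitter keeps parallel workers from retrying in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))


def make_flaky_logger(failures):
    """Simulated logging call: rate limited `failures` times, then succeeds."""
    state = {"calls": 0}

    def log_metric():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise RateLimitError("too many requests")
        return state["calls"]

    return log_metric


if __name__ == "__main__":
    # Succeeds on the third call after two simulated rate-limit errors.
    print(call_with_backoff(make_flaky_logger(2), base_delay=0.01))
```

In a real job, each worker would wrap its own logging calls this way; batching several metrics into one request is another option that reduces the number of calls in the first place.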
For how long have I used the solution?
I have been using Databricks for three years.
What do I think about the stability of the solution?
I would rate the stability of this solution a nine out of 10, with one being not stable and 10 being very stable.
What do I think about the scalability of the solution?
I would rate the scalability of this solution an eight out of 10, with one being not scalable and 10 being very scalable.
There are three people using this solution in our organization.
How are customer service and support?
I would rate the available customer service a three. It's worth mentioning that this is Microsoft and not Databricks itself. I haven't spoken to Databricks people directly, but I know the people who have and they have been a lot more pleased.
How would you rate customer service and support?
Negative
What's my experience with pricing, setup cost, and licensing?
I would rate their pricing plan a six (on a scale of one to 10, with one being cheap and 10 being expensive). I think the prices could be lowered a little bit.
What other advice do I have?
Overall, I would rate this solution an eight out of 10, with one being quite poor and 10 being excellent. It is fast, it's scalable, and it does the job it needs to do.
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Team Lead at a tech services company with 1,001-5,000 employees
Gives us the ability to write analytics code in multiple languages
Pros and Cons
- "Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user."
- "Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics."
What is our primary use case?
We use Databricks for batch data processing and stream data processing.
How has it helped my organization?
Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user.
What is most valuable?
The flexibility of Databricks is the most valuable feature. It gives us the ability to write analytics code in multiple languages.
There is a single workspace for different data roles like data engineers, machine learning engineers, and the end user, who can connect to the same system.
Databricks separates compute from storage, so you are not coupled to the underlying data sets, allowing multiple processes and multiple programs to be written on the same code.
What needs improvement?
I would like to see improvement with the UI. It is functional and useful, but it's a bit clunky at times. It should be more user-friendly.
In future releases, Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics.
For how long have I used the solution?
I have been using Databricks for eight months.
What do I think about the stability of the solution?
Databricks is very stable.
What do I think about the scalability of the solution?
The scalability of this solution is good. In our organization, users include analysts, data engineers, and data scientists.
How are customer service and support?
I would give Databricks service and support a four and a half out of five overall.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
Prior to using Databricks, we used Azure Stream Analytics. We made the switch because of the scalability and integrated platform.
How was the initial setup?
The initial setup of Databricks is more complex. I would rate it a four out of five on the complexity of the setup. It took two days to deploy the solution.
What about the implementation team?
We used a third party for some of the implementations of Databricks. The number of staff required to deploy and maintain this solution depends on the number of processes you have. Due to the cloud nature of the technology, it is easy to deploy and maintain.
What's my experience with pricing, setup cost, and licensing?
The licensing of Databricks is a tiered licensing regime, so it is flexible. I feel their pricing is a five out of five.
What other advice do I have?
Databricks is a one-stop shop for everything data related, and it can scale with you.
I would rate this solution a 9.5 out of 10 overall.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Head of Referential and Big Data at a financial services firm with 5,001-10,000 employees
A highly scalable unified data platform that provides data access to any type of user
Pros and Cons
- "I like cloud scalability and data access for any type of user."
- "It would be better if it were faster. It can be slow, and it can be super fast for big data. But for small data, sometimes there is a sub-second response, which can be considered slow. In the next release, I would like to have automatic creation of APIs because they don't have it at the moment, and I spend a lot of time building them."
What is our primary use case?
We use Databricks to define tool data and have many use cases to analyze and distribute the data.
How has it helped my organization?
Data is open to everyone; they can access it through many channels, including notebooks or SQL. That on its own democratizes the data.
What is most valuable?
I like cloud scalability and data access for any type of user.
What needs improvement?
It would be better if it were faster. It can be slow, and it can be super fast for big data. But for small data, sometimes there is a sub-second response, which can be considered slow.
In the next release, I would like to have automatic creation of APIs because they don't have it at the moment, and I spend a lot of time building them.
For how long have I used the solution?
I have been using Databricks for roughly one and a half years.
What do I think about the stability of the solution?
Stability is excellent.
What do I think about the scalability of the solution?
Databricks is scalable. You can use the power of the cloud to scale your cluster size, either CPU or memory. The data doesn't work like a standard database, so you don't have it based on files, and you don't copy the data. It's super scalable. It's only the computing that you have to scale with the data.
We probably have 40 users with roles like developers, business analysts, and data scientists. We have big plans to increase the usage and have more departments using it.
How are customer service and support?
Technical support has helped us.
On a scale from one to ten, I would give technical support a five.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We used Cloudera before switching to Databricks.
How was the initial setup?
The initial setup was fairly okay. It takes about two minutes to deploy this solution. It's all code, so we click a button, and then it's done.
On a scale from one to five, I would give the initial setup a four.
What about the implementation team?
We set up and deployed this solution.
What was our ROI?
On a scale from one to five, I would give our ROI a three.
What's my experience with pricing, setup cost, and licensing?
We only pay for the Azure compute behind the solution. If you want to compute, you have to have a database layer and Azure below.
On a scale from one to five, I would give their pricing a two.
Which other solutions did I evaluate?
We looked at other options such as Snowflake and Cloudera on the cloud.
What other advice do I have?
I would tell potential users that they need proper cloud engineers and a cloud infrastructure team to use this solution.
On a scale from one to ten, I would give Databricks a nine.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Head of Credit Risk and Data at Cegid Invoice and Financing
It's a reasonably priced all-in-one platform that enables us to build a lakehouse framework
Pros and Cons
- "Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform."
- "I'm not the person working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one."
What is our primary use case?
We primarily use Databricks for reporting and machine learning.
What is most valuable?
Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform.
What needs improvement?
I'm not the person working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one.
Also, this is an all-in-one platform, but it might be preferable if there were an a la carte model where we could select the best tool in each class for reporting, machine learning, etc. I'm not yet sure if this strategy is the best one.
For how long have I used the solution?
We've been using Databricks since the start of the year.
What do I think about the stability of the solution?
Databricks is quite stable. We haven't had any issues with stability. It's always working perfectly with no downtime.
What do I think about the scalability of the solution?
Databricks is based on Spark, which is written in Scala. These languages aren't easy to handle, and it's challenging to find people who know them well. At the same time, a couple of other vendors offer low-code platforms that work on top of Databricks. We have to work around Databricks' lack of scalability by using those low-code platforms to give us scalability.
How are customer service and support?
I'll give Databricks support 10 out of 10. They are always prompt even though we didn't buy a support package. They have done an excellent job.
How would you rate customer service and support?
Positive
How was the initial setup?
Setting up Databricks is a bit complex, and the initial deployment took a few days—closer to a week. Of course, not everyone is working full-time on this. There are intervals when people are doing other stuff.
What was our ROI?
It's too soon to tell what kind of return we're getting because we just started using it, and we're still migrating.
What's my experience with pricing, setup cost, and licensing?
The cost of Databricks is in the lower range compared to other solutions. That was one of the main reasons we chose Databricks over other vendors and platforms.
We pay as we go, so there isn't a fixed price. It's charged by the unit. I don't have any details about how they measure this, but it should be a mix between processing and quantity of data handled. We ran a simulation based on our use cases, which gave us an estimate. We've been monitoring this, and the costs have met our expectations.
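A usage-based estimate of this kind can be sketched as units consumed multiplied by a unit price. The example below is only an illustration under stated assumptions: the workload names, hours, units per hour, and unit price are all hypothetical placeholders, not figures from this review or from any published price list.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    hours_per_month: float  # expected cluster runtime (hypothetical)
    units_per_hour: float   # billing units consumed per hour (hypothetical)


def estimate_monthly_cost(workloads, price_per_unit):
    """Sum the units consumed across workloads and multiply by the unit price."""
    total_units = sum(w.hours_per_month * w.units_per_hour for w in workloads)
    return total_units * price_per_unit


# Hypothetical use cases; replace with figures from your own simulation.
workloads = [
    Workload("nightly_reporting", hours_per_month=60, units_per_hour=4),
    Workload("model_training", hours_per_month=20, units_per_hour=10),
]
estimate = estimate_monthly_cost(workloads, price_per_unit=0.5)
print(f"Estimated monthly cost: {estimate:.2f}")
```

Comparing an estimate like this against the actual monthly bill is one way to do the kind of cost monitoring described above.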
What other advice do I have?
I give Databricks nine out of 10. The solution has met all our expectations. I'd recommend it to a friend. It's a reasonably priced all-in-one solution that gives us data lake and lakehouse capabilities. Those were the primary reasons we chose Databricks.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Helps users with data processing and analytics
Pros and Cons
- "The tool helps with data processing and analytics with large-scale data or big data since it is associated with managing data at a large scale."
- "The biggest problem associated with the product is that it is quite pricey."
What is our primary use case?
I use Databricks to manage the setting up of data lakes for SaaS.
What needs improvement?
The biggest problem associated with the product is that it is quite pricey. We cannot find a better solution than Databricks in the market currently.
For how long have I used the solution?
I have been using Databricks for a year.
What's my experience with pricing, setup cost, and licensing?
It is an expensive tool. The licensing model is a pay-as-you-go one.
What other advice do I have?
The tool helps with data processing and analytics with large-scale data or big data since it is associated with managing data at a large scale.
For my general use cases, I would say that I am not a technical person, so I cannot explain to you how the tool helps with the area of data engineering tasks.
There is another team in my company that is involved in the use of machine learning and AI features in Databricks. My team is mostly into operations. The tool is used in a multi-country project.
For example, in my company, they make some shopping decisions related to solutions based on what is the product chosen by the whole company.
I rate the tool an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Jul 17, 2024
Sr Manager Data Scientist at Bizmetric
A user-friendly and customizable solution that offers excellent integration
Pros and Cons
- "The solution is built from Spark and has integration with MLflow, which is important for our use case."
- "The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps."
What is our primary use case?
Our use case is confidential, but I can say we use it for a deep learning model for machine learning.
What is most valuable?
The solution is built from Spark and has integration with MLflow, which is important for our use case.
Databricks is also user-friendly, providing customizable codes and models that allow people to experiment quickly.
Integration of Delta Lake is another useful feature.
What needs improvement?
Writing pandas-profiling reports could be easier.
The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps.
For how long have I used the solution?
I have been using this product for one and a half years.
What do I think about the stability of the solution?
For now the solution seems stable.
What do I think about the scalability of the solution?
The solution is easy to scale horizontally and it has a useful auto-scaling feature. For vertical scaling, you need to bring the system down and make some adjustments.
On my current project I have a team of 30 members under me, including data engineers and data science people. Our data science, engineering, and MLOps projects are expanding, so we are planning to do some vertical scaling to increase the team size to over 100 members. In our company, we are trying to certify more and more people in Databricks because it's cloud-agnostic.
How are customer service and support?
We have never needed to contact customer support; online resources have been sufficient to solve our problems.
How was the initial setup?
The initial setup of the solution is straightforward; once you understand the UI, it is easy to implement. I would rate Databricks a four out of five for ease of setup.
One migration project took two to three months, including writing all the code and implementing end-to-end pipelines.
We are planning to deploy the solution in stages over the next 15 months to completely implement MLOps for our organization.
What's my experience with pricing, setup cost, and licensing?
I'm not involved in the financing, but I can say that the solution seemed reasonably priced compared to the competitors. Similar products are usually in the same price range. With five being affordable and one being expensive, I would rate Databricks a four out of five.
I find that deployed systems work out cheaper than having to operate manually, which appeals to our customers.
What other advice do I have?
I would rate this solution an eight out of ten.
There is an issue where clusters are automatically deleted after termination or after 100 days of non-usage. This could be more user-friendly, and they could include an enabler to pin the clusters you want to keep, instead of having to go and research why clusters got deleted after implementing the product. That documentation needs to be right in front of the user to avoid issues.
I definitely recommend this product to other users.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Machine Learning Engineer at a mining and metals company with 10,001+ employees
Highly scalable, stable and good technical support
Pros and Cons
- "Databricks is a scalable solution. It is the largest advantage of the solution."
- "The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "no-code AI" it is critical that the interface is very good."
What is our primary use case?
We were using Databricks to build an AI solution. We were only evaluating it; approximately three people tried it out. We later chose another solution and did not fully deploy Databricks.
How has it helped my organization?
Before I used Databricks it took me a long time to do some functions and now with Databricks I can do them much quicker. It scales very well.
What needs improvement?
The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "no-code AI" it is critical that the interface is very good.
For how long have I used the solution?
I have used Databricks within the last 12 months.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
Databricks is a scalable solution. It is the largest advantage of the solution.
How are customer service and support?
We have been in contact with Databricks technical support, and they were good.
Which solution did I use previously and why did I switch?
We have used a lot of different solutions, such as Watson and DataIQ.
How was the initial setup?
The initial setup is easy. However, I do not know much about the implementation because the company does it.
What about the implementation team?
We did the implementation of the solution.
What other advice do I have?
If companies want scalability, they should choose Databricks.
I rate Databricks a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.