Vice President at a tech services company with 51-200 employees
Very easy to use and requires minimal coding and customizations
Pros and Cons
- "Easy to use and requires minimal coding and customizations."
- "Doesn't provide a lot of credits or trial options."
What is our primary use case?
Our primary use case is for customers who run large systems and want an API for quick, easy integration with their own systems. We use Databricks to create a secure API interface. I'm vice president of data science and we are customers of Databricks.
What is most valuable?
Databricks is quite easy to use and requires less coding and customization than a solution like AWS SageMaker, which I'd previously used on a lot of projects. Databricks enables more people to efficiently build and host their ML code. Another great aspect is that MLflow is already integrated with Databricks, which makes a big difference. It enables us to track and monitor all our different experiments. We have mostly used MLflow and generic notebooks for building machine learning models, as well as PyTorch for some of our medical imaging work. We were able to quickly deploy both without requiring anything extra.
What needs improvement?
I'm struggling a little because I wanted to do some POC solutions. I present a lot of projects in various forums and seminars and there aren't a lot of credits and trial options with Databricks. Even if we want to explore, we're not able to and that's a challenge. The solution is quite expensive.
For how long have I used the solution?
I've been using this solution for a year.
Buyer's Guide
Databricks
December 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
824,067 professionals have used our research since 2012.
What do I think about the stability of the solution?
It's currently stable although we have not yet tested it with a huge volume of data. We'll focus on the performance and model serving capability in the near future. We're still carrying out performance testing, developing the models and figuring out the infrastructure.
What do I think about the scalability of the solution?
Scalability is quite good because we just used 128 GB of resources. It's quite easy to scale.
How was the initial setup?
It was relatively simple; we didn't face any challenges. Deployment takes around two days.
Which other solutions did I evaluate?
We did a POC in Azure ML Studio, which is quite a good solution and easy to deploy and use. It's almost a no-code platform. We've also found Azure ML Studio to be quite cost-effective.
What other advice do I have?
I would recommend trying Databricks because it's cloud agnostic. A lot of customers currently use Azure but want to build something on their own down the track. Databricks makes that easy with its integration with other cloud providers. If somebody wants to build something on their own infrastructure or their own virtual cloud, this is a good platform.
I rate the solution eight out of 10 because of the issue I'm having with a lack of trial options.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Enterprise Data Architect at a financial services firm with 51-200 employees
Assists with quickly computing a considerable amount of historical data and helps us with data ingestion
Pros and Cons
- "Its lightweight and fast processing are valuable."
- "The Databricks cluster can be improved."
What is our primary use case?
Our primary use case for this solution is data ingestion and the data quality (DQ) rules we are implementing. We deploy the solution on Azure cloud.
How has it helped my organization?
Whenever we send data to downstream applications for creating a file, multiple business rules are involved, and this solution assists with quickly computing a considerable amount of historical data.
What is most valuable?
Its lightweight and fast processing are valuable.
What needs improvement?
The product could include some UI features to improve the ease of use, like drag and drop for a few aggregated functions. Additionally, the Databricks cluster can be improved.
For how long have I used the solution?
We have been using Databricks for approximately two years and are currently using the latest version.
What do I think about the stability of the solution?
The solution is very stable, although it occasionally restarts. I rate the stability an eight out of ten.
What do I think about the scalability of the solution?
The solution is scalable, and we are trying to implement more use cases with Databricks in our organization as we advance. I rate the scalability an eight out of ten.
How are customer service and support?
I rate customer service and support a nine out of ten.
How would you rate customer service and support?
Positive
How was the initial setup?
The initial setup was not very complex. We deploy the solution manually and the time required depends on the complexity of the business logic. I rate it an eight out of ten.
What about the implementation team?
We implemented the solution through an in-house team.
What other advice do I have?
I rate the solution an eight out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer: Gold Partners
Global Data Architecture and Data Science Director at FH
Flexible with support for several programming languages, good visualization and workload management functionality
Pros and Cons
- "Databricks gives you the flexibility of using several programming languages independently or in combination to build models."
- "Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks."
What is our primary use case?
The primary use is for data management and managing workloads of data pipelines.
Databricks can also be used for data visualization, as well as to implement machine learning models. Machine learning development can be done using R, Python, and Spark programming.
What is most valuable?
Databricks gives you the flexibility of using several programming languages independently or in combination to build models.
The quick visualization of the data is very good.
The workload management functionality works well.
What needs improvement?
Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks.
For how long have I used the solution?
I have been using Databricks since 2017. I am no longer using it personally, although my team is, and will continue to do so in the future.
What do I think about the stability of the solution?
Databricks is quite popular these days and it appears to be stable. I have not found any issues with stability.
What do I think about the scalability of the solution?
Databricks is scalable, regardless of which cloud provider is being used. It is supported on Microsoft Azure, AWS, and they have their own cloud as well.
For a small workload, Databricks may not be worth the costs. However, for larger workloads, Databricks is a very good solution.
In my previous organization, there were between 10 and 15 users.
How are customer service and technical support?
The technical support is handled by Microsoft partners and because we had premium support, it was easy to get. That said, I did not require any support.
Which solution did I use previously and why did I switch?
I have not used tools that are similar to Databricks for workload management, but Azure ADFv2, Google BigQuery, and SAS are some of the most powerful tools in this space that I have used in the past. I have also heard of Dataiku and other tools but I have not used them. The only things that I have used are tools written in Python or scripting languages.
How was the initial setup?
There is no installation required.
What's my experience with pricing, setup cost, and licensing?
Databricks uses a pay-per-use model, where you can use as much compute as you need. I think that the cost can be reduced, given that there are more users on the platform, although it is not as expensive as some other solutions like SAS.
What other advice do I have?
As we transition to the Azure cloud, I expect that we will be using Databricks for workloads.
This is a product that I recommend for those who want to scale and have a good budget. It is good for automating a data pipeline and managing workloads. My advice for anybody who is starting to use it is to take the proper training.
Overall, based on my uses, I think that this product is pretty good.
I would rate this solution an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Lead Architect at Birlasoft India Ltd.
Data analytics platform that supports large volumes of data and related activities
Pros and Cons
- "This solution offers a lake house data concept that we have found exciting. We are able to have a large amount of data in a data lake and can manage all relational activities."
- "The connectivity with various BI tools could be improved, specifically the performance and real time integration."
What is most valuable?
This solution offers a lakehouse data concept that we have found exciting. We are able to have a large amount of data in a data lake and can manage all relational activities. All ACID compliance properties are available and this is very useful to ensure the quality of all data.
What needs improvement?
The connectivity with various BI tools could be improved, specifically the performance and real time integration. There is also some improvement required in the semantic layers to manage the data match as well as the data warehouse features.
In a future release, we would like to have features to better manage all ML development activities.
For how long have I used the solution?
I have been using this solution for three years.
What do I think about the stability of the solution?
This is a stable solution, especially compared to other technology on the market.
What do I think about the scalability of the solution?
It is a scalable solution but this depends on the platform that is being used. If you use a cloud platform such as Azure, it offers scalability. However, some platforms will not support scalability using Databricks.
We have around 20 users in our development team using Databricks.
How are customer service and support?
The customer service and support for this solution is good.
How would you rate customer service and support?
Positive
How was the initial setup?
The initial setup is pretty simple and requires minimal configuration compared to other technology.
What's my experience with pricing, setup cost, and licensing?
I would rate the pricing for this solution a four out of five. This does depend on the environment or the infrastructure that one is using. There is a difference in pricing between using Azure or being on-premises.
Which other solutions did I evaluate?
Azure Synapse is a competitor that we evaluated but it is not mature enough to provide better performance than Databricks. We chose Databricks due to the ability to have a lot of data in data lakes and the data warehouse. We are also able to run data science activities using MLflow.
What other advice do I have?
If you are looking for custom model development and a lot of data management in a cloud agnostic manner, then Databricks is a good solution.
I would rate this solution an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Pre-sale Leader, Big Data Enterprise Solutions at Ness Technologies
Easy to load and query data with SQL support, but it is difficult to deploy and the interface could be improved
Pros and Cons
- "The most valuable feature is the ability to use SQL directly with Databricks."
- "I have seen better user interfaces, so that is something that can be improved."
What is our primary use case?
My division works with Big Data and Data Science, and Databricks is one of the tools for Big Data that we work with. We are partners with Microsoft and we began working with this solution for one specific project in the financial industry.
What is most valuable?
The most valuable feature is the ability to use SQL directly with Databricks. That is the most relevant thing for my current project.
After deployment, it is easy to load files and query data.
What needs improvement?
I have seen better user interfaces, so that is something that can be improved.
It was quite hard to deploy.
For how long have I used the solution?
I have been using Databricks for about one year.
What do I think about the stability of the solution?
We have not found any bugs yet, although it is only the beginning of our work. I do not have enough information to say for sure.
What do I think about the scalability of the solution?
We have about 200 employees but it is only a small group using Databricks. We are at the beginning so scaling is not something we have had to do.
How are customer service and technical support?
We have not had to contact technical support because we are Microsoft partners and I can call a colleague of mine who helps me directly.
Which solution did I use previously and why did I switch?
I have used Snowflake and one of the differences is that Snowflake is much easier to deploy.
How was the initial setup?
The first deployment is difficult. It is not straightforward and you have to think about a lot of stuff. It is not really like a SaaS deployment and there are a lot of steps that you have to take.
What about the implementation team?
We have our own team, which includes colleagues from Microsoft. Because the current project is a large client, they would like to see this project succeed.
What's my experience with pricing, setup cost, and licensing?
We find Databricks to be very expensive, although this improved when we found out how to shut it down at night.
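One related way to avoid paying for idle compute is cluster auto-termination. This is not necessarily what the reviewer did; it is a minimal sketch of a cluster specification that sets the `autotermination_minutes` field used by the Databricks Clusters API. The helper name `build_cluster_spec` and the runtime and node type values are hypothetical placeholders, and the code only builds the JSON payload without sending it anywhere.

```python
import json


def build_cluster_spec(name, spark_version, node_type_id, num_workers, idle_minutes=30):
    """Build a cluster-creation payload that shuts the cluster down when idle.

    `build_cluster_spec` is a hypothetical helper for illustration. The
    dictionary keys follow the Databricks Clusters API; the values passed in
    are placeholders the reader must replace with ones valid in their workspace.
    """
    return {
        "cluster_name": name,
        "spark_version": spark_version,   # placeholder: a runtime version string
        "node_type_id": node_type_id,     # placeholder: a cloud-specific node type
        "num_workers": num_workers,
        # Terminate the cluster after this many minutes of inactivity.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec(
    name="nightly-etl",
    spark_version="<runtime-version>",
    node_type_id="<node-type>",
    num_workers=2,
    idle_minutes=20,
)
print(json.dumps(spec, indent=2))
```

A payload like this would typically be submitted through the workspace UI, the CLI, or the REST API; which of these applies depends on how the workspace is managed.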
What other advice do I have?
Our client is a bank and some of the information can be shared outside of the organization, whereas some of the data is confidential and private. Using a purely on-premises solution would have made it more difficult to share information with the outside, which is one of the reasons that they wanted a cloud-based deployment.
My advice for anybody who is considering this solution is that it is very good for unstructured or semi-structured data. If, however, you have structured data then I would recommend a columnar database like Snowflake or Vertica. These solutions are easier to deploy.
This is a good solution that is working well, but I don't think that it is really a SaaS.
I would rate this solution a seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
Practice Head, Data & Analytics at a tech vendor with 10,001+ employees
Key feature is ability to make changes in structure or data size and align for subsequent consumption
Pros and Cons
- "Can cut across the entire ecosystem of open source technology to give an extra level of getting the transformatory process of the data."
- "Implementation of Databricks is still very code heavy."
What is our primary use case?
We have a team that works on Databricks for our clients. We are customers of Databricks.
What is most valuable?
Databricks can cut across the entire ecosystem of open source technology, which gives an extra level of capability for the data transformation process. The solution is primarily open source, and its components have been bolstered to make it more fit for purpose within a complete Azure data platform. It is responsible for the core transformation activities. While Azure Data Factory is very good for pulling in the data and doing the basic standardization and profiling, Databricks is more about making fundamental changes in the structure or size of the data and aligning it for subsequent consumption, or for the final layer on Synapse. It also has the power to complement and work with Spark and Python-related elements.
What needs improvement?
In my view, the fundamental approach to implementing Databricks is still very code heavy, more so than you find in Azure Data Factory and other technologies like Informatica or SQL Server Integration Services. From my perspective, that could be improved. I'd also like to have the ability to facilitate predictive analytics within the solution.
For how long have I used the solution?
I've been using the solution for a year and a half.
What do I think about the stability of the solution?
Stability of the product is good, whether it's handling large volumes, diverse elements of data or processing data at speed. It has stood the test of time. It's a solution that really lends itself to that higher level of stability, versatility and diversity in terms of its capability to process different forms of data.
What's my experience with pricing, setup cost, and licensing?
The cost of the solution is slightly on the high side so it's important to use it efficiently.
What other advice do I have?
Use the solution wisely and in tandem with Azure Data Factory. Keep this in mind in the overall design of your pipelines so that you use it to its potential. Databricks adds significant and varied data transformation capability to the Azure data stack. In terms of the license, ensure that the customer is getting what they paid for so that the value for money is realized.
I rate the solution eight out of 10.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Cloud & Infra Security, Group Manager at a tech vendor with 10,001+ employees
A scalable solution to quickly process and analyze streams of information
Pros and Cons
- "Databricks helps crunch petabytes of data in a very short period of time."
- "Costs can quickly add up if you don't plan for it."
What is our primary use case?
We are working with Databricks and SMLS in the financial sector for big data and analytics. There are a number of business cases for analysis related to debt there. Several clients are working with it, analyzing data collected over a period of time and planning the next steps in multiple business divisions.
My organization is a professional consulting service. We provide services for the other organizations, which implement and use them in a production environment. We manage, implement, and upgrade those services, but we don't use them.
What is most valuable?
Databricks helps crunch petabytes of data in a very short period of time for data scientists or business analysts. It helps with fraud analysis, finance, projections, etc. I like it.
This is exactly the purpose of big data and analytics. It provides the mechanism to process and analyze a stream of information. It's best for share analysis and stream analysis.
What needs improvement?
Costs can quickly add up if you don't plan for it.
For how long have I used the solution?
I've been using Databricks for just over a year.
What do I think about the stability of the solution?
Databricks is stable. It also helps that their support is included as part of the service.
What do I think about the scalability of the solution?
Databricks is scalable. The only issue is how much money you have for it. For example, if you need to run 100 servers, each with eight cores and 256 gigabytes of RAM, you run out of money easily. It's charged to your credit card or your account, and you'll have to pay for it if you don't plan for that in advance.
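To make the planning point concrete, here is a rough sketch of a back-of-the-envelope cost estimate for a scenario like the 100-server one above. The function name `estimate_cluster_cost` and the hourly rate are assumptions for illustration only; the rate is a placeholder and not a real Databricks or cloud price, so substitute the figures from your own pricing agreement.

```python
def estimate_cluster_cost(nodes, hours, hourly_rate_per_node):
    """Estimate spend as nodes x hours x per-node hourly rate.

    This is a deliberately simple model: it ignores discounts, storage,
    networking, and any separate platform charges.
    """
    if nodes < 0 or hours < 0 or hourly_rate_per_node < 0:
        raise ValueError("nodes, hours, and rate must be non-negative")
    return nodes * hours * hourly_rate_per_node


# Placeholder rate of 2.50 per node-hour, chosen only to show the arithmetic.
one_day = estimate_cluster_cost(nodes=100, hours=8, hourly_rate_per_node=2.50)
print(f"Estimated cost for one 8-hour run: {one_day:.2f}")  # prints 2000.00
```

Even with a modest placeholder rate, a large cluster left running adds up quickly, which is why the estimate is worth doing before the work starts.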
How are customer service and technical support?
Databricks technical support is excellent. They provided their responses on time, and they're useful. However, I don't have extensive experience with them.
Which solution did I use previously and why did I switch?
I have used different Microsoft solutions before.
How was the initial setup?
The initial setup depends on the readiness of the team working with Databricks. There is no one template saying that it's easy, and it isn't easy. It can be complex to set up if you don't have a really good plan.
You can get in this environment at least for a test. You can do it in the lab, follow it step by step, and that'll take about an hour. The difficulty depends on the business requirements.
If it's for training purposes, you can do it in about half an hour, and you're good to go. If you need it to support a business, it will be much more rigorous because multiple divisions would be interested in running their own environment, working with their data.
What's my experience with pricing, setup cost, and licensing?
The price is okay. It's competitive.
What other advice do I have?
If you're thinking of implementing Databricks, I would recommend working with professionals. It'll help you save time. Also, plan the work and work the plan. Otherwise, it'll be a waste of time and money.
On a scale from one to ten, I would give Databricks a nine.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
IT Manager: User Support at a financial services firm with 10,001+ employees
Great technology that helps us decrease costs
Pros and Cons
- "It's great technology."
- "A lot of people are required to manage this solution."
What is our primary use case?
Our primary use case is to decrease costs and prevent any security breaches affecting our data. I'm an IT manager and we are customers of Databricks.
What is most valuable?
I think what I value is more about the technology itself because you don't need to have too much knowledge to be able to use the solution.
What needs improvement?
I think we are using a lot of people to manage this solution. I'd like to see the people using this solution sharing their knowledge.
For how long have I used the solution?
We've been using this solution for around two years.
What do I think about the stability of the solution?
The stability is okay now although a month after the data load there was a limitation for the first time on the project. That sorted itself out.
What do I think about the scalability of the solution?
It's a scalable solution.
How are customer service and technical support?
We have a good connection with technical support.
What other advice do I have?
I think the point is that because we'll be working collaboratively in the future, internally and externally, we should compare experiences and exchange knowledge.
I would rate this solution an eight out of 10.
Disclosure: I am a real user, and this review is based on my own experience and opinions.