Sr. Data Quality Analyst at Seek
Can use different technologies to do data analysis and can quickly get data
Pros and Cons
- "Databricks makes it really easy to use a number of technologies to do data analysis. In terms of languages, we can use Scala, Python, and SQL. Databricks enables you to run very large queries, at a massive scale, within really good timeframes."
- "Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present."
What is our primary use case?
We use it for data analysis and testing of high-volume web user behavioral data.
What is most valuable?
Databricks makes it really easy to use a number of technologies to do data analysis. In terms of languages, we can use Scala, Python, and SQL. Databricks enables you to run very large queries, at a massive scale, within really good timeframes.
I'm starting to build a solution using Delta Live Tables and Delta Live pipelines, and it is proving to be exceptionally easy to use. I have also been able to quickly implement a pipeline.
What needs improvement?
Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present.
For how long have I used the solution?
I've been using Databricks for a year.
Buyer's Guide
Databricks
December 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
824,067 professionals have used our research since 2012.
What do I think about the stability of the solution?
It is a stable and reliable solution. I'd rate stability at eight out of ten.
What do I think about the scalability of the solution?
Databricks is absolutely scalable, and I'd rate scalability at eight out of ten. We probably have between 60 and 100 users in our organization, and we hope to increase usage in the future.
How are customer service and support?
The technical support staff we have worked with have been amazing. They helped us initially with our Delta Live pipelines. I would give them a rating of ten out of ten.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
I have previously worked with Apache Hadoop, and Databricks is definitely a better product. It's much easier to get data quickly in Databricks, which takes away a lot of the drudgery. With Hadoop, it's a bit trickier to get data together.
What's my experience with pricing, setup cost, and licensing?
We're charged based on data throughput and compute time.
What other advice do I have?
I'd strongly recommend giving Databricks a try. We have found it to be a fantastic tool that has accelerated some of our solutions. We're an AI-heavy shop, and there are a lot of data scientists using the MLflow capabilities. I hear a lot of good things from that side as well. From a data analysis point of view, Databricks has been fantastic, and I would rate it at eight on a scale from one to ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Science Lead at a mining and metals company with 10,001+ employees
Scalable and reliable, with helpful support
Pros and Cons
- "It can send out large amounts of data."
- "It's not easy to use, and they need a better UI."
What is our primary use case?
We use this solution to build skill and text classification models.
What is most valuable?
The scalability brings value to this solution.
It can send out large amounts of data.
What needs improvement?
The user experience can be improved.
It's not easy to use, and they need a better UI.
For how long have I used the solution?
I have been dealing with Databricks for more than five years.
We last used this solution five months ago, on the most current version at that time.
What do I think about the stability of the solution?
This solution is quite stable. We have not had any issues with stability.
What do I think about the scalability of the solution?
It's a scalable solution. Very few people are using this solution in our organization. Most don't have the skill.
How are customer service and technical support?
We were using the free version which did not have a lot of support.
We didn't really need support at the time. I had one conversation with them and they were very nice. They were helpful.
Which solution did I use previously and why did I switch?
We are using Dataiku for one project and also SageMaker. We have some issues with scalability using SageMaker, which is why we may be going back to Databricks.
SageMaker is a very specific AI tool.
How was the initial setup?
The initial setup was okay.
What's my experience with pricing, setup cost, and licensing?
There are many different versions.
We used the trial version, which was free.
What other advice do I have?
If you have a lot of data, Databricks is a good choice.
With the migration of Microsoft and Databricks, they make it easy. It's the direction to go in.
It's a very good tool. I would rate Databricks a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Strategic Alliances & Ecosystems Manager at an outsourcing company with 501-1,000 employees
Helps to have a good data presence but needs to incorporate learning aspects
Pros and Cons
- "Databricks has helped us have a good presence in data."
- "The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with."
What is our primary use case?
The product has helped in data fabrication.
How has it helped my organization?
Databricks has helped us have a good presence in data.
What needs improvement?
The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with.
For how long have I used the solution?
I have been using the product for more than six months.
What do I think about the stability of the solution?
I rate Databricks' stability an eight out of ten.
What do I think about the scalability of the solution?
I rate the tool's scalability an eight out of ten.
How was the initial setup?
The transition to Databricks was smooth.
What's my experience with pricing, setup cost, and licensing?
Databricks' price is high.
What other advice do I have?
I rate the solution a nine out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer:
Principal Consultant/Manager at Tenzing
Processes tremendous data easily
Pros and Cons
- "The processing capacity is tremendous in the database."
- "There is room for improvement in the documentation of processes and how it works."
What is our primary use case?
In our project, we deal with Duo Special Data, which needs a lot of computing resources. A traditional warehouse cannot handle the amount of data we use, and this is where Databricks comes into the picture.
What is most valuable?
The processing capacity is tremendous in the database. We use Azure for storage, and we have not faced any challenges. The connectors to different data sources are also valuable. Moreover, it is not a language-dependent tool, so development is faster. That is one of the best features of Databricks.
What needs improvement?
There is room for improvement in the documentation of processes and how it works. I was trying to get one of the certifications, so I saw an area of improvement there.
For how long have I used the solution?
I have been using Databricks for eight to nine months.
What do I think about the stability of the solution?
It is a stable product for us. We didn't see any challenges.
What do I think about the scalability of the solution?
There are around 30 to 35 users in our organization.
How was the initial setup?
The initial setup was easy because the third-party team made the clusters for us.
What about the implementation team?
A third-party team enabled the cluster to make the setup easy for us.
What other advice do I have?
I would advise using it based on the use case because it easily handles big data. It is your go-to tool if you are dealing with massive data.
Overall, I would rate the solution a nine out of ten. The tool performs well across various use cases, documentation is available online, and it is compatible with big data systems like GCP, Azure, or AWS.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Lead Data Scientist at a manufacturing company with 10,001+ employees
A great solution that has allowed for collaboration within our organization
Pros and Cons
- "We have the ability to scale, collaborate and do machine learning."
- "The product cannot be integrated with a popular coding IDE."
What is our primary use case?
Our primary use case for this solution is research for data scientists. The solution is deployed on cloud.
How has it helped my organization?
It has allowed our data engineers, data scientists, and analysts to collaborate and work on the same platform.
What is most valuable?
We have the ability to scale, collaborate and do machine learning.
What needs improvement?
The product cannot be integrated with a popular coding IDE.
For how long have I used the solution?
We have been using this solution for approximately three years.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
The solution is scalable. There are five people using it in our organization.
How are customer service and support?
I rate my experience with customer service and support an eight out of ten.
Which solution did I use previously and why did I switch?
We previously used H2O.
How was the initial setup?
The initial setup was straightforward.
What about the implementation team?
Implementation was done in-house.
What was our ROI?
We have seen a return on investment.
What's my experience with pricing, setup cost, and licensing?
Licensing is charged on a yearly basis and costs between 25,000 and 30,000.
Which other solutions did I evaluate?
We evaluated other options but this solution was the best fit for what we required.
What other advice do I have?
I rate this solution nine out of ten. The solution is good but can be improved by integrating with a popular coding IDE.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sr Data Engineer at PIMCO
Supports several coding languages, good performance, and facilitates team collaboration
Pros and Cons
- "The load distribution capabilities are good, and you can perform data processing tasks very quickly."
- "In the future, I would like to see Data Lake support. That is something that I'm looking forward to."
What is our primary use case?
Our primary use case is ETL.
How has it helped my organization?
Using Databricks enables us to use the Data Mesh methodology, where every team performs their own ETL.
What is most valuable?
The most valuable feature is the versatility of the ecosystem. You can write code in SQL, Python, or Java.
The load distribution capabilities are good, and you can perform data processing tasks very quickly.
You can save and share notebooks between different teams.
The interface is easy to use.
What needs improvement?
The cost of this solution is high, on the expensive side.
In the future, I would like to see Data Lake support. That is something that I'm looking forward to.
For how long have I used the solution?
I worked with Databricks for approximately two years in my previous company.
What do I think about the scalability of the solution?
This is a very scalable solution. We have twenty-five data engineers that use it, and we may grow our usage.
How are customer service and support?
The technical support is okay. I would rate them a seven out of ten.
How would you rate customer service and support?
Neutral
Which solution did I use previously and why did I switch?
We did not use another similar solution prior to Databricks.
How was the initial setup?
The cloud-based deployment is simple.
If you use an on-premises deployment then there is more to do.
What about the implementation team?
We deployed it with our in-house team.
There is no maintenance required.
What was our ROI?
We have seen a return on our investment with Databricks.
What's my experience with pricing, setup cost, and licensing?
Price-wise, I would rate Databricks a three out of five.
Which other solutions did I evaluate?
When we looked into Databricks, we evaluated Azure Data Factory and some of the others on the market. We found that Databricks was one of the easiest ones to use.
What other advice do I have?
My advice for anybody that is looking into Databricks is not to use the on-premises deployment. Instead, use the cloud-based setup.
In summary, this is a good product.
I would rate this solution an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Financial Analyst 4 (Supply Chain & Financial Analytics) at Juniper Networks
Easy to collaborate with other team members who are working on it
Pros and Cons
- "Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy."
- "Databricks should have more collaborative features than it has. It should have some more customization for the jobs."
What is our primary use case?
We use the solution for reliability engineering, where we apply ML and deep learning models to identify failure patterns across different geographies and products.
What is most valuable?
Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy.
What needs improvement?
Databricks should have more collaborative features than it has. It should have some more customization for the jobs. Also, it has an average dashboarding tool. They could bring in advanced features so that we don't depend on other BI tools to build a dashboard. We are using Tableau to create dashboards. If Databricks had more advanced features, we could use Databricks entirely.
For how long have I used the solution?
I have been using Databricks for one year.
What do I think about the stability of the solution?
The product is stable. It has been giving consistent outputs without any major issues.
What do I think about the scalability of the solution?
The solution is hosted on the cloud. It supports high scalability features.
10-20 users are using this solution.
How are customer service and support?
There was a training session from Databricks where they explained how to use it. We never had to contact them because they had already given us proper training on the platform.
Which solution did I use previously and why did I switch?
I have used Alteryx before. We switched to Databricks because it can compute and turn your code into production-ready code in a few seconds. Also, the stability is relatively high.
How was the initial setup?
The initial setup is easy.
What about the implementation team?
We have a dedicated team for the deployment.
What other advice do I have?
Delta Lake is a free system. We mostly work on the data that we get from Snowflake. The model outputs from Databricks are written back to Delta Lake. It is easy for us to collaborate using Delta Lake, and the computation speed is also quite high for Delta Lake.
The learning curve for Databricks is not very steep. It's pretty easy, and you will find a lot of materials online. So, if you are comfortable coding in Python, it's very straightforward. There is nothing to worry about when using Databricks.
Overall, I rate the solution a ten out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Mar 31, 2024
Data Architect at Three Ireland (Hutchison) - Infrastructure
Processes large data for data science and data analytics purposes
Pros and Cons
- "Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours."
- "There is room for improvement in visualization."
What is our primary use case?
It's mainly used for data science, data analytics, visualization, and industrial analytics.
What is most valuable?
Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours.
So that's why it's quite convenient to use for data science, for training machine learning models. By using more computing power, you can make it even faster.
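To put that comparison in numbers, here is a minimal sketch that uses only the figures quoted above (five hours on Teradata versus three to three and a half hours on Databricks). The function and constant names are illustrative assumptions and not part of any Databricks or Teradata API, and the result says nothing about other workloads.

```python
# Rough arithmetic on the reviewer's quoted run times only.
# All names below are hypothetical helpers for illustration.

def speedup(baseline_hours: float, new_hours: float) -> float:
    """How many times faster the new run is than the baseline."""
    return baseline_hours / new_hours


def time_saved_pct(baseline_hours: float, new_hours: float) -> float:
    """Percentage of the baseline run time that is saved."""
    return (baseline_hours - new_hours) / baseline_hours * 100


TERADATA_HOURS = 5.0            # quoted Teradata run time
DATABRICKS_HOURS = (3.0, 3.5)   # quoted Databricks range

if __name__ == "__main__":
    for hours in DATABRICKS_HOURS:
        print(
            f"{hours} h on Databricks: "
            f"{speedup(TERADATA_HOURS, hours):.2f}x faster, "
            f"{time_saved_pct(TERADATA_HOURS, hours):.0f}% less time"
        )
```

On these figures, the job takes roughly 30 to 40 percent less time, which is a speedup of about 1.4x to 1.7x.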
What needs improvement?
There is room for improvement in visualization.
For how long have I used the solution?
I used it for two years. I worked with the latest update.
What do I think about the stability of the solution?
I would rate the stability a nine out of ten. I didn't face performance drops.
What do I think about the scalability of the solution?
I would rate the scalability an eight out of ten.
How are customer service and support?
Databricks' support is great. If we need any support, they are very quick with it. And they genuinely want you to use Databricks. So, whatever we ask them, they come up with multiple solutions to problem statements. That's really good.
Overall, the customer service and support are very good.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
I personally prefer using Databricks. We also considered Snowflake, but the pricing was different: it charges per query.
A data science or data analytics team needs to query the stored data again and again, so that pricing does not suit a data-heavy organization.
What was our ROI?
Databricks gives a good return on investment from a delivery perspective. We delivered multiple dashboards. Being a small organization, everyone can use Databricks, and cost-wise, it's also good for small organizations.
Which other solutions did I evaluate?
If the company is a startup, Databricks might be suitable. If a big company needs a lot of storage, Teradata might be best for them. It depends on the situation.
What other advice do I have?
Overall, I would rate the solution an eight out of ten. I would definitely recommend this solution for small organizations.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Disclosure: I am a real user, and this review is based on my own experience and opinions.