

Databricks and Dremio compete in the data analytics and processing category. Databricks appears to have the upper hand in terms of advanced machine learning capabilities and extensive support for programming languages, offering a more comprehensive data science platform.
Features: Databricks is praised for its robust data processing and machine learning capabilities, ease of running large-scale analytics, and support for multiple programming languages. It offers effective collaboration features through its notebooks and integration with Azure Machine Learning. Dremio, on the other hand, excels at sitting efficiently on top of various data stores, using reflections for quick data access, and providing data lineage and provenance, making it a strong choice for compliance-focused environments.
Room for Improvement: Databricks could enhance its visualization capabilities and improve integration with tools like Tableau or Power BI. Users also suggest a need for better error messages and a clearer cost structure. Dremio could enhance its SQL support, increase connector variety, and improve performance on complex queries, with more dynamic scaling policies and comprehensive documentation for its community version.
Ease of Deployment and Customer Service: Databricks is available across public and private clouds, with generally appreciated support, though some desire improved communication. Dremio supports public and hybrid cloud environments well, with praised documentation but a need for more use-case examples. Both platforms offer strong initial deployment assistance, but ongoing support experiences vary.
Pricing and ROI: Databricks employs a pay-per-use model recognized for its cost-effectiveness in handling large data volumes, despite some views of it being expensive due to cloud costs. It offers competitive ROI by offloading workloads from expensive systems. Dremio presents a less costly alternative compared to some competitors, though scaling licenses can be pricey, with its ROI appreciated for its data access and integration value.
This reduction in both time and money resulted in real-time impact and significant cost savings.
For a lot of different tasks, including machine learning, it is a nice solution.
When it comes to big data processing, I prefer Databricks over other solutions.
Dremio surely saves time, reduces costs, and all those things because we don't have to worry so much about the infrastructure to make the different tools communicate.
Whenever we reach out, they respond promptly.
As of now, we are raising issues and they are providing solutions without any problems.
I rate the technical support as fine because they have levels of technical support available, especially partners who get really good support from Databricks on new features.
We have had to reach out for customer support many times, and they respond, so they are pretty supportive about some long-term issues.
The sky's the limit with Databricks.
The patches have sometimes caused issues leading to our jobs being paused for about six hours.
Databricks is an easily scalable platform.
Dremio's scalability can handle growing data and user demands easily.
Internally, if it's on Docker or Kubernetes, scalability will be built into the system.
They release patches that sometimes break our code.
Although it is too early to definitively state the platform's stability, we have not encountered any issues so far.
Databricks is definitely a very stable product and reliable.
I rate Dremio a nine in terms of stability.
Adjusting features like worker nodes and node utilization during cluster creation could mitigate these failures.
We prefer using a small to mid-sized cluster for many jobs to keep costs low, but this sometimes doesn't support our operations properly.
We use MLflow for managing MLOps; however, further improvement would be beneficial, especially for large language models and related tools.
Starburst comes with around 50 connectors now.
It should be easier to get Arctic or an open-source version of Arctic onto the software version so that development teams can experiment with it.
I see that many times the new versions of Dremio have not fixed old bugs, and in some new versions, old problems that were previously fixed come back again, so I think the upgrade part could use improvement.
It is not a cheap solution.
I believe that in terms of credits for Databricks, we're spending between £15,000 and £20,000 a month.
Databricks' capability to process data in parallel enhances data processing speed.
The platform allows us to leverage cloud advantages effectively, enhancing our AI and ML projects.
The Unity Catalog is for data governance, and the Delta Lake is to build the lakehouse.
Having everything under one system and an easier-to-work-with interface, along with having API integrations, adds significant value to working with Dremio.
Dremio has positively impacted my organization as nowadays we are connected to multiple databases from multiple environments, multiple APIs, and applications, and Dremio organizes everything in an amazing way for me.
You just get the source, connect the data, get visualization, get connected, and do whatever you want.
| Product | Mindshare (%) |
|---|---|
| Databricks | 9.3% |
| Dremio | 2.4% |
| Other | 88.3% |
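
The mindshare figures above can be sanity-checked with a few lines of Python. This is only a minimal sketch: the dictionary re-enters the table values by hand, and the variable names are illustrative rather than part of any vendor tooling.

```python
# Mindshare percentages copied from the table above.
mindshare = {"Databricks": 9.3, "Dremio": 2.4, "Other": 88.3}

# The three shares should account for the whole category (100%).
total = round(sum(mindshare.values()), 1)

# How many times larger Databricks' mindshare is than Dremio's.
ratio = mindshare["Databricks"] / mindshare["Dremio"]

print(f"Total: {total}%")
print(f"Databricks-to-Dremio ratio: {ratio:.1f}x")
```

On these numbers, Databricks holds a little under four times Dremio's mindshare, while the two together account for under 12% of the category.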


| Company Size | Count |
|---|---|
| Small Business | 27 |
| Midsize Enterprise | 12 |
| Large Enterprise | 56 |

| Company Size | Count |
|---|---|
| Small Business | 1 |
| Midsize Enterprise | 5 |
| Large Enterprise | 5 |
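
The raw counts are easier to compare as shares of each table's total. The sketch below is a hedged illustration: `size_shares` is a hypothetical helper name, and the two dictionaries re-enter the counts in the order the tables appear, without assuming anything beyond what the tables show.

```python
def size_shares(counts):
    """Return each segment's share of the total as a percentage (one decimal)."""
    total = sum(counts.values())
    return {segment: round(100 * n / total, 1) for segment, n in counts.items()}

# Counts copied from the two company-size tables above, in order of appearance.
first_table = {"Small Business": 27, "Midsize Enterprise": 12, "Large Enterprise": 56}
second_table = {"Small Business": 1, "Midsize Enterprise": 5, "Large Enterprise": 5}

for name, counts in (("first table", first_table), ("second table", second_table)):
    print(name, size_shares(counts))
```

Large enterprises make up roughly 59% of the first table and about 45% of the second, where midsize enterprises hold an equal share. The second table covers only 11 reviewers, so its percentages are far more sensitive to a single review.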
Databricks offers a scalable, versatile platform that integrates seamlessly with Spark and multiple languages, supporting data engineering, machine learning, and analytics in a unified environment.
Databricks stands out for its scalability, ease of use, and powerful integration with Spark, multiple languages, and leading cloud services like Azure and AWS. It provides tools such as notebooks for collaboration, Delta Lake for efficient data management, and Unity Catalog for data governance. While it enhances data engineering and machine learning workflows, it faces challenges in visualization and third-party integration, with pricing and user interface navigation being common concerns. Although it needs improvements in connectivity and documentation, it remains popular for tasks like real-time processing and data pipeline management.
In the tech industry, Databricks empowers teams to perform comprehensive data analytics, enabling them to conduct extensive ETL operations, run predictive modeling, and prepare data for SparkML. In retail, it supports real-time data processing and batch streaming, aiding in better decision-making. Enterprises across sectors leverage its capabilities for creating secure APIs and managing data lakes effectively.
Dremio offers a comprehensive platform for data warehousing and data engineering, integrating seamlessly with data storage systems like Amazon S3 and Azure. Its main features include scalability, query federation, and data reflections.
Dremio's core strength lies in its ability to function as a robust data lake query engine and data warehousing solution. It facilitates the creation of complex queries with ease, thanks to its support for Apache Airflow and query federation across endpoints. Despite challenges with Delta connector support, complex query execution, and expensive licensing, users find it valuable for managing ad-hoc queries and financial data analytics. The platform aids in SQL table management and BI traffic visualization while reducing storage costs and resolving storage conflicts typical in traditional data warehouses.
Dremio is primarily implemented in industries requiring extensive data engineering and analytics, including finance and technology. Companies use it for constructing data frameworks, efficiently processing financial analytics, and visualizing BI traffic. It acts as a viable alternative to AWS Glue and Apache Hive, integrating seamlessly with multiple databases, including Oracle and MySQL, offering robust solutions for data-driven strategies. Despite some challenges, its ability to reduce data storage costs and manage complex queries makes it a favorable choice among enterprise users.
We monitor all Data Science Platforms reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.