

Dataiku and Dremio compete in the data management and analytics market. Dataiku offers a more integrated data science platform, favored for its comprehensive workflow and ease of use. Dremio excels in handling complex queries and offers powerful data management capabilities, making it ideal for environments requiring robust data query performance.
Features: Dataiku provides exceptional workflow organization features, integrates seamlessly with multiple programming languages, and offers a suite for machine learning with automation capabilities. In comparison, Dremio stands out in managing data across different sources, supports open-source projects like Arrow and Nessie, and enhances data retrieval with its federated querying abilities.
Room for Improvement: Dataiku could benefit from enhancing its pricing model, improving server stability, and providing better support for non-technical users. It might also improve deep learning integration and add better visualization features. Dremio could improve in areas such as query execution time, particularly for complex queries, and could expand connector availability and interface usability while working on dynamic scaling of clusters.
Ease of Deployment and Customer Service: Dataiku offers versatile deployment options, such as on-premises, private, public, and hybrid cloud environments, providing flexibility, though its customer service receives mixed reviews regarding delays and billing issues. Dremio also supports diverse deployment strategies and is noted for prompt and effective customer service responses.
Pricing and ROI: Dataiku is known for its higher pricing, making it more suitable for larger enterprises, yet it promises a strong ROI through improved data efficiency. Dremio offers a cost-competitive alternative with substantial, flexible data management benefits, though scaling may lead to higher licensing costs. Both platforms can deliver a positive ROI, yet pricing remains a key factor in adoption decisions.
The market is competitive, and Dataiku must adopt a consumption-based model instead of the current monthly model.
I consider the return on investment with Dataiku valuable because, for us, it is a single platform where all our data scientists come together to build models: it offers collaboration, everything organized in one place, proper project management, and built-in capabilities that facilitate model building.
In terms of ROI, the use of Dataiku simplifies the architecture of customers, which helps them to decommission some of their existing tools.
Dremio surely saves time, reduces costs, and all those things because we don't have to worry so much about the infrastructure to make the different tools communicate.
Dataiku partners with local industry experts who understand the business better and provide support.
The support team does not provide adequate assistance.
They should not take the complaints so lightly.
We have had to reach out for customer support many times, and they respond, so they are pretty supportive about some long-term issues.
Dataiku is quite scalable; as long as I can pay for more licenses, there is no technical limitation.
Dremio's scalability can handle growing data and user demands easily.
Internally, if it's on Docker or Kubernetes, scalability will be built into the system.
In terms of stability, as long as the raw data contains no outliers, it is quite stable.
As for stability and reliability, so far so good; after the installation, I really had no problems.
I rate Dremio a nine in terms of stability.
Someone who needs to do coding can do it, and someone who does not know coding can also build solutions.
The license is very expensive.
I would love for Dataiku to allow more flexibility with code-based components and provide the possibility to extend it by developing and integrating custom components easily with existing ones.
Starburst comes with around 50 connectors now.
It should be easier to get Arctic or an open-source version of Arctic onto the software version so that development teams can experiment with it.
I see that many times the new versions of Dremio have not fixed old bugs, and in some new versions, old problems that were previously fixed come back again, so I think the upgrade part could use improvement.
There are no extra expenses beyond the existing licensing cost.
I find the pricing of Dataiku quite affordable for our customers, as they are usually large companies.
The pricing for Dataiku is very high, which is its biggest downside.
This feature is useful because it simplifies tasks and eliminates the need for a data scientist.
Dataiku primarily enhances the speed at which our customers can develop or train their machine learning models because it is a drag-and-drop platform.
It offers most of the capabilities required for data science, MLOps, and LLMOps.
Having everything under one system and an easier-to-work-with interface, along with having API integrations, adds significant value to working with Dremio.
Dremio has positively impacted my organization as nowadays we are connected to multiple databases from multiple environments, multiple APIs, and applications, and Dremio organizes everything in an amazing way for me.
You just get the source, connect the data, get visualization, get connected, and do whatever you want.
| Product | Mindshare (%) |
|---|---|
| Dataiku | 6.7% |
| Dremio | 2.4% |
| Other | 90.9% |


| Company Size | Count |
|---|---|
| Small Business | 4 |
| Midsize Enterprise | 2 |
| Large Enterprise | 11 |

| Company Size | Count |
|---|---|
| Small Business | 1 |
| Midsize Enterprise | 5 |
| Large Enterprise | 5 |
Dataiku Data Science Studio is acclaimed for its versatile capabilities in advanced analytics, data preparation, machine learning, and visualization. It streamlines complex data tasks with an intuitive visual interface, supports languages such as Python, R, and SQL, and scales efficiently to large datasets, boosting organizational efficiency and collaboration.
Dremio offers a comprehensive platform for data warehousing and data engineering, integrating seamlessly with data storage systems like Amazon S3 and Azure. Its main features include scalability, query federation, and data reflections.
Dremio's core strength lies in its ability to function as a robust data lake query engine and data warehousing solution. It facilitates the creation of complex queries with ease, thanks to its support for Apache Airflow and query federation across endpoints. Despite challenges with Delta connector support, complex query execution, and expensive licensing, users find it valuable for managing ad-hoc queries and financial data analytics. The platform aids in SQL table management and BI traffic visualization while reducing storage costs and resolving storage conflicts typical in traditional data warehouses.
Dremio is primarily implemented in industries requiring extensive data engineering and analytics, including finance and technology. Companies use it for constructing data frameworks, efficiently processing financial analytics, and visualizing BI traffic. It acts as a viable alternative to AWS Glue and Apache Hive, integrating seamlessly with multiple databases, including Oracle and MySQL, offering robust solutions for data-driven strategies. Despite some challenges, its ability to reduce data storage costs and manage complex queries makes it a favorable choice among enterprise users.