R&D Solutions Architect at a tech vendor with 10,001+ employees
It has good ease of use in terms of integration within the Hadoop ecosystem related products.
What is most valuable?
Enterprise resource management, ease of use in terms of integration within the Hadoop ecosystem related products, and security.
What needs improvement?
Mainly, they need to keep evolving with technology trends and replace or adapt parts of their solution accordingly.
For how long have I used the solution?
We've used it since October 2012.
What was my experience with deployment of the solution?
No issues encountered.
Buyer's Guide
Cloudera Distribution for Hadoop
February 2025
Learn what your peers think about Cloudera Distribution for Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: February 2025.
832,138 professionals have used our research since 2012.
What do I think about the stability of the solution?
No issues encountered.
What do I think about the scalability of the solution?
No issues encountered.
How are customer service and support?
Pretty responsive and reactive compared to their competitors in the field.
How was the initial setup?
It was extremely easy, and allowed less experienced personnel to get up to speed pretty fast. Any difficulties or complexities we faced were related not to the product itself but to the cluster infrastructure used.
What about the implementation team?
In our case it was an in-house team including data scientists and data engineers (management & QA as well). With the appropriate training and the support offered by the vendor, it is not that hard to implement a small to medium scale project solution. However, complexity and size vary significantly between projects; therefore, it really depends.
What was our ROI?
That is not easy to answer, since Huawei has several divisions using the product in different ways. Likewise, pricing and licensing depend heavily on the context and aims of the given organization: for instance, the level of support it will need, the type of services it will provide, or the business domain it is targeting.
Which other solutions did I evaluate?
Two other vendors' solutions were evaluated. However, Cloudera's customer service and technical support were better than the first vendor's, and the second solution's license pricing was higher than Cloudera's.
What other advice do I have?
Cloudera is doing a great job in the field offering an enterprise ready data platform. Based on my experiences I would definitely recommend it.
Disclosure: My company has a business relationship with this vendor other than being a customer: We do have a partnership with Cloudera.
Data engineer at a tech services company with 11-50 employees
Supports a wide range of tools and has a good support community
Pros and Cons
- "We also really like the Cloudera community. You can have any question and will have your answer within a few hours."
- "Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. It's not an option actually for us to have a big data environment."
What is our primary use case?
Our primary use case for this solution is hosting a large amount of data on our platform, along with processing and analyzing it there.
What is most valuable?
Cloudera is always developing new tools and supports a wide range of tools. We also really like the Cloudera community. You can have any question and will have your answer within a few hours. Cloudera is better than other competitors because they acquired Hortonworks.
What needs improvement?
We're processing a huge amount of data on our system. Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. It's not an option actually for us to have a big data environment. Cloudera is trying to adopt new technologies.
I think open-source tools are now dominating, so Cloudera has to decide how to deal with them. I subscribe to Cloudera to get an enterprise version, but I have found that I can get some of its features from other vendors at a lower cost. They should lower the price.
For how long have I used the solution?
We have been using Cloudera for a year.
What do I think about the stability of the solution?
It's stable. I have no issue regarding the stability.
What do I think about the scalability of the solution?
It's scalable. You can add more nodes and you can expand your cluster easily.
How are customer service and technical support?
After we open a ticket, the issue can be resolved very quickly. They have a management portal. I don't contact them directly, but I haven't heard of anybody having any problems with it.
How was the initial setup?
The initial setup is complicated. We needed the vendor to install it themselves. The deployment took around three weeks. Three people on our side were involved, but only to follow up and supervise the deployment; they did not deploy anything. The vendor does it.
What other advice do I have?
In terms of the advice, I would say to focus on what tools are available on the market. In terms of open-source, most companies are delivering open source technologies and providing support to these tools. Now I have the option to purchase a license for whatever platform for $1. I can deliver it with another small company at a lower cost. If I was the decision-maker, I'd invest in open-source tools. Cloudera and all of these companies are trying to adapt to these big data technologies and open source tools. Cloudera is trying to put it inside their platform so that we can have a compatible solution.
I would rate it an eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Consultant with 10,001+ employees
Features like Hive, Pig, Impala, Flume and Spark are valuable to us.
Valuable Features
Cloudera Manager is the most valuable feature for its ease of use, its feature set, and the ease of upgrading and installing components. CM can also be used to set up high availability within minutes. Other features like Hive, Pig, Impala, Flume and Spark are also valuable.
Improvements to My Organization
It's improved our storage and the availability of analytics tools such as Hive, Pig, Impala, and Spark helps us tremendously.
Room for Improvement
I'd like to see improvements to Impala. It also needs a more integrated environment across Spark, the data warehouse, storage systems, and the cloud. Additionally, I'd like more UIs for ecosystem components, preferably centralized in a single gateway.
Use of Solution
I've used it for 3.5 years.
Deployment Issues
For experimental and production clusters alike, use Cloudera Manager right from the beginning. RPM installation is good for learning.
Stability Issues
It has compatibility issues if installed on specialized hardware such as EMC Isilon, or if node managers and data nodes are not co-located. For production, draw up a detailed plan for managing a local repo for installation and upgrades. Never install from the internet on production clusters.
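The "local repo only" rule above can be checked mechanically before an install or upgrade. The following is a minimal sketch, assuming yum-style `.repo` files in INI format; the host name `repo.internal.example`, the sample repo entries, and the function name `external_repos` are all hypothetical and not taken from any Cloudera tooling.

```python
import configparser
from urllib.parse import urlparse

# Hypothetical internal mirror hosts; replace with your own.
ALLOWED_HOSTS = {"repo.internal.example"}


def external_repos(repo_text, allowed_hosts=ALLOWED_HOSTS):
    """Return names of repo sections whose baseurl is not a local mirror."""
    # Interpolation is disabled so values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    flagged = []
    for section in parser.sections():
        baseurl = parser.get(section, "baseurl", fallback="")
        parsed = urlparse(baseurl)
        # file:// URLs point at a locally mounted repo, so they pass.
        if parsed.scheme == "file":
            continue
        # Anything else must be served by an approved internal host.
        # Sections without a baseurl are flagged too, for manual review.
        if parsed.hostname not in allowed_hosts:
            flagged.append(section)
    return flagged


# Illustrative repo definitions (made up for this sketch).
SAMPLE = """
[local-mirror]
name=Local mirror
baseurl=http://repo.internal.example/hadoop/
enabled=1

[public-repo]
name=Public repo
baseurl=https://downloads.example.com/hadoop/
enabled=1
"""

if __name__ == "__main__":
    print(external_repos(SAMPLE))
```

A check like this could run as a pre-flight step on each node, so that an install is refused while any repo definition still points outside the internal mirror.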
Customer Service and Technical Support
Most of the clusters are for experimentation and don't require support. Production clusters are implemented through major vendors, who handle support themselves.
Initial Setup
It depends on the mode of installation. Cloudera Manager is always more straightforward and manageable. Avoid RPM installation as much as possible. Work out plans with the system admin for upgrades and for commissioning and decommissioning nodes. Investigate the consequences of running HBase and Hadoop in the same cluster versus separate clusters: what is the impact on system administration, cost, upgrades, data migrations, resources, etc.?
The complexity kicks in when configuring parameters. Find out what the use cases are: are the workloads disk I/O bound or compute bound, is there a lot of structured data or unstructured data for text analytics, etc.
Implementation Team
Both vendor teams and in-house teams, depending on the cluster size and use cases. Some customers may require a certain number of certified personnel, which is something to think about when choosing a partner.
Other Advice
Be prepared for a fast-changing landscape in how Hadoop works under the hood and how it is used. Each major release has usually involved changes to the file system and data structures. Ask how these would impact data migration, and whether to upgrade in place or create a new cluster. Plan for training and skill upgrades.
Disclosure: My company has a business relationship with this vendor other than being a customer: We're a system integration partner.
Project Coordinator at a manufacturing company with 1,001-5,000 employees
Good search functionality but the user interface needs improvement
Pros and Cons
- "The search function is the most valuable aspect of the solution."
- "The user infrastructure and user interface needs to be improved, as well as the performance. The GUI needs to be better."
What is our primary use case?
We primarily use the solution for external storage.
What is most valuable?
The search function is the most valuable aspect of the solution.
What needs improvement?
The user infrastructure and user interface needs to be improved, as well as the performance. The GUI needs to be better.
For how long have I used the solution?
I've been using the solution for 1 year.
Which solution did I use previously and why did I switch?
We did not previously use a different solution.
How was the initial setup?
The initial setup was complex, due to the user interface. We were doing a POC, so we're still doing the deployment.
What other advice do I have?
I would rate this solution seven out of 10. There's tons of room for improvement.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Director of Data Architecture at a financial services firm with 501-1,000 employees
It has enabled us to move BI out of our OLTP database and build a data warehouse, but areas under rapid development, such as Spark, need improvement.
What is most valuable?
- Cloudera Manager
- Impala
- Sentry
How has it helped my organization?
It has enabled us to move BI out of our OLTP database and build a data warehouse.
What needs improvement?
Some areas are under rapid development, like Spark.
For how long have I used the solution?
I've used it for three years.
What was my experience with deployment of the solution?
No issues with the current version.
What do I think about the stability of the solution?
No issues with the current version.
What do I think about the scalability of the solution?
No issues with the current version.
How are customer service and technical support?
Customer Service:
It's excellent.
Technical Support:
It's excellent.
Which solution did I use previously and why did I switch?
We switched because Cloudera just works.
How was the initial setup?
Cloudera Manager greatly simplifies initial setup.
What about the implementation team?
In-house.
What other advice do I have?
Make sure you have clearly articulated, doable use cases before you start.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Lead Instructor at a tech company with 501-1,000 employees
It has fairly mature tools like Cloudera Navigator and Cloudera Manager, but it is lacking Spark SQL support.
Valuable Features:
The features I find most valuable are:
- Enterprise security features (authentication, authorization, data governance, and data protection)
- Proactive support
- Training
Improvements to My Organization:
- Providing robust infrastructure
- Fairly mature tools like Cloudera Navigator, Cloudera Manager, etc.
- Professional support enabled us to provide great customer service
- Our clients are able to perform proactive maintenance in an efficient manner
Room for Improvement:
Spark with R integration is missing. Also, it is lacking Spark SQL support.
Use of Solution:
I've used it for over eight months.
Deployment Issues:
We faced issues deploying Cloudera on Azure. Our machines' hard disks were getting corrupted whenever patches were applied on weekends. These issues have now been resolved.
Customer Service:
They offer excellent support.
Initial Setup:
It was complex because this was our first deployment of Cloudera on Azure. The large number of security features also added complexity.
Implementation Team:
We are Big Data consultants, so we implement it.
Other Solutions Considered:
Cloudera is a leader in providing distributions for Hadoop, so it was a no-brainer for us to decide.
Other Advice:
There were initial hiccups when deploying Cloudera on Azure but now this combo is working fine in production, so you can go for it.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
IT expert at a comms service provider with 201-500 employees
Reliable, stable, but difficult to use
Pros and Cons
- "The solution is reliable and stable, it fits our requirements."
- "The procedure for operations could be simplified."
What is our primary use case?
We are in the testing phase of Cloudera Distribution for Hadoop, and we will be in production soon.
What needs improvement?
The procedure for operations could be simplified.
For how long have I used the solution?
I have used Cloudera Distribution for Hadoop within the past 12 months.
What do I think about the stability of the solution?
The solution is reliable and stable, it fits our requirements.
How was the initial setup?
The implementation of Cloudera Distribution for Hadoop is not easy. It works on multiple nodes and can be complex for testing. The whole process took us one and a half days.
What about the implementation team?
We used a local system integrator for the implementation. We had approximately five people for the implementation.
We have not had to do maintenance of the solution because we are still in the testing phase.
What other advice do I have?
My advice to others is this solution can be complex.
I rate Cloudera Distribution for Hadoop a seven out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Software Design Engineer at a marketing services firm with 501-1,000 employees
It automates the installation and configuration of Hadoop, but it should not provide generic logs for failed installations.
What is most valuable?
It automates the installation and configuration of Hadoop and different Big Data services.
What needs improvement?
We're currently trying to recover from a failed installation, and it's a little bit difficult. It should restart the installation where it left off.
For how long have I used the solution?
I've used it for two years.
What was my experience with deployment of the solution?
- In some cases, logs are clear about failed services.
- For some failed deployment steps, however, the logs are generic; they should be more specific.
How are customer service and technical support?
7/10 - they have forums where they will answer your query within a day.
Which solution did I use previously and why did I switch?
We previously used Hortonworks and changed because Cloudera is simpler and more interactive.
How was the initial setup?
It was very straightforward.
What about the implementation team?
We did it in-house. They have good technical support to help with implementation.
What's my experience with pricing, setup cost, and licensing?
We use the free version, and they provide everything we need.
What other advice do I have?
Implement the free version as it provides enough services. If you want a backup service, or any extra service, then you can implement the enterprise version.
Disclosure: I am a real user, and this review is based on my own experience and opinions.