We are a solution provider and this is one of the systems that we implement for our clients.
Our clients for this product are in the financial industry and they use it to perform cost analysis tasks.
The most valuable feature is Kubernetes.
The price of this solution could be lowered.
We have been using the Cloudera Distribution for Hadoop for five years.
It is a stable solution.
The Cloudera Distribution for Hadoop can be scaled. Our customers are enterprise-level companies and they have about 100 users for this solution.
We offer technical support for this solution to our customers.
We did not use another solution prior to this one.
The initial setup is straightforward.
The pricing is expensive.
Cloudera really has no competition.
I would rate this solution a nine out of ten.
Enterprise resource management, ease of use in terms of integration within the Hadoop ecosystem related products, and security.
Mainly, they have to keep evolving with technology trends and replace or adapt parts of their solution accordingly.
We've used it since October 2012.
No issues encountered.
No issues encountered.
No issues encountered.
Pretty responsive and reactive compared to their competitors in the field.
It was extremely easy, and it allowed less experienced personnel to get up to speed pretty fast. Any difficulties or complexities we faced were related not to the product itself but to the cluster infrastructure used.
In our case it was an in-house team including data scientists and data engineers (management and QA as well). With the appropriate training and the support offered by the vendor, it is not that hard to implement a small to medium scale project solution. However, complexity and size vary significantly between projects, so it really depends.
That is not easy to answer, since Huawei has several divisions using the product in different ways. Again, pricing and licensing depend heavily on the context and the aims of the given organization: for instance, the level of support they are going to need, the type of services they are going to provide, or even the business domain they are targeting.
We evaluated two other providers' solutions. Cloudera's customer service and technical support were better than the first provider's, and the second solution's licence pricing was higher than Cloudera's.
Cloudera is doing a great job in the field, offering an enterprise-ready data platform. Based on my experience, I would definitely recommend it.
Our primary use case for this solution is to host a large amount of data on our platform and to do the processing and analysis there as well.
Cloudera is always developing new tools and supports a wide range of tools. We also really like the Cloudera community. You can ask any question and get an answer within a few hours. Cloudera is better than other competitors because they acquired Hortonworks.
We're processing a huge amount of data on our system. Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. A big data environment is not optional for us. Cloudera is trying to adopt new technologies.
I think open-source tools are now dominating, so Cloudera has to decide how to deal with them. I subscribe to Cloudera to get an enterprise version, but I have found that I can get some of its features from other vendors at a lower cost. They should lower the price.
We have been using Cloudera for a year.
It's stable. I have no issue regarding the stability.
It's scalable. You can add more nodes and you can expand your cluster easily.
After we open a ticket, the issue can be resolved very quickly. They have a management portal. I don't contact them directly, but I haven't heard of anybody having any problems with it.
The initial setup is complicated. We needed the vendor to install it themselves. The deployment took around three weeks. Three people were involved on our side, but they only followed up and supervised the deployment; they did not deploy anything. The vendor does it.
In terms of advice, I would say to focus on what tools are available on the market. In terms of open source, most companies are delivering open-source technologies and providing support for these tools. Now I have the option to purchase a license for whatever platform for $1. I can deliver it with another small company at a lower cost. If I were the decision-maker, I'd invest in open-source tools. Cloudera and all of these companies are trying to adapt to these big data technologies and open-source tools. Cloudera is trying to put them inside their platform so that we can have a compatible solution.
I would rate it an eight out of ten.
Cloudera Manager is the most valuable feature for its ease of use, its features, and the ease of upgrading and installing components. CM can also be used to set up high availability within minutes. Other features like Hive, Pig, Impala, Flume and Spark are also valuable.
It has improved our storage, and the availability of analytics tools such as Hive, Pig, Impala, and Spark helps us tremendously.
I'd like to see improvements to Impala. It also needs a more integrated environment with Spark, data warehouses, storage systems, and the cloud. Additionally, I'd want more UIs for the components of the ecosystem, preferably centralized in a gateway.
I've used it for 3.5 years.
For experimental and production clusters alike, use Cloudera Manager right from the beginning. RPM installation is good for learning.
It has compatibility issues if installed on specialized hardware such as EMC Isilon, or if the node manager and data nodes are not co-located. For production, draw up a detailed plan for managing a local repository for installation and upgrades. Never install from the internet for production clusters.
Most of the clusters are for experimentation and don't require support. Production clusters are implemented through major vendors, who handle support themselves.
It depends on the mode of installation. Cloudera Manager is always more straightforward and manageable. Avoid RPM installation as much as possible. Lay out plans with your system admin for upgrades and for commissioning and decommissioning nodes. Investigate the consequences of having HBase and Hadoop in the same cluster versus separate clusters: what are the impacts on system administration, cost, upgrades, data migrations, resources, etc.?
The complexity kicks in when configuring parameters. Find out what the use cases are: are they disk I/O bound or compute bound, is there a lot of structured data, or unstructured data for text analytics, etc.?
Both a vendor team and in-house, depending on the cluster size and use cases. Some customers may require a certain number of certified personnel, which is something to think about when choosing a partner.
Be prepared for a fast-changing landscape in how Hadoop works under the hood and how it is used. Each major release has usually involved a change of file system and data structure, so consider how these would impact data migration. Ask questions such as whether to upgrade or create a new cluster. Plan for training and skill upgrades.
We primarily use the solution for external storage.
The search function is the most valuable aspect of the solution.
The user infrastructure and user interface need to be improved, as well as the performance. The GUI needs to be better.
We did not previously use a different solution.
The initial setup was complex, due to the user interface. We were doing a POC, so we're still doing the deployment.
I would rate this solution seven out of 10. There's tons of room for improvement.
It has enabled us to move BI out of our OLTP database and build a data warehouse.
Some areas are under rapid development, like Spark.
I've used it for three years.
No issues with the current version.
No issues with the current version.
No issues with the current version.
It's excellent.
Technical Support: It's excellent.
We switched because Cloudera just works.
Cloudera Manager greatly simplifies initial setup.
In-house.
Make sure you have clearly articulated, doable use cases before you start.
The features I find most valuable are--
Spark with R integration is missing. Also, it is lacking Spark SQL support.
I've used it for over eight months.
We faced issues deploying Cloudera on Azure. Our machines' hard disks were getting corrupted whenever patches were applied on weekends. These issues have now been resolved.
They offer excellent support.
It was complex because it was our first deployment of Cloudera on Azure. The complexity was also high due to the large number of security features.
We are Big Data consultants, so we implement it.
Cloudera is a leader in providing distributions for Hadoop, so it was a no-brainer for us.
There were initial hiccups when deploying Cloudera on Azure but now this combo is working fine in production, so you can go for it.
We are in the testing phase of Cloudera Distribution for Hadoop, and we will be in production soon.
The procedure for operations could be simplified.
I have used Cloudera Distribution for Hadoop within the past 12 months.
The solution is reliable and stable, and it fits our requirements.
The implementation of Cloudera Distribution for Hadoop is not easy. It works on multiple nodes and can be complex for testing. The whole process took us one and a half days.
We used a local system integrator for the implementation. We had approximately five people for the implementation.
We have not had to do maintenance of the solution because we are still in the testing phase.
My advice to others is to be aware that this solution can be complex.
I rate Cloudera Distribution for Hadoop a seven out of ten.