We are a distributor for Hadoop. Our customers choose whether they would like to use Cloudera or another product.
Cloudera Distribution is deployed on-premise as well as on bare metal servers in AWS.
Cloudera is a very manageable solution with good support.
When you compare Cloudera with EMR, EMR has a lot of administrative features, so you don't need to manage the solution. Cloudera is not as easy, as it requires more DevOps resources than other solutions.
We have been offering this solution for five years.
Cloudera Distribution is stable.
This is a scalable solution. We have clients that have a large installation of Cloudera.
Technical support from Cloudera is fine.
The initial setup of Cloudera is difficult. After you have installed it once, it is not difficult to reproduce.
For a POC deployment, we required only one DevOps engineer. For larger-scale implementations, we also require a data engineer.
Cloudera requires a license to use.
We looked at EMR; however, Cloudera is better for on-premises deployments.
Cloudera is one of the best solutions for on-prem.
I would rate this solution an 8 out of 10.
Enterprise resource management, ease of use in terms of integration within the Hadoop ecosystem related products, and security.
Mainly they have to continuously evolve following the technology trends and replace or adapt part of their solutions accordingly.
We've used it since October 2012.
No issues encountered.
No issues encountered.
No issues encountered.
Pretty responsive and reactive compared to their competitors in the field.
It was extremely easy, and allowed less experienced personnel to get up to speed pretty fast. Any difficulties or complexities we faced were related not to the product itself but to the cluster infrastructure used.
In our case it was an in-house team including data scientists and data engineers (management and QA as well). With the appropriate training and the support offered by the vendor, it is not that hard to implement a small to medium scale project solution. However, complexity and size vary significantly between projects; therefore, it really depends.
That is not easy to answer, since Huawei has several divisions using the product in different ways. Again, pricing and licensing depend heavily on the context and the aims of the given organization: for instance, the level of support they are going to need, the type of services they are going to provide, or even the business domain they are targeting.
Two other vendors' solutions were evaluated. However, Cloudera's customer service and technical support were better than the first vendor's, and the second solution's license pricing was higher than Cloudera's.
Cloudera is doing a great job in the field offering an enterprise ready data platform. Based on my experiences I would definitely recommend it.
Very solid. Excellent user experience. Good documentation. Cloudera Manager is definitely a major selling point. Packaging for Ubuntu is great for all the components.
Before the introduction of Cloudera Manager (which actually works), all the orchestration was done with scripts and Chef, and inexperienced team members had difficulty participating in maintenance. Cloudera Manager eased the work.
More customization, better documentation for the API (basically it's the same for all Cloudera Hadoop components).
I've used it for two years.
No issues encountered.
No issues encountered.
No issues encountered.
Didn't use dedicated service or support. The documentation is a bit of a mess, but it is decent and sufficient.
Straightforward. The CDH VirtualBox image with a preconfigured environment helps for demonstration purposes.
We did it in-house.
We also looked at Hortonworks, but chose Cloudera because of my familiarity with it.
Do a comparison with Hortonworks, as it's always good to compare against another major vendor.
The most valuable feature for me are--
We used it to build an enterprise data hub.
Apache Kudu needs improvement. It's a real-time updatable database.
We used a vendor team to implement the solution.
We use the solution to maintain our legacy data warehouse for better performance and more extensive storage.
The solution's most valuable feature is the enterprise data platform.
They should work on the solution's pricing. Also, finding resources with good experience in the solution is difficult. Thus, they should upgrade their technical capabilities in the market.
They should add features like AutoML and AutoDev for enhanced machine-learning experiences. In addition, they should consider developing an integration capability similar to Informatica for an end-to-end enterprise solution.
We have been using the solution for one year.
The solution's customer support team could be better. We received their assistance only with installation and configuration.
The solution is expensive. The license costs around 10k.
Cloudera is a cost-effective solution if you need more storage space. In this case, I advise you to opt for it. I rate the solution as an eight out of ten.
I've been working on the software installation from the beginning, and we have a client in global supply chain, so we get information from Telefonica's sales and distribution. Getting all that information into this system allows us to process it, get KPIs, and create outgoing information for business intelligence tools.
In the cloud provider enterprise we get all the information from the gamers, like delays, response, and information from the games. It allows us to see if gamers are having trouble, high latency or any other kind of issue. They test that and get information about the issues in order to solve them.
I like the combination of all the tools that allow me to provide solutions and enable me to solve the use cases I'm working on. You need tools or components to foresee everything, and they are all in our emails. Sometimes you try several of them, and sometimes one will work better than the other. So you have to test the tools to see what works for you.
We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve. I believe they are working on that.
It's been quite easy to install. We only had to follow the instructions and there weren't many problems. That's important for us.
I will rate this solution a nine out of ten because nothing is ever perfect. You will always face problems, but I'm quite happy with Cloudera.
Cloudera Manager is the most valuable feature for its ease of use, its features, and the ease of upgrading and installing components. CM can also be used to set up high availability within minutes. Other features like Hive, Pig, Impala, Flume and Spark are also valuable.
It's improved our storage and the availability of analytics tools such as Hive, Pig, Impala, and Spark helps us tremendously.
I'd like to see improvements to Impala. It also needs tighter integration with Spark, data warehouses, storage systems, and the cloud. Additionally, I'd like more UIs for ecosystem components, preferably centralized in a single gateway.
I've used it for 3.5 years.
For experimental and production clusters alike, use Cloudera Manager right from the beginning. RPM installation is good for learning.
It has compatibility issues if installed on specialized hardware such as EMC Isilon, or if NodeManagers and DataNodes are not co-located. For production, draw up a detailed plan for managing a local repository for installation and upgrades. Never install from the internet for production clusters.
Most of the clusters are for experimentation and don't require support. For production clusters, implementations go through major vendors, who handle support.
It depends on the mode of installation. Cloudera Manager is always more straightforward and manageable. Avoid RPM installation as much as possible. Work out plans with the system admin for upgrades and for commissioning and decommissioning nodes. Investigate the consequences of running HBase and Hadoop in the same cluster versus separate clusters: what is the impact on system administration, cost, upgrades, data migration, resources, and so on?
The complexity kicks in when configuring parameters. Find out what the use cases are: whether the workloads are disk I/O bound or compute bound, whether there is a lot of structured data or unstructured data for text analytics, and so on.
Both vendor team and in-house, depending on the cluster size and use cases. Some customers may require a certain number of certified personnel, which is something to think about when choosing a partner.
Be prepared for a fast-changing landscape in how Hadoop works under the hood and how it is used. Each major release has usually involved changes to the file system and data structures, so consider how these would affect data migration. Ask questions such as whether to upgrade or create a new cluster, and plan for training and skill upgrades.
The features we've found most valuable are--
We were able to utilize data which was untapped previously. We've got great use cases now to drive business revenue.
It needs more standardized documentation on Hive.
I've used it for two and a half years.
It's great.
Technical Support: The level of technical support is great.
No previous solution was used, and senior management chose to bring it in.
I was not directly involved in deployment.
It was done by the vendor team, who were great.
It's good for Big Data analytics.