We use the solution for workflow distribution. It serves as an ETL tool for real-time and batch-mode processing, and we use it for most of our data work, including data warehousing.
Senior Business Development Manager at BBI Consultancy
Real User
Top 10
Mar 21, 2024
Our company's customers who need big data management use the tool. When they want to build a big data management platform on-premises, they choose Cloudera over the other options on the market because it is best suited to an on-premises deployment. If a customer wants a cloud-based solution for big data management, other tools on the market suit their requirements better. For an on-premises big data management platform, Cloudera is the best choice.
Cloudera has multiple use cases. It is a big data platform where we collect all our data and connect other systems to bring in data from multiple sources. Cloudera also has a Data Lake.
Updated: January 2025.
Head of Big Data and Analytics Competency center at OTP Bank Hungary
Real User
Nov 4, 2022
We use this solution as a data lake, pre-processing the large amount of data we have for further consumption by relational databases or advanced analytics. We use HDFS and Spark for that purpose, and we also use Cloudera Machine Learning, a Jupyter Notebook-like environment with model monitoring, a model catalog, and similar features. We are customers of Cloudera, and I'm head of the big data and analytics competency center.
This product is a framework for edge AI, and it comes with multiple ecosystems as a project. I'm a senior data architect manager, and we are consultants. We offer Cloudera to our customers, but we don't have a partnership with them.
Associate Manager at a consultancy with 501-1,000 employees
Real User
Mar 9, 2021
We use this solution to process data. When using SQL Server, you have to build indexes and fine-tune the data. We import the data from the SQL source. With a single script, we can run the jobs within minutes, which is an advantage. We use the Power BI model for the business convention. Performance in Power BI drops as you incorporate more calculations, so those calculations are captured and processed in the Hadoop layer.
We are a solution provider and this is one of the systems that we implement for our clients. Our clients for this product are in the financial industry and they use it to perform cost analysis tasks.
We are dealing with data from the telecom industry. We were using an Oracle system but our volume has increased. We now have a lot of real-time data that needs to be transformed so that it can be made available and used.
DBA team manager at a financial services firm with 1,001-5,000 employees
Real User
Jul 16, 2019
I'm part of the IT team at my company, and our primary use case for this solution is building infrastructure for advanced analytics. We copy data from our data warehouse, which is currently a relational database, to Cloudera Distribution for Hadoop and then analyze it with Python and machine learning.
Senior Consultant & Training at a tech services company with 51-200 employees
Consultant
Jul 16, 2019
I've been working on the software installation from the beginning. We have a client in global supply chain, so we get information from Telefonica's sales and distribution. Getting all that information into this system allows us to process it, get KPIs, and create outgoing information for business intelligence tools. At a cloud provider enterprise, we get all the information from the gamers, such as delays, response times, and information from the games. It allows us to see whether gamers are having trouble, such as high latency or any other kind of issue. They test that and get information about the issues in order to solve them.
Lead Consultant - Product Development at FIS (http://www.fisglobal.com/)
Real User
Jan 23, 2019
Our core product is an insurance product, and the actuarial module is quite complex. So far, SMEs have collected data from various sources into Excel sheets and run the analytics through macros, which is a very crude form of analysis. So we thought of using big data for such analysis.
Cloudera Distribution for Hadoop is the world's most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls. More enterprises have downloaded CDH than all other such distributions combined.
We share company data lakes based on cloud data on their clusters.
We use it for machine learning.
I use the solution because my data is too big. It is almost 100 TB.
Cloudera Distribution for Hadoop is used for our data lake and big data solutions.
We use the solution to maintain our legacy data warehouse for better performance and more extensive storage.
We used this solution as a data platform.
I primarily use CDH for data storage and regular dashboard reports.
In my previous organization, we used Cloudera Distribution for Hadoop for compiling website logs and application logs. We used it for log analytics.
We are in the testing phase of Cloudera Distribution for Hadoop, and we will be in production soon.
We use Cloudera Distribution for file storage. This solution is deployed on-premises.
We use the solution for data warehousing.
We are using this solution for storing Big Data in one centralized location.
Our primary use case for this solution is hosting a large amount of data on our platform, along with processing, analysis, and related work on that data.
We use it primarily for big data support for analytical applications.
We make recommendations to clients for using different models of this solution to handle data intelligently.
We primarily use the solution for external storage.