I use Hadoop as a data lake in an AI/ML solution, where it connects to various data sources and ingests their data into Hadoop. It is used to process large data volumes from sources such as RDBMSs, file systems, Kafka for real-time streaming, IoT devices, WebSockets, and API metadata.
We used the product primarily for data analysis and storage. It helps handle large data sets, performing tasks like filtering, sorting, and joining. The platform is useful for data warehousing and provides distributed coordination and synchronization functionalities.
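As a rough illustration of the kind of filtering, sorting, and joining described above, here is a minimal PySpark sketch; the table and column names (orders, customers, amount, and so on) are hypothetical placeholders, not taken from the reviewer's environment.

```python
# Minimal PySpark sketch of filtering, sorting, and joining large data sets.
# Table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("filter-sort-join-example")
         .enableHiveSupport()
         .getOrCreate())

orders = spark.table("orders")        # assumed Hive table
customers = spark.table("customers")  # assumed Hive table

result = (orders
          .filter(F.col("amount") > 1000)        # filtering
          .join(customers, on="customer_id")     # joining
          .orderBy(F.col("order_date").desc()))  # sorting

result.write.mode("overwrite").saveAsTable("high_value_orders")
```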
Our use case is a customer in Senegal who wants to migrate their Oracle data warehouse to Hadoop. I'm trying to migrate it to Hive or HBase. They're choosing between upgrading Oracle and moving to Cloudera Hadoop, and they seem to prefer Cloudera. The current data warehouse runs on Oracle DB, but we have to migrate the analytics processing to Hadoop.
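For reference, one possible way such an Oracle-to-Hive copy could look is sketched below using Spark's JDBC reader; the host, credentials, and table names are placeholders, and a real migration of large tables would more likely use Sqoop or partitioned JDBC reads.

```python
# Rough sketch of copying one Oracle table into Hive with Spark's JDBC reader.
# Connection details and table names are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("oracle-to-hive")
         .enableHiveSupport()
         .getOrCreate())

dw_table = (spark.read.format("jdbc")
            .option("url", "jdbc:oracle:thin:@//oracle-host:1521/DWH")  # placeholder
            .option("dbtable", "SALES.FACT_SALES")                      # placeholder
            .option("user", "etl_user")
            .option("password", "***")
            .option("driver", "oracle.jdbc.OracleDriver")
            .load())

# Land the data in a Hive table (Parquet by default); HBase would need a different sink.
dw_table.write.mode("overwrite").saveAsTable("dwh.fact_sales")
```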
I use the solution in my company because it makes analytical processing easy. It takes data into one cluster and then processes it. Whatever analytics I run, on GPUs or otherwise, and whatever insight I get from the data, I can say that the processing is very fast.
I use the solution in my company for security purposes. We have intranet portals that we need to ensure are not accessible by outsiders. All the data within the internal applications is only accessible with valid credentials within the domain. In general, my company uses Apache Hadoop to secure our internal applications.
We use the Hadoop File System. We usually keep the data for our tables or big data on it. Hadoop has a query engine called Hive. We write SQL queries, and the tool usually processes in a parallel environment and gets us the data on Hive.
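A minimal sketch of what such a Hive-style query might look like when submitted through Spark's Hive support is below; the payments table and its columns are assumed purely for illustration.

```python
# Sketch of running a Hive/SQL query against data stored on HDFS.
# Table and column names are made up for illustration.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-query-example")
         .enableHiveSupport()
         .getOrCreate())

# The engine splits this query into tasks that run in parallel across the cluster.
daily_totals = spark.sql("""
    SELECT   txn_date,
             SUM(amount) AS total_amount
    FROM     payments            -- assumed Hive table backed by HDFS files
    WHERE    txn_date >= '2024-01-01'
    GROUP BY txn_date
    ORDER BY txn_date
""")

daily_totals.show()
```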
I have been using the latest version of Apache Hadoop. It is a file system for data collection. The nodes in the cluster contain all the information, directories, and other files, and the nodes are backed by a MySQL database.
This solution is used for a variety of purposes, including managing enterprise data hubs, monitoring network quality, implementing an anti-fraud system, and building a data pipeline (conveyor) system.
We use the Apache Hadoop environment for big data engineering use cases. We have many applications, such as collecting, transforming, loading, and storing log event data for large organizations.
We use the solution as a data lake for our customer payment and SaaS information. We get data from various sources and then utilize and leverage that data.
I'm from the data governance team, and this is how my team uses Apache Hadoop: there's a GUI called Apache Atlas, which has a feature called the "business glossary". My team uses the business glossary from Apache Atlas and also uses Apache Ranger, another GUI where you can check who is using which data source through the Apache Hadoop platform. My team also uses the Apache Hadoop platform for AI-related use cases: the data required for any AI use case is processed with ETL, specifically with the Talend tool. My team then loads the data into Apache Hadoop, organizes it into clusters, and uses it for AI/ML cases.
Partner at a tech services company with 11-50 employees
Real User
Oct 5, 2021
There are several use cases for Hadoop. Sometimes it's used for data warehousing; other times, it's analytics. In some cases, it's used for transformation. For example, I have one client using it to decompress, compress, or encrypt data on ingestion, so they use it like an ETL engine.
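A hedged sketch of that ETL-on-ingestion pattern is below: it reads gzip-compressed text (which Spark decompresses transparently) and rewrites it as Snappy-compressed Parquet. The HDFS paths are placeholders, and encryption would typically be handled by HDFS encryption zones on the target directory rather than inside the job.

```python
# Sketch of the "ETL engine" pattern: ingest gzip-compressed text files,
# then rewrite them as Snappy-compressed Parquet on HDFS. Paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ingest-recompress").getOrCreate()

# Spark decompresses .gz input transparently while reading.
raw = spark.read.text("hdfs:///landing/raw_events/*.gz")

# Re-compress on write; encryption would normally come from HDFS encryption
# zones on the destination directory, not from this job.
(raw.write
    .option("compression", "snappy")
    .mode("overwrite")
    .parquet("hdfs:///warehouse/raw_events_parquet"))
```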
Founder & CTO at a tech services company with 1-10 employees
Real User
Dec 8, 2020
We mainly use Apache Hadoop for real-time streaming and integration, using Spark Streaming and the ecosystem of Spark technologies inside Hadoop.
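A minimal Structured Streaming sketch along those lines is below, reading events from Kafka and appending them to HDFS as Parquet; the broker address, topic, and paths are placeholders, and the job assumes the spark-sql-kafka package is on the classpath.

```python
# Minimal Structured Streaming sketch: read events from Kafka and append them
# to HDFS as Parquet. Broker, topic, and paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder broker
          .option("subscribe", "payments")                     # placeholder topic
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)"))

query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/streams/payments")
         .option("checkpointLocation", "hdfs:///checkpoints/payments")
         .outputMode("append")
         .start())

query.awaitTermination()
```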
Vice President - Finance & IT at a consumer goods company with 1-10 employees
Real User
Jul 14, 2020
As an example of a use case, when I was a contractor for Cisco, we were processing mobile network data and the volume was too big; an RDBMS couldn't support it. We started using the Hadoop framework to improve the process and get the results faster.
We are primarily dumping all the prior payment transaction data into a Hadoop-based system, and then we use some plug-and-play analytics tools to translate it.
The solution helps to store and retrieve information.
We use it to store data. Our team then takes this data to create reports on top of that.
We work on Apache Hadoop for various customers.
We use Apache Hadoop for analytics purposes.
We used Apache Hadoop mainly for ETL and data analysis.
The primary use is as a data lake.
We primarily use the solution for the enterprise data hub and big data warehouse extension.
The primary use case of this solution is data engineering and data files. The deployment model we are using is private, on-premises.
We primarily use this product to integrate legacy systems.
We use this solution for our Enterprise Data Lake.
We use it as a data lake for streaming analytical dashboards.