We use it to store data, and our team then builds reports on top of that data.
We primarily use Kafka for intensive data streaming, and Hadoop for batch-based processing. Additionally, we have our own custom batch catalog that helps prepare data for further analysis and use.
Many of our projects use Hadoop as their only primary data store, and all of them pull data from Hadoop to produce insights and reports.
Hadoop YARN for resource management is a really good aspect. It is very good at managing large data volumes and allows us to monitor data processing effectively: we can see how much data there is, how much memory and storage is being consumed, and how resources are allocated. It's good for managing and getting a clear view of the scale of data processing.
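As a minimal sketch of the kind of monitoring described above: YARN's ResourceManager exposes cluster-wide metrics over its REST API (GET /ws/v1/cluster/metrics), which can be queried to see running applications, memory, and vCore allocation. This is not our production tooling; the ResourceManager host below is hypothetical.

```python
# Sketch: query the YARN ResourceManager REST API for cluster resource metrics.
# Assumes the ResourceManager web address is reachable at RM_URL (hypothetical host/port).
import requests

RM_URL = "http://resourcemanager.example.com:8088"  # hypothetical address


def print_cluster_metrics() -> None:
    # GET /ws/v1/cluster/metrics returns a "clusterMetrics" object.
    resp = requests.get(f"{RM_URL}/ws/v1/cluster/metrics", timeout=10)
    resp.raise_for_status()
    m = resp.json()["clusterMetrics"]

    print(f"Running applications : {m['appsRunning']}")
    print(f"Active nodes         : {m['activeNodes']}")

    total_mb = m["totalMB"]
    allocated_mb = m["allocatedMB"]
    if total_mb:
        pct = 100.0 * allocated_mb / total_mb
        print(f"Memory allocated     : {allocated_mb} MB / {total_mb} MB ({pct:.1f}%)")
    print(f"vCores allocated     : {m['allocatedVirtualCores']} / {m['totalVirtualCores']}")


if __name__ == "__main__":
    print_cluster_metrics()
```

The same numbers are visible in the ResourceManager web UI; pulling them over the API is simply convenient when you want to track resource allocation across runs.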