Amazon EMR offers scalability and cost-effectiveness, with auto-scaling and managed services that make it easy to use. It integrates seamlessly with Hadoop, HDFS, and multiple open-source platforms. Users appreciate its stability, reliability, and high processing efficiency with tools like Hive, Spark, and Flink. EMR provides robust data management with flexible cloud storage options. Users value its secure workflow management. Pricing is resource-based, and extensive data processing is possible without hardware management overhead.
- "Amazon EMR has multiple connectors that can connect to various data sources."
- "The security of the managed workflow and the managed services are the best features for us. Since we inherited their security model and it's all managed services, those are the key benefits for our clients."
- "The solution helps us manage huge volumes of data."
Amazon EMR needs an improved user interface, better integration with tools like Hive, Prometheus, and Grafana, and enhanced stability. Configuration complexity poses a challenge for users. Cost control and scalability need optimization. Initial startup is slow, and compatibility with legacy versions is problematic. Users also want automated cluster resizing and better support, along with improved monitoring, debugging, and web support. They suggest more flexible features and improvements in CI/CD, MLOps, and data storage management.
- "Spark jobs take longer on Amazon EMR compared to previous experiences."
- "The solution can become expensive if you are not careful."
- "The product must add some of the latest technologies to provide more flexibility to the users."