We performed a comparison between Apache Hadoop and Snowflake based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.

"The most valuable features are powerful tools for ingestion, as data is in multiple systems."
"Its integration is Hadoop's best feature because that allows us to support different tools in a big data platform."
"It is a file system for data collection. There are nodes in this cluster that contain all the information, directories, and other files. The nodes are based on the MySQL database."
"I liked that Apache Hadoop was powerful, had a lot of tools, and the fact that it was free and community-developed."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"The most valuable feature is scalability and the possibility to work with major information and open source capability."
"We selected Apache Hadoop because it is not dependent on third-party vendors."
"The scalability of Apache Hadoop is very good."
"Snowflake is an enormously useful platform. The Snowpipe feature is valuable because it allows us to load terabytes and petabytes of data into the data mart at a very low cost."
"It's ultra-fast at handling queries, which is what we find very convenient."
"The solution's customer service is good."
"As long as you don't need to worry about the storage or cost, this solution would be one of the best ones on the market for scalability purposes."
"It's user-friendly. It's SQL-driven. The fact that business can also go to this application and query because they know SQL is the biggest factor."
"The most valuable feature is the clone copy."
"Great scalability and near zero maintenance."
"The initial setup is straightforward. You just need to follow the documentation."
"General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error."
"The solution is very expensive."
"What could be improved in Apache Hadoop is its user-friendliness. It's not that user-friendly, but maybe it's because I'm new to it. Sometimes it feels so tough to use, but it could be because of two aspects: one is my incompetency, for example, I don't know about all the features of Apache Hadoop, or maybe it's because of the limitations of the platform. For example, my team is maintaining the business glossary in Apache Atlas, but if you want to change any settings at the GUI level, an advanced level of coding or programming needs to be done in the back end, so it's not user-friendly."
"The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support."
"Real-time data processing is weak. This solution is very difficult to run and implement."
"The price could be better. I think we would use it more, but the company didn't want to pay for it. Hortonworks doesn't exist anymore, and Cloudera killed the free version of Hadoop."
"It needs better user interface (UI) functionalities."
"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."
"There are a lot of features that they need to come up with. A lot of functions are missing in Snowflake, so we have to find a workaround for those. For example, OUTER APPLY is a basic function in SQL Server, but it is not there in Snowflake. So, you have to write complex code for it."
"These days, they are pushing users towards the GUI or graphical version. However, I am more familiar with the classic version. I'd like to continue to work with it using the older approach."
"Product activation queries can't be changed while executing."
"I have heard people having difficulty with the machine learning model, so there may be room for improvement."
"It's difficult to know how to size everything correctly."
"Getting data out of the tool to third-party applications is difficult."
"If we can have a feature where the results can be moved to different tabs, so that I can compare the results with earlier queries before applying the changes, it would be great."
"There is a scope for improvement. They don't currently support integration with some of the Azure and AWS native services. It would be good if they can enhance their product to integrate with these services."
Apache Hadoop is ranked 5th in Data Warehouse with 34 reviews while Snowflake is ranked 1st in Data Warehouse with 94 reviews. Apache Hadoop is rated 7.8, while Snowflake is rated 8.4. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Snowflake writes "Good usability, good data sharing and elastic compute features, and requires less DBA involvement". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Teradata and BigQuery, whereas Snowflake is most compared with BigQuery, Azure Data Factory, Teradata, Vertica and Teradata Cloud Data Warehouse. See our Apache Hadoop vs. Snowflake report.
See our list of best Data Warehouse vendors and best Cloud Data Warehouse vendors.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
Apache Hadoop is for data lake use cases, but getting data out of Hadoop for meaningful analytics does take quite a bit of work, whether through Spark, Hive, Presto, and so on. The way I look at Snowflake and Hadoop is that they complement each other: use Hadoop for the data lake, and use Snowflake for the data warehouse. Depending on the size of the company, you can turn Snowflake into a data lake use case too. Snowflake is SQL-friendly, and you don't need to jump through hoops to get data in and out of it.