We performed a comparison between Apache Hadoop and Snowflake based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.

"The most valuable features are powerful tools for ingestion, as data is in multiple systems."
"Its integration is Hadoop's best feature because that allows us to support different tools in a big data platform."
"It is a file system for data collection. There are nodes in this cluster that contain all the information, directories, and other files. The nodes are based on the MySQL database."
"I liked that Apache Hadoop was powerful, had a lot of tools, and the fact that it was free and community-developed."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"The most valuable feature is scalability and the possibility to work with major information and open source capability."
"We selected Apache Hadoop because it is not dependent on third-party vendors."
"The scalability of Apache Hadoop is very good."
"Snowflake is an enormously useful platform. The Snowpipe feature is valuable because it allows us to load terabytes and petabytes of data into the data mart at a very low cost."
"It's ultra-fast at handling queries, which is what we find very convenient."
"The solution's customer service is good."
"As long as you don't need to worry about the storage or cost, this solution would be one of the best ones on the market for scalability purposes."
"It's user-friendly. It's SQL-driven. The fact that business can also go to this application and query because they know SQL is the biggest factor."
"The most valuable feature is the clone copy."
"Great scalability and near zero maintenance."
"The initial setup is straightforward. You just need to follow the documentation."
"General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error."
"The solution is very expensive."
"What could be improved in Apache Hadoop is its user-friendliness. It's not that user-friendly, but maybe it's because I'm new to it. Sometimes it feels so tough to use, but it could be because of two aspects: one is my incompetency, for example, I don't know about all the features of Apache Hadoop, or maybe it's because of the limitations of the platform. For example, my team is maintaining the business glossary in Apache Atlas, but if you want to change any settings at the GUI level, an advanced level of coding or programming needs to be done in the back end, so it's not user-friendly."
"The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support."
"Real-time data processing is weak. This solution is very difficult to run and implement."
"The price could be better. I think we would use it more, but the company didn't want to pay for it. Hortonworks doesn't exist anymore, and Cloudera killed the free version of Hadoop."
"It needs better user interface (UI) functionalities."
"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."
"There are a lot of features that they need to come up with. A lot of functions are missing in Snowflake, so we have to find a workaround for those. For example, OUTER APPLY is a basic function in SQL Server, but it is not there in Snowflake. So, you have to write complex code for it."
"These days, they are pushing users towards the GUI or graphical version. However, I am more familiar with the classic version. I'd like to continue to work with it using the older approach."
"Product activation queries can't be changed while executing."
"I have heard people having difficulty with the machine learning model, so there may be room for improvement."
"It's difficult to know how to size everything correctly."
"Getting data out of the tool to third-party applications is difficult."
"If we can have a feature where the results can be moved to different tabs, so that I can compare the results with earlier queries before applying the changes, it would be great."
"There is a scope for improvement. They don't currently support integration with some of the Azure and AWS native services. It would be good if they can enhance their product to integrate with these services."
Apache Hadoop is ranked 5th in Data Warehouse with 34 reviews while Snowflake is ranked 1st in Data Warehouse with 94 reviews. Apache Hadoop is rated 7.8, while Snowflake is rated 8.4. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Snowflake writes "Good usability, good data sharing and elastic compute features, and requires less DBA involvement". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Teradata and BigQuery, whereas Snowflake is most compared with BigQuery, Azure Data Factory, Teradata, Vertica and Teradata Cloud Data Warehouse. See our Apache Hadoop vs. Snowflake report.
See our list of best Data Warehouse vendors and best Cloud Data Warehouse vendors.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
Apache Hadoop is for data lake use cases, but getting data out of Hadoop for meaningful analytics does take quite a bit of work, whether through Spark, Hive, Presto, and so on. The way I look at Snowflake and Hadoop is that they complement each other: use Hadoop for the data lake, and use Snowflake for the data warehouse. Depending on the size of the company, you can turn Snowflake into a data lake use case too. Snowflake is SQL-friendly, and you don't need to jump through hoops to get data in and out of it.