
Apache Spark vs SAP HANA comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

Apache Spark
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Compute Service (3rd), Java Frameworks (2nd)

SAP HANA
Average Rating: 8.2
Reviews Sentiment: 6.5
Number of Reviews: 84
Ranking in other categories: Data Virtualization (2nd), Embedded Database (4th), Relational Databases Tools (4th)
 

Featured Reviews

Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions. There's a lot of functionality. Each hour, our pipeline copies the changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on each run, whether or not it had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
Jayarami Reddy Pujeri - PeerSpot reviewer
Comprehensive system with real-time analytics for versatile industry applications
Our primary use case is working with various clients in industries such as pharmaceuticals and other services. We support clients as implementers of SAP HANA, providing expertise in functionality, finance, logistics, and processes. The solution is very user-friendly and supports all kinds of…

Quotes from Members


Pros

Apache Spark
"The most crucial feature for us is the streaming capability. It serves as a fundamental aspect that allows us to exert control over our operations."
"The good performance. The nice graphical management console. The long list of ML algorithms."
"It provides a scalable machine learning library."
"The features we find most valuable are the machine learning, data learning, and Spark Analytics."
"The scalability has been the most valuable aspect of the solution."
"The processing time is very much improved over the data warehouse solution that we were using."
"The product's initial setup phase was easy."
"Spark is used for transformations from large volumes of data, and it is usefully distributed."

SAP HANA
"The data storage requirement is reduced from the original database to the HANA database."
"The feature that I like the most is that we can transport the data to our web data application. SAP HANA's performance is really perfect. We're working on big data, and SAP HANA is really working on high performance. We are happy working with it."
"Some functions have good performance."
"We like that the product is both vertically and horizontally scalable, allowing us to do around 86 percent compression of documentation from 50 to seven terabytes."
"It's user-friendly so long as you use it frequently."
"SAP HANA is one of the best databases known for its performance...The new version of the solution is stable."
"The in-memory computing and the efficient response time are very good features."
"The memory is the solution's most valuable feature. It's the main feature of HANA. Others are still the regular IT databases that are on storage and are therefore much slower than HANA. The solution is quite fast."
 

Cons

Apache Spark
"Apache Spark can improve the use case scenarios from the website. There is not any information on how you can use the solution across the relational databases toward multiple databases."
"We've had problems using a Python process to try to access something in a large volume of data. It crashes if somebody gives me the wrong code because it cannot handle a large volume of data."
"I know there is always discussion about which language to write applications in and some people do love Scala. However, I don't like it."
"One limitation is that not all machine learning libraries and models support it."
"In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."
"When using Spark, users may need to write their own parallelization logic, which requires additional effort and expertise."
"Spark could be improved by adding support for other open-source storage layers than Delta Lake."
"When you are working with large, complex tasks, the garbage collection process is slow and affects performance."

SAP HANA
"The challenge right now is all databases are on S4 HANA architecture. You're running it for HANA, but not all the functionalities are available. If they can speed up getting all the databases on S4 HANA that would help."
"The high price of the product is an area of concern where improvements are required."
"The solution needs to work on its performance and make it faster."
"I don't have direct access to SAP, and instead, I need to go through the SAP office in India."
"Unlike other databases, it lacks management features that legacy databases like Oracle or SQL servers have. They need to make the solution easier to manage and offer tools that make management more effective. A lot of things you have on traditional databases you have to develop into HANA."
"I think that the pricing is high and it needs improvement."
"Technical support should be more customer-friendly."
"Uses a large amount of RAM and is costly."
 

Pricing and Cost Advice

Apache Spark
"Spark is an open-source solution, so there are no licensing costs."
"The cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
"It is an open-source platform. We do not pay for its subscription."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"They provide an open-source license for the on-premise version."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"Since we are using the Apache Spark version, not the Databricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
"The product is expensive, considering the setup."

SAP HANA
"The price of this product is good."
"There is an annual payment needed to use the solution."
"A monthly or yearly license must be purchased, although its utility will be based on the cost-benefit analysis that is reached by the individual customer."
"The pricing for SAP HANA is high. You pay a lot for the license, and you also have to pay for some add-ons."
"There is a yearly subscription to use SAP HANA. There is a fee for maintenance."
"There is an annual license to use SAP HANA."
"It comes with a significant cost."
"The pricing is relatively high for both customers and partners."
837,501 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Apache Spark
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
Comms Service Provider: 5%

SAP HANA
Manufacturing Company: 15%
Computer Software Company: 12%
Financial Services Firm: 10%
Government: 7%
 

Company Size

By reviewers (chart): Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...
What are the biggest benefits of using SAP HANA?
Based on my work with SAP HANA, the biggest benefit that it can bring to your business is total data management. This product is by SAP - a company that serves almost all needs a client may have co...
Is SAP HANA’s customer and technical support reliable?
We have been using SAP HANA for a fairly short period of time and have only taken advantage of their customer support. So far, we have not had issues that required specialized help from technical s...
Is SAP HANA difficult to set up and start using?
SAP HANA is fairly easy to set up, however, I do not think a complete beginner can do it. You certainly need some preparation - either you need to have experience with similar solutions, or with ot...
 

Also Known As

Apache Spark: No data available
SAP HANA: SAP High-Performance Analytic Appliance, HANA
 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
SAP HANA: Unilever, NHS 24, adidas Group, CHIO Aachen, Hamburg Port Authority (HPA), Bangkok Airways Public Company Limited
Find out what your peers are saying about Apache Spark vs. SAP HANA and other solutions. Updated: January 2025.