Data Scientist at a financial services firm with 10,001+ employees
Real User
Top 10
Jul 10, 2024
Apache Spark is my go-to solution for processing large-scale datasets. I would recommend it 100%. One of the main reasons is its ease of use. You can start using it on your laptop without any extra infrastructure, and then you can take that same code and run it anywhere else, including on the cloud. You're not locked in by any vendor, which is a significant advantage. Overall, I rate the solution a nine out of ten as a big data processing engine.
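As a rough illustration of the portability this reviewer describes, here is a minimal Scala sketch; the object name, input path, and the local[*] master are assumptions for illustration, not details from the review. The same job runs on a laptop with no extra infrastructure, and pointing it at a cluster master (or submitting it with spark-submit) runs the identical code on a cluster or in the cloud.

```scala
import org.apache.spark.sql.SparkSession

object WordCountDemo {
  def main(args: Array[String]): Unit = {
    // Runs on a laptop with no extra infrastructure. Swapping local[*] for a
    // cluster master URL (or letting spark-submit set it) runs the same code
    // on a cluster or in the cloud.
    val spark = SparkSession.builder()
      .appName("word-count-demo")
      .master("local[*]")
      .getOrCreate()

    import spark.implicits._

    // Hypothetical input file; replace with a local path or a cloud URI.
    val lines = spark.read.textFile("data/sample.txt")

    val counts = lines
      .flatMap(_.split("\\s+"))
      .groupByKey(identity)
      .count()

    counts.show(10)
    spark.stop()
  }
}
```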
Apache Spark is a good product for processing large volumes of data compared to other distributed systems. It provides efficient integration with Hadoop and other platforms. I rate it a ten out of ten.
The tool is used for real-time data analytics because it is very powerful and reliable. The code that you write with Apache Spark is stable; bugs tend to come from your own Java or Scala code rather than from the engine itself, which is amazing. Apache Spark is very reliable, powerful, and fast as an engine. Compared with a competitor like MapReduce, Apache Spark performs 100 times better. The monitoring part of the product is good. The product offers resilient clusters that can run across multiple nodes, and the tool can run with multiple clusters. Its integration capabilities with other platforms, which improve our company's workflow, are good. In terms of improvements in the data analysis area, new libraries have been launched to support AI and machine learning. My company is able to process huge datasets with Apache Spark, which adds huge value to the organization. I rate the overall solution a nine out of ten.
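A minimal sketch of the kind of real-time analytics described above, assuming a local run and a plain socket source for illustration; the host, port, and trigger interval are assumptions, and a production pipeline would more likely read from a durable source such as Kafka.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object StreamingCountsDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("streaming-counts-demo")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()

    import spark.implicits._

    // Hypothetical source: a socket on localhost:9999 (e.g. started with `nc -lk 9999`).
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", "9999")
      .load()
      .as[String]

    // Running word counts, updated as new data arrives.
    val counts = lines
      .flatMap(_.split("\\s+"))
      .groupBy("value")
      .count()

    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .trigger(Trigger.ProcessingTime("5 seconds"))
      .start()

    query.awaitTermination()
  }
}
```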
If your use case involves real-time applications with frequently changing columns or data frames, then Spark is a fantastic option for you. However, if you have a batch process and don't need structured data analysis, I would suggest avoiding it. The high cost of cloud infrastructure combined with Apache Spark can be a significant burden in such scenarios. Overall, I would rate the solution a nine out of ten.
Quantitative Developer at a marketing services firm with 11-50 employees
Real User
Top 20
Jul 6, 2023
I would recommend understanding the use case better; only go for it if it fits your use case. But it is a great tool. Overall, I would rate Apache Spark an eight out of ten.
We are well versed in Spark: the versions, the internal structure of Spark, and exactly what Spark is doing. The solution cannot be made much easier; not everything can be simplified, because it involves core data and computer science engineering, and not many people are actually aware of that. I rate Apache Spark a six out of ten.
Chief Data-strategist and Director at Theworkshop.es
Real User
Top 10
Aug 18, 2021
I have the solution installed on my computer and on our servers. You can use it on-premises or as a SaaS. I'd rate the solution at a nine out of ten. I've been very pleased with its capabilities. I would recommend the solution for the people who need to deploy projects with streaming. If you have many different sources or different types of data, and you need to put everything in the same place - like a data lake - Spark, at this moment, has the right tools. It's an important solution for data science, for data detectors. You can put all of the information in one place with Spark.
Senior Solutions Architect at a retailer with 10,001+ employees
Real User
Mar 27, 2021
I would recommend Apache Spark to new users, but it depends on the use case. Sometimes, it's not the best solution. On a scale from one to ten, I would give Apache Spark a ten.
I would definitely recommend Spark. It is a great product. I like Spark a lot, and most of the features have been quite good. Its initial learning curve is a bit high, but as you learn it, it becomes very easy. I would rate Apache Spark an eight out of ten.
I would advise planning well before implementing this solution. In enterprise corporations like ours, there are a lot of policies. You should first find out your needs, and after that, you or your team should set it up based on your needs. If your needs change during development because of the business requirements, it will be very difficult. If you are clear about your needs, it is easier to set it up. If you know how Spark is used in your project, you have to define firewall rules and cluster needs. When you set up Spark, it should be ready for people's usage, especially for remote job execution. I would rate Apache Spark a nine out of ten.
We're customers and also partners with Apache. While we are on version 2.6, we are considering upgrading to version 3.0. I'd rate the solution nine out of ten. It works very well for us and suits our purposes almost perfectly.
I would say that for some use cases, we don't have to go to Apache Spark; they can be implemented using an ordinary Python, Go, or Java application. For use cases where leveraging Apache Spark gives better performance and a reduction in processing time, we can go for Apache Spark. I would rate Apache Spark nine out of ten for the use cases that require it. I would advise using existing cloud services for implementing Apache Spark.
Lead Consultant at a tech services company with 51-200 employees
Consultant
Jan 29, 2020
The advice that I would give to someone considering this solution is to look at the key streaming characteristics of your data, such as velocity, meaning how quickly you need to access the data. These things matter when designing the solution and need to be worked out up front. I would rate Apache Spark an eight out of ten. To make it a ten, they should improve the speed, and improve data storage so that we can ingest data into the user database in more efficient ways.
Technical Consultant at a tech services company with 1-10 employees
Consultant
Dec 23, 2019
On a scale of 1 to 10, I'd put it at an eight. To make it a perfect 10, I'd like to see an improved configuration tool. Sometimes it is a nightmare on Linux trying to figure out what happened in the configuration and on the back-end, so I think installation and configuration should be better integrated with other tools. We are technical people, so we could figure it out, but if aspects like that were improved, then people who are less technical would use it and it would be more adaptable for the end user.
We use on-premises, public cloud, and private cloud deployment models. We're partners with Databricks. I'm a consultant. Our company works for large enterprises such as banks and energy companies. 17 of our workers use Apache Spark. With the cloud, there are many companies that integrate Spark. Most big data projects around the world use Spark, directly or indirectly. I'd rate the solution eight out of ten.
Senior Consultant & Training at a tech services company with 51-200 employees
Consultant
Oct 13, 2019
The work that we are doing with this solution is quite common and is very easy to do. My advice for anybody who is implementing this solution is to look at their needs and then look at the community. Normally, there are a lot of people who have already done what you need. So, even without experience, it is quite simple to do a lot of things. I would rate this solution a nine out of ten.
Principal Architect at a financial services firm with 1,001-5,000 employees
Real User
Jul 10, 2019
I would recommend the solution. I would rate it an eight or nine out of ten. In some areas I would give it a ten, but there are parts I cannot use. If you are going to use it for a consumer, then I can recommend it and you should go ahead. It doesn't work for me, as I have different clients and different engagements.
Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines that is maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function...
I'd rate the solution eight out of ten.
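The RDD description quoted above can be made concrete with a short sketch, assuming a local run with made-up data; the object name and numbers are illustrative. An RDD built from a lineage of transformations can be cached and reused across several actions, which is the kind of reuse that MapReduce's single-pass, disk-to-disk dataflow makes awkward.

```scala
import org.apache.spark.sql.SparkSession

object RddDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-demo")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()
    val sc = spark.sparkContext

    // An RDD is an immutable, partitioned collection; its lineage of
    // transformations is what lets Spark rebuild lost partitions (fault tolerance).
    val numbers = sc.parallelize(1 to 1000000, numSlices = 8)

    // Cache the derived RDD and reuse it in two separate actions without
    // re-reading the source, unlike MapReduce's one-pass map -> shuffle -> reduce.
    val evens = numbers.filter(_ % 2 == 0).cache()

    val count = evens.count()
    val sum   = evens.map(_.toLong).reduce(_ + _)

    println(s"count=$count, sum=$sum")
    spark.stop()
  }
}
```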
I rate the overall solution a ten out of ten.
I would give it a rating of seven out of ten, which, by my standards, is quite high.
I would recommend Apache Spark to other users. Overall, I rate Apache Spark an eight out of ten.
Overall, I rate the product more than eight out of ten.
This is a good solution for big data use cases and I rate it eight out of 10.
I can recommend the product. It's a nice system for batch processing huge data. I'd rate the solution eight out of ten.
I rate Apache Spark an eight out of ten.
I rate Apache Spark an eight out of ten.
I would rate Apache Spark eight out of ten.
Spark can handle small to huge data and is suitable for any size of company. I would rate Spark as eight out of ten.
I would rate this solution an eight out of ten.
I would rate it a nine out of ten.
I would rate this solution eight out of 10.