What is your experience regarding pricing and costs for Apache Spark?

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines that is maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function...

Aleksandr Motuzov Head of Data Science center of excellence at Ameriabank CJSC · Answer 1 · 2024-09-23T07:34:00Z

Compared to other solutions like Doc DB, Spark is more costly because it requires significant investment in infrastructure. While cloud solutions like Databricks can simplify the process, they may also be less cost-efficient.

score 0 · Answer 2 · 2024-08-05T11:22:30Z

I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten.

Suriya Senthilkumar Analyst at Deloitte · Answer 3 · 2024-02-26T16:01:50Z

They provide an open-source license for the on-premise version. However, we have to pay for the cloud version including data centers and virtual machines.

Hamid M. Hamid Data architect at Banking Sector · Answer 4 · 2024-02-05T09:17:45Z

Apache Spark is an open-source tool. It is not an expensive product.

Suresh_Srinivasan Co-Founder at FORMCEPT Technologies · Answer 5 · 2024-01-31T11:05:00Z

The solution is moderately priced.

Sachin Shukre Sr Manager at a transportation company with 10,001+ employees · Answer 6 · 2023-12-06T10:45:56Z

It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project. If I propose using Spark for a project, one of the first questions I get from management is about the cost of Databricks Spark on the cloud platform we're using, whether it's Azure, GCP, or AWS. If we could reduce the collection, system conversion, and transformation network costs by even just 2% to 3%, it would be a significant benefit for us.

score 0 · Answer 7 · 2023-11-10T13:04:33Z

reviewer1283880

CEO International Business at a tech services company with 1,001-5,000 employees

It is an open-source solution, so it is free of charge.

Jagannadha Rao Lead Data Scientist at International School of Engineering · Answer 8 · 2023-10-20T07:41:27Z

Apache Spark is an expensive solution.

Miodrag Milojevic Senior Data Architect at Yettel · Answer 9 · 2023-07-25T11:39:52Z

Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera. But in that case, you don't have any support. If you face a problem, you might find something in the community, but you cannot ask Cloudera about it. If you have open source, you don't have support, but you have a community. Cloudera has different packages, which are licensed versions of products like Apache Spark. In this case, you can ask Cloudera for everything.

score 0 · Answer 10 · 2023-07-17T11:51:53Z

reviewer1759647

Information Technology Business Analyst at an aerospace/defense firm with 10,001+ employees

We are using the free version of the solution.

score 0 · Answer 11 · 2023-02-13T20:14:00Z

Armando Becerril

Partner / Head of Data & Analytics at Intelligence Software Consulting

Licensing costs depend on where you source the solution.

Ilya Afanasyev Senior Software Development Engineer at Yahoo! · Answer 12 · 2022-08-03T04:09:48Z

It's an open-source product. I don't know much about the licensing aspect.

Salvatore Campana CEO & Founder at Xautomata · Answer 13 · 2022-04-27T08:19:19Z

Spark is an open-source solution, so there are no licensing costs.

score 0 · Answer 14 · 2022-02-22T10:00:42Z

reviewer1185906

Manager - Data Science Competency at a tech services company with 201-500 employees

This is an open-source tool, so it can be used free of charge. There is no cost involved.

Suresh_Srinivasan Co-Founder at FORMCEPT Technologies · Answer 15 · 2021-12-28T09:52:00Z

We are using the open-source Apache Spark version under the Apache license, not the Databricks version, so support and bug resolution are delayed. The Apache license is free.

Oscar Estorach Chief Data-strategist and Director at Theworkshop.es · Answer 16 · 2021-08-18T14:51:07Z

We use the open-source version. It is free to use. However, you do need to have servers. We have three or four. They can be on-premises or in the cloud.

NitinKumar Director of Engineering at Sigmoid · Answer 17 · 2021-02-01T12:04:16Z

Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera.

Rajendran Veerappan Director at Nihil Solutions · Answer 18 · 2020-07-23T07:58:35Z

I'm unsure as to how much the licensing is for the solution. It's not an aspect of the product I deal with directly.

Gopi Krishnan Works at Ideas2IT Technologies · Answer 19 · 2020-06-10T05:20:34Z

Apache Spark is available in cloud services like AWS and Azure. We have to use the specific service for our use case. For example, we can use AWS Glue, which runs Spark, for ETL processes, or AWS EMR / Azure Databricks for on-demand data processing in the cloud. Basically, the cost depends on how much capacity we need to process the data. It is recommended to get started with a minimal configuration and stop the services when not in use.
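To make the "stop the services when not in use" advice concrete, here is a rough back-of-the-envelope sketch of how compute cost scales with running hours. The node count and hourly rate are hypothetical figures chosen for illustration, not real AWS or Azure prices, and `monthly_cost` is a made-up helper, not part of any cloud SDK.

```python
def monthly_cost(nodes: int, hourly_rate: float, hours_per_day: float, days: int = 30) -> float:
    """Estimate compute cost as nodes x hourly rate x hours the cluster runs."""
    return nodes * hourly_rate * hours_per_day * days


# Hypothetical figures for illustration only; check your provider's pricing.
NODES = 4
HOURLY_RATE = 0.50  # assumed cost per node-hour, in dollars

always_on = monthly_cost(NODES, HOURLY_RATE, hours_per_day=24)
stopped_when_idle = monthly_cost(NODES, HOURLY_RATE, hours_per_day=6)

print(f"Always on:         ${always_on:,.2f}")
print(f"Stopped when idle: ${stopped_when_idle:,.2f}")
print(f"Savings:           {1 - stopped_when_idle / always_on:.0%}")
```

Under these assumed numbers, running the cluster only six hours a day instead of around the clock cuts the compute bill by three quarters, which is why starting small and shutting down idle services matters more than the (free) license.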

score 0 · Answer 20 · 2020-02-02T10:42:14Z

The initial setup is straightforward. It took us around one week to set it up, and then the requirements and creation of the project flow and design needed to be done. The design stage took three to four weeks, so in total, it required between four and five weeks to set up.

score 0 · Answer 21 · 2019-12-23T07:05:00Z

I would suggest not trying to do everything at once. Identify the area where you want to solve the problem, start small, and expand incrementally as your vision grows. For example, if I have a problem where I need to do streaming, I just focus on the streaming and not on the machine learning that Spark offers. It offers a lot of things, but you need to focus on one thing so that you can learn. That is what I have learned from the little experience I have with Spark. You need to focus on your objective and let the tools help you rather than letting the tools drive the work. That is my advice.