I recommend Spark SQL, but I will need to see the results of our evaluation of Dremio. I'm especially expecting good performance because of its reflection mechanisms, which are essentially materialized views. The open question is the refresh rate; I don't know how good or bad that is. I rate Spark SQL a ten out of ten with the correct implementation.
If the user has a large volume of data, I think they should use PySpark, but for scenarios with a medium amount of data, they should not use PySpark because it has some overhead. I rate Spark SQL a nine out of ten.
Training is quite important to get users up to scratch with Spark SQL and Spark, so planning is needed in terms of training and skill sets. This training is particularly important for the typical DevOps/MLOps deployment with pipelines; otherwise, you may end up with lots of functionality and queries that are difficult to change, deploy, or maintain. I would rate this solution an eight out of ten. In terms of scalability, it is very useful.
I would rate Spark SQL a nine out of ten. My advice would be to read the Databricks books about Spark; they are a good source of knowledge. In the next update, we'd like to see better performance on small volumes of data. It is possible, but there are better tools that are faster and cheaper.
Analytics and Reporting Manager at a financial services firm with 1,001-5,000 employees
Real User
Mar 18, 2020
We will have a lot of big data, which is why we need it. Otherwise, the solution is not needed. The solution really depends on the size of your data, its complexity, and the analysis that you are doing. Spark is good, but it is not mandatory. Since I don't have experience in production with the solution, the best I can rate it now is a five (out of 10).
We use both the on-premises and cloud deployment models. We have a relationship with Cloudera and use their distribution. We don't have a relationship with Apache. Spark SQL is a good product; however, users need to have the capability of implementing the correct tools and efficiencies. I'd rate the solution seven out of ten.
Project Manager - Senior Software Engineer at a tech services company with 11-50 employees
Real User
Jul 16, 2019
We've just started using this solution. Until recently we were using it on a research basis, just to measure the performance, the cost, and so on. Many things could be improved, but so far I'm happy with it. I would recommend the product. I would rate this solution eight out of ten.
Spark SQL is a Spark module for structured data processing. Unlike the basic Spark RDD API, the interfaces provided by Spark SQL give Spark more information about the structure of both the data and the computation being performed. There are several ways to interact with Spark SQL, including SQL and the Dataset API. When computing a result, the same execution engine is used, independent of which API/language you use to express the computation. This unification means that developers...
Overall, I would rate Spark SQL as a seven out of ten.
It's pretty good to use in the initial phases. Overall, I would rate the solution an eight out of ten.
The solution is very similar to the generic Spark and SQL language. I rate the solution an eight out of ten.
I recommend this solution. Spark provides good, clear documentation that is well organized.
I rate this solution an eight out of ten and would recommend it to others.
I rate Spark SQL a ten out of ten.
Being a new user, I would rate Spark SQL a four out of ten.