We performed a comparison between Apache Spark and QueryIO based on real PeerSpot user reviews.
"Provides a lot of good documentation compared to other solutions."
"One of Apache Spark's most valuable features is that it supports in-memory processing; jobs execute very fast compared to traditional tools."
"Spark helps us reduce startup time for our customers and gives a very high ROI in the medium term."
"Features include machine learning, real-time streaming, and data processing."
"Now, when we're tackling sentiment analysis using NLP technologies, we deal with unstructured data—customer chats, feedback on promotions or demos, and even media like images, audio, and video files. For processing such data, we rely on PySpark. Beneath the surface, Spark functions as a compute engine with in-memory processing capabilities, enhancing performance through features like broadcasting and caching. It's become a crucial tool, widely adopted by 90% of companies for a decade or more."
"The main feature that we find valuable is that it is very fast."
"I found the solution stable. We haven't had any problems with it."
"I appreciate everything about the solution, not just one or two specific features. The solution is highly stable. I rate it a perfect ten. The solution is highly scalable. I rate it a perfect ten. The initial setup was straightforward. I recommend using the solution. Overall, I rate the solution a perfect ten."
"Anyone who has even a little bit of knowledge of the solution can begin to create things. You don't have to be technical to use the solution."
"Spark could be improved by adding support for open-source storage layers other than Delta Lake."
"The product could improve the user interface and make it easier for new users."
"When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data."
"The solution needs to optimize shuffling between workers."
"API stability could be better (things were difficult when transitioning from RDDs to DataFrames, and then to Datasets)."
"I would like to see integration with data science platforms to optimize the processing capability for these tasks."
"When using Spark, users may need to write their own parallelization logic, which requires additional effort and expertise."
"There could be enhancements in optimization techniques, as there are some limitations in this area that could be addressed to further refine Spark's performance."
"There needs to be some simplification of the user interface."
Apache Spark is ranked 1st in Hadoop with 60 reviews while QueryIO is ranked 16th in Hadoop. Apache Spark is rated 8.4, while QueryIO is rated 8.0. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of QueryIO writes "Stable with good connectivity and good integration capabilities". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Cloudera Distribution for Hadoop, whereas QueryIO is most compared with Splice Machine.
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.