Sr Technical Analyst at Sumtotal
Very fast with low latency data on data transformations
Pros and Cons
- "It's the fastest solution on the market with low latency data on data transformations."
- "The initial setup is quite complex."
What is our primary use case?
The primary use case of this solution is streaming data. It can stream large amounts of data in small chunks, which are used for Databricks data. I've been using the solution for personal research purposes only and not for business applications. I'm a customer of Apache.
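As a rough illustration of what streaming a large amount of data in small chunks means, the sketch below groups an incoming record stream into small batches and transforms each batch as it arrives. It is plain Python rather than Spark code: the names `micro_batches` and `process_stream` are hypothetical, and the batches here are cut by record count for simplicity, whereas Spark Streaming's DStream API cuts them by a time interval.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def micro_batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a (possibly unbounded) record stream into small chunks.

    Assumption: chunks are cut by record count, not by time interval.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(records)
    while True:
        # Pull at most batch_size records without reading the whole stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_stream(
    records: Iterable[T], batch_size: int, transform: Callable[[T], U]
) -> Iterator[List[U]]:
    """Apply a per-record transformation to each chunk as it arrives."""
    for batch in micro_batches(records, batch_size):
        yield [transform(record) for record in batch]


if __name__ == "__main__":
    # Seven records, chunks of three, each value doubled.
    for result in process_stream(range(7), 3, lambda x: x * 2):
        print(result)
```

Because both functions are generators, only one chunk is held in memory at a time, which is the property that lets a large stream be processed in small pieces.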
What is most valuable?
Data streaming is the best feature of Spark, including when it's compared to Hadoop, Hive, or Cassandra. It's the fastest solution on the market with low latency data on data transformations. I like that it's open source and easy to integrate with other data sources.
What needs improvement?
The initial setup is quite complex.
For how long have I used the solution?
I've been using this solution for six months.
Buyer's Guide
Streaming Analytics
January 2025
Find out what your peers are saying about Apache, Amazon Web Services (AWS), Microsoft and others in Streaming Analytics. Updated: January 2025.
831,265 professionals have used our research since 2012.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
The solution is easily scalable in the cloud, within the limits of the subscription.
Which solution did I use previously and why did I switch?
I have previously used a variety of different streaming platforms. I have written a paper analyzing various solutions for efficient streaming for cluster analysis and have published it. I found that Spark has the most features and is the quickest solution compared to the others when it comes to the transformation of data without any latencies or issues.
How was the initial setup?
It's possible to install with a few commands. I installed it in a Linux environment. That said, the initial setup is complex because we have to learn either Java or Scala. Spark Streaming has a few features in GitHub and its libraries, so we need to write some code, with methods or functions, to integrate with any data source, and then try to run those integrations. It may be that only a high-level programmer familiar with Scala and Java can implement it. There are quite extensive prerequisites for using it properly.
What's my experience with pricing, setup cost, and licensing?
I'm using the open-source version of Spark, so there are no licensing costs.
What other advice do I have?
It's important to be familiar with Spark Streaming and the Spark libraries. Familiarity with those scripts and coding languages makes it easier to work within the Spark ecosystem when building Spark Streaming integrations or creating Spark clusters.
I rate this solution eight out of 10.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Enterprise Data Architect at a pharma/biotech company with 11-50 employees
Provides real-time data processing capabilities with efficient reliability
Pros and Cons
- "The platform’s most valuable feature for processing real-time data is its ability to handle continuous data streams."
- "Integrating event-level streaming capabilities could be beneficial."
What is most valuable?
The platform’s most valuable feature for processing real-time data is its ability to handle continuous data streams.
What needs improvement?
The product's event handling capabilities, particularly compared to Kafka, need improvement. Integrating event-level streaming capabilities could be beneficial. This aligns with the idea of expanding Spark's functionality to cover unaddressed areas, potentially enhancing its competitiveness.
For how long have I used the solution?
We have been using Apache Spark Streaming for five years.
What's my experience with pricing, setup cost, and licensing?
Spark is an affordable solution, especially considering its open-source nature. However, it could use support from experienced companies to resolve any issues effectively.
What other advice do I have?
Spark does not encounter integration issues, particularly due to its utilization of JDBC connectors. These connectors facilitate seamless integration with third-party solutions. Furthermore, successful integration with tools like SAP HANA indicates its versatility in handling various data sources. Additionally, its performance surpasses Informatica in certain scenarios, especially when real-time streaming capabilities are crucial. It remains a preferred choice for businesses requiring efficient real-time data processing.
I rate it an eight.
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Last updated: May 26, 2024
Chief Innovation & Technology Leader at a mining and metals company with 1,001-5,000 employees
Efficient, better than average, but overly developer-focused
Pros and Cons
- "The solution is better than average and some of the valuable features include efficiency and stability."
- "There could be an improvement in the area of the user configuration section, it should be less developer-focused and more business user-focused."
What is our primary use case?
The primary use of the solution is to implement predictive maintenance capabilities.
What is most valuable?
The solution is better than average and some of the valuable features include efficiency and stability.
What needs improvement?
There could be an improvement in the area of the user configuration section; it should be less developer-focused and more business user-focused. For example, it is still not plug and play like some of the cloud offerings that come ready to use. It is not at the leading edge.
For how long have I used the solution?
I have been using this solution for approximately one and a half years.
What do I think about the stability of the solution?
The solution is very stable.
How was the initial setup?
The initial setup is developer-focused but it is not very complex. I can set up a stream in less than an hour. It will stream, but it will not be a production-ready stream.
What other advice do I have?
I rate Apache Spark Streaming a six out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.