StreamSets Reviews and Pricing

reviewer2041068

Senior Network Administrator at an energy/utilities company with 201-500 employees

Dec 29, 2022

Helped us break down data silos and produce better, up-to-date reports, as well as save money

Pros and Cons

"The most valuable feature is the pipelines because they enable us to pull in and push out data from different sources and to manipulate and clean things up within them."

"The design experience is the bane of our existence because their documentation is not the best. Even when they update their software, they don't publish the best information on how to update and change your pipeline configuration to make it conform to current best practices. We don't pay for the added support. We use the "freeware version." The user community, as well as the documentation they provide for the standard user, are difficult, at best."

What is our primary use case?

We use the whole Data Collector application.

How has it helped my organization?

We now consume hundreds of terabytes more data than we did before we had StreamSets. It has definitely enabled us to do things a lot faster, and be a lot more agile, with a lot more data consumption and a lot more reporting.

Another benefit is that it has helped us to break down data silos. We now consume data across different silos and then we aggregate it together so that we can do reporting that is not just for that one silo of people but for a number of different people across the entire organization. That has had a positive effect, enabling us to save money, spend money more effectively, and have more up-to-date data in reports, as well as in auditing. Our safety processes are better too.

One way we have saved money is thanks to how the solution streamlines the data that we pull in, data that we weren't pulling in before.

StreamSets allows more people to know what's going on. It helps us with better allocation of resources, better allocation of staff, and right-sizing. We're in oil and gas and, in our case, it allows us to optimize what we're pulling out of the ground and then what we're selling.

It has helped to scale our data operations and as a result, in addition to saving money and right-sizing, it's helped our field operations and provided us with more management reporting.

Also, the data drift resilience reduces the time it takes to fix data drift breakages.

What is most valuable?

The most valuable feature is the pipelines because they enable us to pull in and push out data from different sources and to manipulate and clean things up within them.

We use StreamSets to connect to enterprise data stores, including OLTP databases and Hadoop. Connecting to them is pretty easy. It's the data manipulation and the data streaming that are the harder parts behind that, just because of the way the tool is written.

What needs improvement?

The design experience is the bane of our existence because their documentation is not the best. Even when they update their software, they don't publish the best information on how to update and change your pipeline configuration to make it conform to current best practices.

We don't pay for the added support. We use the "freeware version." The user community, as well as the documentation they provide for the standard user, are difficult, at best.

However, we have a couple of people in-house here who are experts in data analysis and they have figured out how to use this tool. We have to have people who are extremely skilled to go in and write the pipelines for this software because it's so complicated. The software works great for us, but there is an extremely steep learning curve because they don't provide a lot of information outside of paying their ridiculous support costs. Their support starts at $50,000 a year and up.

Also, getting the built-in data drift resilience to work for ETL operations requires a bunch of custom code development. It's somewhat difficult because you have to customize it a fair amount.

I also would like a more user-friendly interface and better error-trap handling.

For how long have I used the solution?

We have been using StreamSets for about four years.

What do I think about the stability of the solution?

We just patched ourselves up to the latest release about a month ago, so it's actually pretty stable at this point. It used to be quite buggy, going back over the last little while, but it's pretty stable now.

What do I think about the scalability of the solution?

This software is very scalable.

Which solution did I use previously and why did I switch?

We did not have a previous solution.

How was the initial setup?

The initial setup was somewhere between straightforward and complex. It was pretty straightforward to start with, but then it started ramping up to be more difficult as we wanted to add more stuff in.

The difficulty depends upon your data sources. If you have just one data source and you want to consume a lot of different types of data from that one source, it's pretty straightforward. But when you have 20 or 25 different data sources, and you need to pipeline all that data into a couple of data warehouses so that you can use advanced data analytics software to do reporting, analysis, and notifications, it's a lot more complicated. With every data source, it becomes exponentially more complicated to manage.

We spent a significant amount of time doing it, but otherwise, it was seamless because it was our own staff. We didn't have to worry about trying to find money or resource time or do any of the prep work needed to get external resources.

Ours is a single deployment, but it is used across our entire staff base of 200-plus people. We need three people for deployment and maintenance, whose responsibilities include software management, application management, and data analysis and management.

What was our ROI?

The ROI we have seen is in savings of time and money.

What's my experience with pricing, setup cost, and licensing?

We use the free version. It's great for a public, free release. Our stance is that the paid support model is too expensive to get into. They should honestly reevaluate that.

We tried to go and get them to look at their licensing and support model and they said they were not interested in reevaluating that in any way.

Which other solutions did I evaluate?

We tried to use another freeware ETL tool. It's fairly well-known. We ran it for a couple of months but it was going to be even more difficult than StreamSets, so we chose StreamSets in the end.

What other advice do I have?

The ease of using StreamSets to move data into modern analytics platforms, on a scale of one to 10, is about a five.

The solution enables you to build data pipelines without knowing how to code if it's the latest, state-of-the-art cloud connecting stuff. If it's for anything structured for Oracle and SQL Server and other data sources, it's difficult. Without knowing how to write code, some of it's easy and some of it is not.

My advice to someone who is considering this software is to be very aware that their integrator and data analysis people will need a very specific skill set.

Which deployment model are you using for this solution?

On-premises

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.

BahatiAsher Faith

Software Developer at Appnomu Business Services

Mar 16, 2023

Simplifies the way we perform tasks and engineer pipelines at all stages

Pros and Cons

"StreamSets Transformer is a good feature because it helps you when you are developing applications and when you don't want to write a lot of code. That is the best feature overall."

"The monitoring visualization is not that user-friendly. It should include other features to visualize things, like how many records were streamed from a source to a destination on a particular date."