Product Manager at a hospitality company with 51-200 employees
Provides a good bifurcation rate and accuracy, and saves time and money
Pros and Cons
- "The ability to have a good bifurcation rate and fewer mistakes is valuable."
- "One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
What is our primary use case?
We were receiving data from hospitals or any kind of healthcare service providers in the country. We were predominantly operating in the US. When we received that data, we had to classify it into different repositories or different datasets. This data was sent to different vendors, and for that, the data needed to be processed in different ways. We needed to bifurcate data at many steps with different kinds of filters. For that, we used StreamSets.
How has it helped my organization?
We could bifurcate the datasets that we received from different hospitals. We could bifurcate it on the basis of the medical requirements of the hospitals, and sometimes, on the basis of the schedule or purpose. We were obtaining data that we could then supply to some consulting firms or other sources.
StreamSets saved us time. The accuracy was pretty good, and it was definitely better than what we were using previously. Earlier, we had hired two people who were doing the job manually, and we were also using some other platform. We had to pay for them. Overall, we have saved a lot of time, and the accuracy has improved as well. We didn't calculate the time savings, but I believe we saved about three days in a week, so there were about 30% to 40% time savings.
StreamSets reduced the workload. There was a 10% to 15% reduction in the workload.
StreamSets helped us to scale our data operations. The limit at which we purchased this solution was incredible. We were never able to reach the limit that we purchased, but it helped us to increase or scale our operation. Especially in months when we received a higher number of entries, we were able to perform our work on time.
What is most valuable?
The ability to have a good bifurcation rate and fewer mistakes is valuable. In our scenario, when we had to bifurcate the data, we did not completely cut the data. We made a different route for one set of data, which went into a different operating system. There was also a complete set of data along with the original data that got cut, which once again went through the filtration process, and in this way, it kept on happening. The other solutions that were in place did not offer this capability. With the solutions that we were using earlier, we had to reuse the data again and again from the start. It was a time-consuming process.
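As a rough illustration of the routing described above, here is a minimal Python sketch. It is not StreamSets code or its API; the function name `bifurcate`, the route names, and the sample records are all hypothetical. It assumes one reading of the process: each filter copies matching records to its own route while the full dataset carries on to the next filter, so nothing has to be re-read from the start.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = dict
Stage = Tuple[str, Callable[[Record], bool]]  # (route name, filter predicate)


def bifurcate(records: Iterable[Record], stages: List[Stage]) -> Dict[str, List[Record]]:
    """Pass every record through each filter stage in order.

    A record that matches a stage's predicate is copied to that stage's
    route. The record is not removed, so later stages still see it.
    """
    routes: Dict[str, List[Record]] = {name: [] for name, _ in stages}
    for record in records:
        for name, predicate in stages:
            if predicate(record):
                routes[name].append(record)
    return routes


# Hypothetical sample data standing in for hospital records.
records = [
    {"id": 1, "department": "cardiology", "scheduled": True},
    {"id": 2, "department": "oncology", "scheduled": False},
    {"id": 3, "department": "cardiology", "scheduled": False},
]

# Hypothetical filters: one by medical requirement, one by schedule.
stages = [
    ("cardiology_vendor", lambda r: r["department"] == "cardiology"),
    ("scheduled_vendor", lambda r: r["scheduled"]),
]

routes = bifurcate(records, stages)
for name, matched in routes.items():
    print(name, [r["id"] for r in matched])
```

The input list is read once and left untouched, which is the property the earlier tools lacked in this account.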
Their support system was pretty good. When we were setting up the bifurcation protocols that we wanted to set up, we had a few support calls with them, and those were really helpful.
What needs improvement?
The design or the way they have set up the protocol is pretty good. One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing. It does not have that feature. None of the solutions provides this feature, but this is the feature that we are looking for. If we could bifurcate the data or do manual manipulation of data at any point in time, it would be a game changer.
Its initial setup could also be a bit easier.
Buyer's Guide
StreamSets
October 2024
Learn what your peers think about StreamSets. Get advice and tips from experienced pros sharing their opinions. Updated: October 2024.
814,649 professionals have used our research since 2012.
For how long have I used the solution?
I used this solution for about a year.
What do I think about the stability of the solution?
It's a stable product. We used it for about a year, and we hardly had to shut it down.
What do I think about the scalability of the solution?
We are a medium enterprise. We only have three departments in our company, and only one of the departments is using it. Salespeople don't use it. The development people don't use it. We are the ones using it, and our job is to process the information, so only one department is using the solution. We have about 18 people in the department.
Up to medium enterprises, it's a good choice. You can scale from one million to ten million data files. I don't believe they offer the service for a hundred million or one billion datasets. It isn't too scalable for large enterprises, but for small and medium enterprises, it's good.
How are customer service and support?
I'd rate them an eight out of ten. The only reason for not giving them a ten out of ten is that if you're doing very important work and you need to get the solution the same day, it's a bit tough to have the team support you in a very short period of time. They usually give you appointments about a day or two days later. Other than that, everything is good.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We were using another solution previously. The major reason for switching to StreamSets was that we needed to scale our operations. Our prior solution could have been scaled, but the cost of scaling was a bit higher. We would have had to hire one more person to be able to scale, but we did not want to hire more people, so we decided to use a completely automated solution for this part so that it could be handled by only one of our team members. That was the primary requirement. The cost-benefit analysis was done by one of our peers. His proposal was pretty good, and everyone agreed to it.
How was the initial setup?
Its initial setup is a bit tough. You need to have the technical expertise to do that. The support team is good. They help you around, but if they could make it a bit easier, it would be better.
I believe it operates only from the cloud. We also received the data from our associations on the cloud. We processed it on the cloud, and everything happened on the cloud.
The initial setup was complex because we were not able to directly link the data we were receiving with the StreamSets solution. Linking it required us to fill in or enter some information in StreamSets, but we were not able to figure out what to enter. For that part, we needed their help.
We spent about a week. For the first three days, our team members were trying their best to do it, but then we had to schedule a meeting with them. In terms of the number of people, only one person was working with our team, and there were three people working with the product. I was also involved in the product as a product manager, but I was not directly operating that system.
It didn't require any maintenance as such. Any maintenance activities were related to our side of things. There were mistakes on our end. When we were entering different data, we had to do different configurations in the system.
What was our ROI?
We did the cost-benefit analysis before buying the solution, and it performed even better than that. We were able to replace two of our staff members who were doing this work. The cost that we paid for this solution was much lower than their salaries, so on the cost-benefit side of things, it was a good deal. We saved the wages of two people, which is about $6,000 a month, and we also saved 15% of a week's time. These two were the biggest returns on the investment. The accuracy was also a bit higher.
What's my experience with pricing, setup cost, and licensing?
Its pricing is pretty much up to the mark. For smaller enterprises, it could be a big price to pay at the initial stage of operations, but the moment you have the Seed B or Seed C funding and you want to scale up your operations and aren't much worried about the funds, at that point in time, you would need a solution that could be scaled. Simultaneously, you need a solution that you don't want to use on a very long-term basis. This solution could not be applied if we were operating with all the hospital chains in the US. We were operating just with one hospital. That's why it worked pretty well, so for medium enterprises, I believe it's very good.
What other advice do I have?
To those evaluating StreamSets, I'd advise doing a cost-benefit analysis because the way of using StreamSets differs from person to person. Someone else might have a very different use case, and they may not run into profit using the solution. For us, it was a good solution because we were hiring people for this work. People were doing the job manually. We saved both time and money, so doing a cost-benefit analysis would be the best thing.
If you are looking to expand your domain or range of operations, StreamSets is very helpful. If you are just looking for a better data analytics tool that can do bifurcation on data, I believe there are other tools or services available in the market that do not focus on the expansion of operations. They focus on doing better and more complex bifurcations.
StreamSets enables you to build data pipelines without knowing how to code. After generating a few responses, you have to enter some basic syntax or code, but generally, one can do a lot of no-code stuff, which was not an important aspect for us because we were operating in the IT space, and our entire team was capable of entering all the syntaxes that were required. It was not an issue for us at any point in time. In fact, in the operations that we were performing, we only used code. When we were testing out our initial datasets, we used some no-code features that were there, but at the later stage, we used only syntaxes.
We did not connect to the messaging systems, but we connected some enterprise databases. We were operating with a set of hospitals in the US, and we had to connect with them only the first time. Afterward, it was the data that was passing through the pipeline. Initially, for a completely new user, it's a bit tricky. Some technical expertise is required. It's a bit tough, but because the support team is there, one would be able to do it.
Overall, I would rate StreamSets an eight out of ten.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
AI Engineer at Techvanguard
A no-code solution with a drag-and-drop UI, but the execution engine should be better
Pros and Cons
- "The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows."
- "The execution engine could be improved. When I was at their session, they were using some obscure platform to run. There is a controller, which controls what happens on that, but you should be able to easily do this at any of the cloud services, such as Google Cloud. You shouldn't have any issues in terms of how to run it with their online development platform or design platform, basically their execution engine. There are issues with that."
What is our primary use case?
I was working on an integration project where I was using the StreamSets platform. I was looking at both their data collector and their transformer. The idea was to integrate it with AWS SageMaker Canvas. Both of them are what they call no-code options. StreamSets is for data pipelining, managing your data flow, and transforming your data. SageMaker is AWS, and Canvas is basically their no-code option for machine learning.
I was trying to connect it to a data object repository. For AWS, that's a specific managed service called S3. I wasn't trying to run it with a data warehouse.
How has it helped my organization?
It's still in the trial stage. I don't get a 30-day trial period or anything like that. I just got to write about what's involved and then see if that's something that justifies the use case for going ahead and purchasing the license for it.
It enables you to build data pipelines without knowing how to code. It abstracts away the need for Spark or anything like that. This ability is highly important because it reduces development time.
It saves time because you don't have to write code.
It saves money by not having to hire people with specialized skills. You don't need Spark or anything like that for doing the same thing.
It helps to scale your data operations. You can get to the execution engine and provision bigger machines or bigger clusters. You can scale out to however much data you need to scale out to.
What is most valuable?
The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows.
What needs improvement?
The execution engine could be improved. When I was at their session, they were using some obscure platform to run. There is a controller, which controls what happens on that, but you should be able to easily do this at any of the cloud services, such as Google Cloud. You shouldn't have any issues in terms of how to run it with their online development platform or design platform, basically their execution engine. There are issues with that.
It can break down data silos within the organization. One person can do the whole thing with StreamSets and SageMaker Canvas, but it hasn't yet had any effect on our operations or business because it's one of those situations where you can either get a demo from them or you basically have to go to one of these sessions and they give you temporary credentials and try to work with your use case. Personally, I would change their model a bit and give a two-week trial license for a cloud platform at the very least. You can then try to get something to work or call up their technical department and say, "Look, I've been evaluating this thing for the last few days. I don't know exactly how to resolve this issue."
For how long have I used the solution?
I started using it in June of this year.
What do I think about the stability of the solution?
The whole issue of the execution engine needs to be better resolved. If you pick a cloud, why isn't it working with this cloud? Or what do I need to do to get it to work with one specific cloud service if it can be deployed across multiple clouds?
What do I think about the scalability of the solution?
It seems pretty highly scalable to me. That's not going to be an issue. Just the administration of it could be an issue.
It's currently being used in a dev department for machine learning. It's being used by the business analyst team.
How are customer service and support?
I haven't contacted their support.
Which solution did I use previously and why did I switch?
AWS has native solutions. There are AWS Data Wrangler and others that come bundled with their services, like AWS Glue. We haven't yet switched to StreamSets. It's still in the evaluation stage, but the no-code and the drag-and-drop option with a GUI are some of the things that seem to resonate with people.
How was the initial setup?
I was involved in its setup. I was the one who basically had to try to get it to run with whatever process or custom processor I developed.
It was complex to set up. I had to go to the sessions. On a couple of occasions, I was doing it directly from the cloud platform, and apparently, that wasn't the way to do it. You have to go through their universal designer platform first.
In terms of maintenance, once you're deployed from the cloud, that's all handled for you. It's managed for you directly from the cloud service. So, you don't have to worry about that. They maintain their design platform.
What about the implementation team?
I didn't use any consultant.
What's my experience with pricing, setup cost, and licensing?
I didn't get into that with the StreamSets representative. It seems to be pay-as-you-go, but I don't know exactly how they do it.
Which other solutions did I evaluate?
Alteryx is another option. It's a similar tool, and it looks almost the same as StreamSets. Alteryx is something that's available for any cloud. It doesn't matter which cloud. You go on the various clouds, and you look and see what they have.
What other advice do I have?
To those evaluating this solution, I would advise looking into how it integrates with the cloud service that you're going to try it with. Does it naturally integrate better with AWS or Azure? It's one of those situations.
I used StreamSets' ability to move data into a modern analytics platform. That's what AWS SageMaker Canvas is. It's like predictive analytics. In terms of ease of moving data into this analytics platform, doing the design on the StreamSets platform is one thing, but having the execution engine and getting that provisioned is a totally different ball game. Basically, that's where its limitation comes in.
Overall, I would rate it a seven out of ten. The issue that was never resolved for me was if you're running a compute or execution engine on AWS versus Azure versus GCP, how does that integration work because that has got nothing to do with StreamSets? That is outside of StreamSets. You're now dealing with the cloud service, and there's a good reason for that.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Software Engineer at Soft Hostings Limited
It's lightweight and well-integrated, and it saves a lot of money and time
Pros and Cons
- "What I love the most is that StreamSets is very light. It's a containerized application. It's easy to use with Docker. If you are a large organization, it's very easy to use Kubernetes."
- "There aren't enough hands-on labs, and debugging is also an issue because it takes a lot of time. Logs are not that clear when you are debugging, and you can only select a single source for a pipeline."
What is our primary use case?
StreamSets is being used in the IT department to make sure that we have a stable solution and that our configuration is secure and running smoothly. We are using it as our data analytics tool as well as for real-time prediction for various real-life business use cases. It's helping us generate new business ideas. It's a tool that allows us to share data between platforms, which also removes the dependency on other ETL tools, such as SSIS.
How has it helped my organization?
StreamSets is straightforward to use for implementing batch, streaming, or ETL pipelines once you know how to use it. The pipeline can be integrated with Azure Key Vault, which eliminates the need to share credentials with developers. The same goes for parameters. It's very easy and straightforward.
It's easy for me to connect StreamSets to enterprise data stores such as OLTP databases and Hadoop, or messaging systems such as Kafka. I've got a good experience with it, and I've been working with it for a long time. It's very easy to connect and integrate for me. However, if you are a beginner, it might not go that well in the first step.
It's easy to move data into analytics platforms using StreamSets.
StreamSets enables us to build data pipelines without knowing how to code. We don't require the best coding skills. We can use the code-free environment to quickly create pipelines. It's very helpful for that.
StreamSets is a helpful tool for pipelines. It's very easy, so we can register data collectors to control hubs using provisioning agents.
StreamSets has helped to break down data silos within our organization. It hasn't negatively affected our business. It has fortunately enhanced our development time. We are able to develop secure, stable platforms faster and even remotely.
StreamSets has saved us a lot of time. It saved us the time that we were spending developing applications manually. One budget can be used by the team to come up with a stable solution. Our time savings are 30%. Out of five hours, it has saved us around two hours.
StreamSets has reduced our workload by 35%. It has also saved us money. When you subscribe to StreamSets, it seems very expensive, but when you get to know how their integration and documentation are and how things move, it's definitely efficient. It saves a lot of money. Before implementing it, we spent around 10,000 USD to hire experts. It has saved us 10,000 USD that we would have spent on hiring experts.
What is most valuable?
What I love the most is that StreamSets is very light. It's a containerized application. It's easy to use with Docker. If you are a large organization, it's very easy to use Kubernetes.
It has a very easy and user-friendly interface. It only takes a few days for new developers to start and deploy their first pipeline. It provides an easy and powerful integrated environment with different platforms such as Kafka, Salesforce, Oracle Database, REST API, etc. The user interface is a powerful feature of StreamSets.
What needs improvement?
There are so many things that need to be improved. For the StreamSets cloud user interface, there aren't enough use cases and examples for the main problems. In addition, the hybrid data sets cannot be joined in a data connector, which is a significant limitation.
There aren't enough hands-on labs, and debugging is also an issue because it takes a lot of time. Logs are not that clear when you are debugging, and you can only select a single source for a pipeline. It isn't helpful when you need to apply the same logic for multiple sources. It becomes difficult because you need to create more pipelines and then add coordination between them.
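To illustrate the single-source limitation mentioned above, the sketch below shows the workaround in plain Python rather than in StreamSets itself: the shared logic is written once, one pipeline is created per source, and a small coordination step merges the results. All names (`normalize`, `make_pipeline`, the `crm` and `billing` sources) are hypothetical and stand in for real origins.

```python
from typing import Callable, Dict, List

Record = dict


def normalize(record: Record) -> Record:
    """Shared logic: lower-case the keys and trim string values."""
    cleaned = {}
    for key, value in record.items():
        cleaned[key.lower()] = value.strip() if isinstance(value, str) else value
    return cleaned


def make_pipeline(source_name: str, read: Callable[[], List[Record]]) -> Callable[[], List[Record]]:
    """Build one pipeline bound to exactly one source."""
    def run() -> List[Record]:
        # Apply the shared logic and tag each record with its origin.
        return [dict(normalize(r), source=source_name) for r in read()]
    return run


# Hypothetical sources; each would need its own pipeline.
sources: Dict[str, Callable[[], List[Record]]] = {
    "crm": lambda: [{"Name": " Ada "}],
    "billing": lambda: [{"Name": "Grace  "}],
}

# Coordination step: run every per-source pipeline and merge the output.
pipelines = [make_pipeline(name, read) for name, read in sources.items()]
combined = [record for run in pipelines for record in run()]
print(combined)
```

The extra list of pipelines and the merge step are the overhead being described: the logic exists once, but the plumbing is repeated for every source.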
Initially, it's hard to find out or master the logic behind it. It can be hard if you aren't technical enough. There is scope for improvement because it's not straightforward. You need to go through the documentation and make sure that you understand every step. For me, it was a challenging model.
For how long have I used the solution?
I've been using StreamSets for two and a half years.
What do I think about the stability of the solution?
It's stable enough.
What do I think about the scalability of the solution?
It's good enough. We don't use it at multiple locations. We use it at one location, and it's being used by the IT and development departments. We have five users who are using it.
How are customer service and support?
Its deployment was hard. I had to contact them so that they could help me set things up. They are good people. They make sure that you are getting the best experience and that you are getting things in the right way. Their support is good and technical. I'd rate them a 10 out of 10 because of the fact that they were able to troubleshoot the issue.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We did not use a different solution.
How was the initial setup?
In the beginning, it's very hard, but after reading the documentation, you can set up things easily. The documentation is very good and helpful.
For me, deployment was initially very hard because it required a lot of technical skills that I didn't have at that time. I had to contact the team, and they helped me with how to deploy it. The following day, I was able to set up everything. So, deployment is initially very hard, but after you become familiar with StreamSets, you can deploy it more easily.
What about the implementation team?
I deployed it myself. It doesn't require any maintenance because they take care of that.
What was our ROI?
There has been a great return on investment. We can use a single package of one thousand USD to have different applications with different people and different skills. It has saved us the money that we would have spent individually to develop those applications. Using StreamSets has saved us expenses. We have seen 40% ROI.
What's my experience with pricing, setup cost, and licensing?
It's not so favorable for small companies.
Which other solutions did I evaluate?
We didn't evaluate other options. We found StreamSets to be aligned with our expectations.
What other advice do I have?
To those evaluating this solution, I'd advise ensuring that you have someone who is an expert in StreamSets so that you can deploy it in less time. Otherwise, it won't be a great option.
I'd recommend StreamSets if you want to design a very good pipeline, but you also have to think about the budget. Its budget is not so favorable for small companies, but it's great software for businesses that want to create good data pipelines and have secure platforms. It will help your business in making sure that you are providing a stable solution to your clients.
Overall, I'd rate StreamSets a 10 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Last updated: Sep 18, 2024
Senior Software Developer at a tech vendor with 10,001+ employees
Eradicated our data silos, integrating all data files into one central system
Pros and Cons
- "The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
- "The software is very good overall. Areas for improvement are the error logging and the version history. I would like to see better, more detailed error logging information."
What is our primary use case?
The main use case of StreamSets is to work on data integration and ingesting data for DataOps and modern analytics. We also use it for integrating data files from multiple sources. We use it to build, monitor, and manage smart, continuous data pipelines.
How has it helped my organization?
The introduction of StreamSets in our organization has improved things in a significant way. The efficiency of our entire process has increased a lot and we derive high value from it. The integration of data files from multiple sources is what makes it great software for us.
The transfer of information between our teams is very smooth and efficient as well. It saves us time in transferring, collating, and integrating all of the data.
The integration part has been customized for our particular systems. Previously, we had different data silos. Now, with the introduction of StreamSets, the data silo approach has been eradicated. It has integrated all the data files into one software system, creating a central point for it.
And it has reduced our workload by 50 to 60 percent and that has definitely saved us some money on human resources.
What is most valuable?
There are two features that are most valuable for us. One is the Control Hub and the other is the Data Collector. With Data Collector, data migration has become much easier for us.
Also, the ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems.
We use the platform to incorporate modern analytics as well. That is one of our main use cases. It integrates well with our requirements. It is quite easy to move data into these analytics platforms using StreamSets because there are minimal coding requirements. The built-in applications and systems allow us to do it with ease. A first-time user could easily do it.
If there were coding requirements, it would take three or four extra resources to get things done. That aspect is very important for us. It saves us money by not needing coding manpower.
In addition, the system's data drift resilience is very effective and efficient. On our particular team, it has reduced the time it takes to fix data drift breakages by 10 to 12 man-hours per week.
What needs improvement?
The software is very good overall. Areas for improvement are the error logging and the version history. I would like to see better, more detailed error logging information. Apart from that, I don't think much improvement is required, because the software and features are very good.
For how long have I used the solution?
I have been using StreamSets for the past year.
What do I think about the stability of the solution?
The software is very stable. The stability is a solid 10 out of 10.
What do I think about the scalability of the solution?
It's definitely scalable. We started with around 10 to 12 users, and now it has reached 35 to 40 users in our particular organization. We are now using it across four to five teams.
There are a lot of other teams in our company that are trying out the free version of the software. If it's suitable for them, they will obviously go for it as well.
How are customer service and support?
Through email, they have been very good at supporting us and they're very knowledgeable as well. They go to great lengths to provide us with clear-cut answers.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We didn't use any other similar software.
What was our ROI?
It took three to four months to assess the efficiency improvements in our team. There's definitely a return on investment from the use of StreamSets. Our efficiency has been increased by 20 to 25 percent and it has helped increase revenue by 7 to 10 percent.
What's my experience with pricing, setup cost, and licensing?
I imagine the pricing is moderate because our company is renewing its license, but I'm not sure about the exact price. There are no hidden costs that I have come across.
What other advice do I have?
It's cloud-based software, so there are only minimal maintenance requirements. Our IT team takes care of the maintenance of the software, but I don't think much time is required for that. Only regular updates need to be done. It is a minimal task that can be done by one or two personnel.
Overall, it gives us a lot of efficiency and makes our transformation of data sets more effective. The value and increase in revenue it has helped us achieve make it a very good software package.
Try the free version and, if the software meets your requirements, I would definitely say get the Enterprise version. It's pretty easy to understand and it generates a great deal of smoothness for your business processes. It's a must-have for every business to improve its efficiency and effectiveness.
The major takeaway for me has to be the improvement in the efficiency of our entire process. That stands out for us. StreamSets is a great platform. And the best thing about it is that there are minimal coding requirements. Any person, even someone with a non-technical background, can easily get accustomed to the software and start using it.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Technical Lead at Sopra Steria
Easy-to-use tool with no coding required
Pros and Cons
- "StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
- "The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
What is our primary use case?
StreamSets is a wonderful data engineering, data ops tool where we can design and create data pipelines, loading on-prem data to the cloud. One of our major projects was to move data from on-premises to Azure and GCP Cloud. From there, once data is loaded, the data scientist and data analyst teams use that data to generate patterns and insights.
For a US healthcare service provider, we designed a StreamSets pipeline to connect to relational database sources. We generated a schema from the source data and loaded the data into Azure Data Lake Storage (ADLS) or another cloud store, like S3 or GCP. This was one of our batch use cases.
With StreamSets, we have also addressed real-time streaming use cases, where we were streaming data from a source Kafka topic to Azure Event Hubs. This was a trigger-based streaming pipeline, which moved data when it appeared in a Kafka topic. Since this was a streaming pipeline, it was continuously streaming data from Kafka to Azure for further analysis.
How has it helped my organization?
We can securely fetch the passwords and credentials stored in Azure Key Vault. This is a fundamentally very strong feature that has improved our day-to-day life.
What is most valuable?
It is a pretty easy tool to use. There is no coding required. StreamSets provides us a canvas to design our pipeline. At the beginning of any project, it gives us a picture, which is an advantage. For example, if I want to do a data migration from on-premise to cloud, I will draw it for easier understanding based on my target system, and StreamSets does exactly the same thing by giving us a canvas where I can design our pipeline.
There is a wide range of available stages: various sources, relational sources, streaming sources. There are various processors to transform the source data. It is not only about migrating data from source to destination; we can use different processors to transform the data along the way. When I was working on the healthcare project, there was personally identifiable information in the personal health information (PHI) data that we needed to mask. We couldn't simply move it from source to destination. StreamSets provides masking of that sensitive data.
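As a rough illustration of the masking idea described above, here is a minimal Python sketch. It is not StreamSets' own implementation, and the field names (patient_name, ssn, visit_code) and the keep-last-four rule are assumptions made up for the example.

```python
# Hypothetical field names; a real pipeline would take these from its own schema.
SENSITIVE_FIELDS = {"patient_name", "ssn"}


def mask_value(value: str, keep_last: int = 4) -> str:
    """Replace all but the last `keep_last` characters with 'x'."""
    if len(value) <= keep_last:
        return "x" * len(value)
    return "x" * (len(value) - keep_last) + value[-keep_last:]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with the sensitive fields masked."""
    masked = dict(record)
    for field in SENSITIVE_FIELDS:
        if masked.get(field) is not None:
            masked[field] = mask_value(str(masked[field]))
    return masked


record = {"patient_name": "Jane Doe", "ssn": "123-45-6789", "visit_code": "A12"}
print(mask_record(record))
```

The original record is left untouched, so the unmasked values never need to travel past this step.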
It provides us a facility to generate schema. There are different executors available, e.g., Pipeline Finisher executor, which helps us in finishing the pipeline.
There are different destinations, such as S3, Azure Data Lake, Hive, Kafka, and Hadoop-based systems. There is a wide range of available stages. It supports both batch and streaming.
Scheduling is quite easy in StreamSets. From a security perspective, there is integration with key vaults, e.g., for password fetching or secrets fetching.
It is pretty easy to connect to Hadoop using StreamSets. Someone just needs to be aware of the configuration details, such as which Hadoop cluster to connect to and what credentials will be available. For example, if I am trying with my generic user, how do I connect with the Hadoop distributed system? Once we have the details of our cluster and the credentials, we can load data to the Hadoop Distributed File System (HDFS). In our use case, we collected data from our RDBMS sources using JDBC Query Consumer. We queried the data from the source table, captured that data, and then loaded it into the destination HDFS. So, once we have the configuration details, i.e., the required credentials, we can connect with Hadoop and Hive.
It takes care of data drift. There are data rules and metric rules provided by StreamSets that we can set. So, if the source schema deviates somehow, StreamSets will automatically notify us or send alerts about what is going wrong. StreamSets also provides Change Data Capture (CDC). As soon as the source data is changed, it can capture that and update the details in the required destination.
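The query, capture, and load pattern described above can be sketched in plain Python. This is only an illustration of the idea, not how JDBC Query Consumer works internally: an in-memory SQLite database stands in for the RDBMS source, a StringIO buffer stands in for HDFS or cloud storage, and the table name, columns, and batch size are all assumptions.

```python
import csv
import io
import sqlite3


def copy_table(conn, query, out, batch_size=2):
    """Run `query` against the source and write the rows to `out` as CSV, batch by batch."""
    cursor = conn.execute(query)
    writer = csv.writer(out)
    # Header row comes from the result-set metadata.
    writer.writerow([col[0] for col in cursor.description])
    total = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        writer.writerows(rows)
        total += len(rows)
    return total


# Stand-in source table (hypothetical name and columns).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE visits (id INTEGER, dept TEXT)")
source.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, "cardio"), (2, "ortho"), (3, "neuro")],
)

# Stand-in destination.
destination = io.StringIO()
copied = copy_table(source, "SELECT id, dept FROM visits ORDER BY id", destination)
print(copied)
print(destination.getvalue())
```

Reading in batches keeps memory use bounded regardless of table size, which is the same reason batch size is a tuning knob in a real pipeline.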
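To make the drift-alert idea concrete, here is a small Python sketch of a schema check that reports what changed in an incoming record. It is a simplified stand-in, not StreamSets' rule engine, and the expected schema and field names are assumptions chosen for the example.

```python
def detect_drift(expected: dict, record: dict) -> list:
    """Compare a record against the expected schema and describe any differences."""
    findings = []
    for name, expected_type in expected.items():
        if name not in record:
            findings.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            findings.append(
                f"type changed: {name} is {actual}, expected {expected_type.__name__}"
            )
    for name in record:
        if name not in expected:
            findings.append(f"new field: {name}")
    return findings


# Hypothetical expected schema for a source table.
EXPECTED = {"id": int, "dept": str}

print(detect_drift(EXPECTED, {"id": 1, "dept": "cardio"}))
print(detect_drift(EXPECTED, {"id": "1", "ward": "B"}))
```

An empty list means the record matches the schema; anything else could be routed to an alert instead of silently breaking the load.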
What needs improvement?
The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time. For example, if I am starting with StreamSets, everything is fine. However, if I want to dig into problems that my pipeline ran into, it initially takes some time to get familiar with it and understand it.
I feel the visualization part can be simplified or enhanced a bit, so I can easily see what happened with my job seven days earlier and how many records it transmitted.
For how long have I used the solution?
I have been using StreamSets for close to four and a half years when creating my data pipelines in our projects.
What do I think about the stability of the solution?
Stability-wise, it is wonderful and quite good. Mostly, since the solution is completely cloud-based in our project, we just need to hit a URL and then we are logged into StreamSets with our credentials. Everything is present there. Other than some rare occasions, StreamSets behaves pretty well.
There were certain memory leak issues for a few stages, like Azure Data Lake, but those were corrected with immediate solutions, like patches and version upgrades.
Stability-wise, I would rate it as eight and a half or nine out of 10.
What do I think about the scalability of the solution?
I would like auto scaling for heavy load transfer. This applied particularly when we were working on our data migration project. The tables had more than 10 million records in them. When we utilized StreamSets, it took a huge amount of time. We were doing schema generation and using ADLS as a destination, and it hung for a good amount of time. So, we considered PySpark processes for our tables that have more than 10 million records. It usually works pretty well when the source tables have close to five to six million records, but when the size is closer to 10 million, I personally feel the auto scaling feature could be improved.
How are customer service and support?
We have spent a good amount of time dealing with their technical support team. The first step is to check the documentation, then work with them.
I had a chance to work with StreamSets during our use case. They helped us a lot with a memory leak issue that we were facing in our production pipeline. Our pipelines were running fine in the lower environments, i.e., dev and QA, but when we moved those pipelines into production, we hit a memory leak issue where the JVM threw an out-of-memory exception.
We tried reducing the number of threads and the batch size for the small table, but it was still creating issues. Then, we connected with StreamSets' support team. They gave us a customized patch, which our platform team installed in our production environment. With some collaborative effort of around a week, we were finally able to run our pipeline pretty well.
I would rate the customer support and the technical support as quite good and knowledgeable (eight out of 10). They helped with issues that were occurring in our work. They acknowledged that the version of StreamSets we were using had some issues with memory management. The immediate solution that they provided was a patch, which our platform team installed. The long-term solution was to upgrade our StreamSets Data Collector platform from version 3.11 to 4.2, and that solved our problem.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We were using the Cloudera distribution. All our projects were running on Hadoop, and the distribution was Cloudera Hortonworks. We were utilizing Sqoop and Hive as well as PySpark or Scala-based processes to code. However, StreamSets helped us a lot in designing our data pipelines quickly.
It has made our job pretty easy in terms of designing, managing, and running our data engineering pipeline. Previously, if I needed to transfer data from source to destination, I would need to use Sqoop, which is a Hadoop stack technology used to establish connectivity with the RDBMS, then load it to the Hadoop distributed file system. With Sqoop, I needed to have my coding skills ready. I needed to be very precise about the connection details and syntax. I needed to be very aware of them. StreamSets solved this problem.
Its greatest feature is that it provides an easy way to design your pipeline. I just need to drag and drop source JDBC Query Consumer to my canvas as well as drag and drop my destination to the canvas. I then need to connect both these stages and be ready with my configuration details. As soon as I am done with that, I will validate the pipeline. I can create a job out of it and schedule it, even the monitoring. All these things can be achieved by a single control panel. So, it not only solves the developer's basic problems, but it also has greatly improved the experience.
We were previously using the Hadoop technology stack exclusively. Slowly, we started converting our processes into data engineering pipelines designed in StreamSets. Earlier, the problem area was writing Sqoop scripts to capture data from the source and put it into HDFS. Once data was in HDFS, we would write another PySpark process to optimize and speed up loading that data from the Hadoop Distributed File System into a cloud-based data lake, like ADLS or S3. When StreamSets came into the picture, we no longer needed an intermediary distributed file system like HDFS. We could simply create a pipeline that connects to the RDBMS and loads data directly into the cloud-based Azure Data Lake. Removing that intermediary step saves us a great amount of time and also helps us a lot in creating our data engineering pipelines.
Microsoft provided Change Data Capture tools, which one of our team members was using. Performance-wise, I personally feel StreamSets is way faster. A few of the support team members were using Informatica as well, but it does not provide powerful features that can handle big amounts of data.
How was the initial setup?
For our deployment model, we were following three environments: dev, QA and prod. Our team's main responsibility is to hydrate Azure Data Lake and GCP from the source system. Control Hub is hosted on GCP, and we were hitting the URL to log into StreamSets. All the Data Collector machines are created on Google Cloud Platform. Whenever we create a pipeline or do a PoC, we work in the dev environment. Once our pipelines and jobs are working fine, we move them to our QA environment through export and import. It is pretty easy to do via StreamSets Control Hub. We can simply select a job and export it, then log into the QA environment and import the job. When we import the job, we have an option to import the whole bundle, like the associated pipeline, parameters, and instances. We can import everything. Once this is also working fine, we move to the final environment, production, where jobs run based on the source refresh frequencies.
What about the implementation team?
In our company, we have a good data engineering team. We have a separate administrator team who is mainly responsible for deploying it on cloud, providing us libraries whenever required. There is a separate team who is taking care of all the installations and platform-related activities. We are primarily data engineers who utilize the product for solutions.
What was our ROI?
StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, we created Sqoop-based processes to move data from source to destination. That took approximately an hour to an hour and a half. With StreamSets, since it works on a Data Collector-based mechanism, the same process completes in 15 minutes. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate, including the drift part.
What's my experience with pricing, setup cost, and licensing?
StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub.
What other advice do I have?
For people who are starting out, the simple advice is to first try out the cloud login of StreamSets. It is freely available for everyone these days. StreamSets has released its online practice platform to design and create pipelines. Someone simply needs to go to cloud.login.streamsets.com, which is StreamSets' official website. There, people who are starting out can log into StreamSets cloud and spin up their StreamSets Data Collector machines. Then, they can choose their execution mode. It all runs in Docker containers. You don't need to do anything.
You simply need to have your laptop ready and step-by-step instructions are given. You just simply spin up your Data Collector, the execution mode, and then you are ready with the canvas. You can design your pipeline, practice, and test there. So, if you want to evaluate StreamSets in basic mode, you can take a look online. This is the easiest way to evaluate StreamSets.
It is a drag-and-drop, UI-based approach with a canvas, where you design the pipeline. It is pretty easy to follow. So, once your team feels confident, then they can purchase the StreamSets add-ons, which will provide them end-to-end solutions and vendor support. The best way is to log into their cloud practice platform and create some pipelines.
In my current project, there is a requirement to integrate with Snowflake, but I don't have Snowflake experience. I have not integrated Snowflake with StreamSets yet.
I personally love working on StreamSets. It is part of my day-to-day activities. I do a lot of work on StreamSets, so I would rate them pretty well as nine out of 10.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
IT Specialists at Soft Hostings
User-friendly interface and easy integration, but needs easier transformation logic and faster support
Pros and Cons
- "It's very easy to integrate. It integrates with Snowflake, AWS, Google Cloud, and Azure. It's very helpful for DevOps, DataOps, and data engineering because it provides a comprehensive solution, and it's not complicated."
- "The data collector in StreamSets has to be designed properly. For example, a simple database configuration with MySQL DB requires the MySQL Connector to be installed."
What is our primary use case?
We are sharing data between platforms. It's helping me to be independent of the ETL tools as well as have the data format without using any programming language.
How has it helped my organization?
It's helping us to be more organized. It's a tool that helps a lot in easily extracting data sets from CRM tools, and it can be integrated with external sources to make sure that you are having a good platform. It has improved our organization in the way we perform tests and the way we perform data transfers and streaming.
The data collection process is straightforward and easy. It allows us to move data into modern analytics platforms.
It allows us to build data pipelines without knowing how to code. It allows developers to make sure they are getting the correct data. It works for departments that can code and that can't code. It's a universal tool.
It's very effective. It gives you a clear understanding of the architecture of the data that you have in your company.
StreamSets’ data drift resilience saved us a lot of time. If we were taking seven days previously to build something, now it takes us three days. It has saved about 30% of the time.
It has helped to break down data silos within the organization. It helps to make sure that we are on time with data analysis. It brings efficiency. Overall, it has saved us about 25% of the time.
StreamSets’ reusable assets have helped to reduce workload. There is about a 25% workload reduction.
StreamSets saves us money by not having to hire people with specialized skills. It's saving us 300 USD every month.
StreamSets has helped to scale our data operations. In our business, we process the data the whole time, and we share it with the analytics team to identify and understand what needs to be fixed and what needs to be improved. It's good for our organization.
What is most valuable?
Its user interface is friendly. It's straightforward to implement batch, streaming, or ETL pipelines.
It's very easy to integrate. It integrates with Snowflake, AWS, Google Cloud, and Azure. It's very helpful for DevOps, DataOps, and data engineering because it provides a comprehensive solution, and it's not complicated.
What needs improvement?
When using Transformer for Snowflake, it's a bit complex to understand the transformation logic. You need someone who has some technical skills to handle it. You need to have some skills to transform the data. However, it's important that Transformer for Snowflake is a serverless engine embedded within the platform, so there is no need for maintenance. Having a serverless engine makes it easy for any enterprise to not think about or worry about the cost of maintaining the software.
The data collector in StreamSets has to be designed properly. For example, a simple database configuration with MySQL DB requires the MySQL Connector to be installed.
For how long have I used the solution?
I've been using StreamSets for three years.
What do I think about the stability of the solution?
It's very stable. It's very hard to find any downtime for the software.
What do I think about the scalability of the solution?
It's scalable enough. It integrates with AWS, Snowflake, Google Cloud, and Azure. It gives you a very good way to process and store your data.
We're using it in multiple departments in the same location. It's being used by the analytics team and our senior developers. There are about 10 people using this solution.
How are customer service and support?
They take a long time to respond to queries, but they are good people. They should improve the time to respond to queries. I'd rate them a six out of ten.
How would you rate customer service and support?
Neutral
Which solution did I use previously and why did I switch?
I didn't use any other solution previously.
How was the initial setup?
Deploying StreamSets is not so complex. It's easy. It takes about three days.
It doesn't require any maintenance from our side.
What about the implementation team?
We have an in-house team of five people.
What was our ROI?
We have seen an ROI. We use data analytics in marketing, and knowing where we need to market and where we need to improve increases our success rate. We have seen about 30% ROI.
What's my experience with pricing, setup cost, and licensing?
It's not expensive because you pay per month, and the tasks you can perform with it are huge. It's reliable and cost-effective.
What other advice do I have?
It's a very good tool if you need to access data from a CRM system, Salesforce, etc. However, it can't be used as an end-to-end integration tool because it lacks certain functionality. It could also be very expensive for small enterprises.
Overall, I'd rate it a seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Software Engineer at ZIDIYO
Enables us to create streams and pipelines that our analytics team can utilize to identify areas for improvement
Pros and Cons
- "The UI is user-friendly, it doesn't require any technical know-how and we can navigate to social media or use it more easily."
- "Using ETL pipelines is a bit complicated and requires some technical aid."
What is our primary use case?
We use StreamSets to create data pipelines and to make sure that we know the exact analytics of our data usage within our company.
How has it helped my organization?
We use StreamSets' ability to connect to enterprise data stores such as Kafka. It is easy and simple to connect enterprise data stores as long as we follow the documentation.
We use StreamSets' ability to move data into the analytic platforms easily because we can use the template provided to extract data from the pipeline.
Being able to use Transformer for Snowflake to design both simple and complex transformation logic is important because it helps us break down a large amount of data into a form that the analytics team can understand and use to identify areas of improvement. As Transformer for Snowflake operates as a serverless engine, we can reduce our costs because we no longer need to purchase servers.
StreamSets enables us to create streams and pipelines that our analytics team can utilize to identify areas for improvement. Additionally, our marketing team can leverage the data generated from these reports to understand how we can integrate our products and services to benefit our brand.
StreamSets' data drift resilience is effective and user-friendly. We can use templates or use them from scratch. Data drift resilience saves us around 35 percent of the time fixing duplicates.
StreamSets has helped us break down data silos within our organization by providing a clear path forward and enhancing our productivity by breaking down a large amount of data that we can understand.
StreamSets saved us around 40 percent of our time.
With StreamSets, a small team can create data pipelines that would normally require an expert who costs around $500 per month.
StreamSets helps us scale our operations because we understand the quality of the data we have and how we can integrate the data into our marketing needs.
What is most valuable?
The UI is user-friendly, it doesn't require any technical know-how and we can navigate to social media or use it more easily.
What needs improvement?
Using ETL pipelines is a bit complicated and requires some technical aid.
The Transformer for Snowflake functionality is complex and requires a lot of logic.
For how long have I used the solution?
I have been using the solution for three years.
What do I think about the stability of the solution?
The solution is stable with no issues.
What do I think about the scalability of the solution?
The solution is scalable.
How are customer service and support?
The technical support team takes over eight hours to respond to our requests.
How would you rate customer service and support?
Neutral
How was the initial setup?
The initial setup is straightforward. I deployed the solution myself.
What about the implementation team?
The implementation was completed in-house.
What was our ROI?
StreamSets helps us increase our sales by 45 percent.
What's my experience with pricing, setup cost, and licensing?
StreamSets is expensive, especially for small businesses.
What other advice do I have?
I give the solution a nine out of ten.
The solution does not require maintenance from our end.
We have deployed StreamSets across our engineering team, data analytics team, and software development team.
StreamSets is an excellent solution for organizations that have a budget. The solution allows for various streaming capabilities and seamless integration with customer messaging, all within one environment. I highly recommend StreamSets.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Chief software engineer at Appnomu Business Services
Enables us to build data pipelines without knowing how to code and helped us break down data silos within our organization
Pros and Cons
- "The best feature that I really like is the integration."
- "Visualization and monitoring need to be improved and refined."
What is our primary use case?
In our department, we use StreamSets to design data pipelines that load data from various RDBMS sources to the cloud, such as Azure. We also use the data set for data analysts to generate panels for our organization, as well as for real-time use cases for monitoring and consuming other streaming data. Additionally, we are able to customize StreamSets to suit our needs and budget.
How has it helped my organization?
Using StreamSets to create pipelines for batch streaming or ETL is easy and straightforward. However, if one is new to StreamSets, it may not be so simple and may require a lot of documentation for assistance.
We utilize StreamSets' ability to connect to enterprise data stores, making it easy to begin trading instantly without needing to be technically skilled. We use StreamSets to move data into analytics platforms. In my experience, it is initially quite easy to move data back if we have a clear understanding of data transit, importation, and exporting from external sources.
This solution enables us to build data pipelines without knowing how to code. The solution includes templates that guide us and help us customize our data easily. It is essential that StreamSets does not necessitate coding, as this saves a considerable amount of time that would otherwise be spent writing code, as well as resources that would be required to hire experts.
Transformer for Snowflake can help with both simple and complex transformation logic. For example, creating a plan to perform ETL and machine learning operations is easy and fast. However, if the same operations are performed on-site, it can be difficult to troubleshoot events due to limited visibility into the results. StreamSets' Transformer for Snowflake is important to us because it saves us a lot of time and enables us to complete a task remotely with only two or three people.
It is important that Transformer for Snowflake is a serverless engine embedded within the platform. We have the capability of creating a data operations platform, so we don't have to worry or even be aware of what we are doing at the moment. We can simply create a device and use it in the pipeline we want it to be in.
The solution improved the way we work, benefiting both our customers and our development and retainer teams. StreamSets helps us develop a platform manually, with a lot of teamwork, either remotely or on-site, depending on which option we use. This has had a significant impact on our organization in terms of how we process and transform data.
I would say that it is very easy for us to update the template so that we can have real, actual data in APL claims and in the supply chain. StreamSets' data drift resilience is very effective and can run in the data grid. The data drift resilience has reduced the time it takes us to fix data drift breakages by approximately 25 percent.
StreamSets helped us break down data silos within our organization. Breaking down data silos helps us gain quick insights. In general, it is a great feature that ensures we have activities or processes in place. We know precisely what to prevent and what to implement.
StreamSets saved us around 30 percent of our time, meaning that a task that would take five hours to complete manually can now be done in around three and a half hours.
The reusable assets are reducing workload by 35 percent by allowing different people to use a single platform or resource, regardless of whether they have a similar SKU or a different SKU. This feature can help an organization simplify, implement, and transmit more easily.
It is not only the cost of one packet that we paid for, but now we are implementing a strategy using different people within the company. It would be very expensive if we had to hire a new person to manage that task and it would also take a lot of time. StreamSets is not only saving us money, but it is also ensuring that we complete strategies on time.
StreamSets also helped us scale our operations, which has had a significant impact on our business. We now have a better understanding of how to secure data and provide reliable security for the transmission of data from internal servers to external services, as well as how to meet our clients' application needs.
What is most valuable?
The best feature that I really like is the integration. The software can be integrated with Azure Key Vault or AWS Secrets Manager. I also like the scheduling. It is very easy to schedule an event through StreamSets, much easier than I expected. The solution is also fast at determining pipelines. Additionally, I like that StreamSets has many components, such as sources, processors, executors, and other useful elements that I need to plan.
What needs improvement?
There should be a concept of creating double variables because it's still missing.
The loading machine mechanism needs to be simplified. Currently, it takes some time to get familiar with and understand that.
Visualization and monitoring need to be improved and refined. For example, it is difficult to monitor a job to see what happened in the past seven days when a transfer occurred.
The licensing model also has room for improvement. The solution is currently expensive.
For how long have I used the solution?
I have been using the solution for five years.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
The solution is scalable. We currently have four people using StreamSets in our organization.
How are customer service and support?
The technical support is good and they prioritize issues based on their severity, so sometimes we have to wait a while for a response.
How would you rate customer service and support?
Neutral
How was the initial setup?
The initial setup is a bit complex for first-time users. There is a lot of documentation that needs to be reviewed before deploying. The deployment takes around one month.
What about the implementation team?
The implementation is completed in-house.
What was our ROI?
StreamSets simplified our data ingestion and integration process without the need for the large financial investment that would be required if we were to use other, cheaper solutions. This is due to StreamSets' security and safety in supporting various heterogeneous sources such as RDBMS and Salesforce. StreamSets ensures that we have a secure and easy way to launch any integration tool, resulting in increased profits. StreamSets is very stable, secure, and compliant, and has yielded a return on investment of around 30 percent.
What's my experience with pricing, setup cost, and licensing?
I believe the pricing is not equitable. Different businesses operate in various models and ways, so I wish StreamSets would be able to adjust their pricing depending on the intended use of the software. This would be beneficial to businesses with limited budgets. Currently, the cost of StreamSets is the same regardless of the amount of backup, which is costly.
What other advice do I have?
I give the solution an eight out of ten. StreamSets still needs to improve the monitoring and visualization before the solution can be a ten out of ten.
Since StreamSets is deployed in the cloud, we don't have any maintenance requirements or costs.
I highly recommend StreamSets; it is an excellent tool with both batch and streaming capabilities. StreamSets is a great option for anyone to try, though it does require an organization to have the budget to use it.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Other
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Buyer's Guide
Download our free StreamSets Report and get advice and tips from experienced pros
sharing their opinions.
Updated: October 2024