
Azure Data Factory vs StreamSets comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Dec 19, 2024


Categories and Ranking

Azure Data Factory
Ranking in Data Integration: 1st
Average Rating: 8.0
Reviews Sentiment: 6.9
Number of Reviews: 89
Ranking in other categories: Cloud Data Warehouse (3rd)

StreamSets
Ranking in Data Integration: 9th
Average Rating: 8.4
Reviews Sentiment: 7.1
Number of Reviews: 21
Ranking in other categories: None
 

Mindshare comparison

As of January 2025, in the Data Integration category, the mindshare of Azure Data Factory is 10.8%, down from 13.4% the previous year. The mindshare of StreamSets is 1.9%, up from 1.4% the previous year. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Joy Maitra - PeerSpot reviewer
Facilitates seamless data pipeline creation with good analytics and thorough monitoring
Azure Data Factory is a low-code, no-code platform, which is helpful. It provides many prebuilt functionalities that assist in building data pipelines. It also facilitates easy transformation, with all the functionality required for analytics. Furthermore, it connects to different sources out of the box, making integration much easier. The monitoring is very thorough, though a more readable version would be appreciated.
Reyansh Kumar - PeerSpot reviewer
We no longer need to hire highly skilled data engineers to create and monitor data pipelines
The things I like about StreamSets are its overall user interface, efficiency, and product features, which are all good. Also, the scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy. You just need to configure the data sources, the paths and their configurations, and you are ready to go. It is very efficient and very easy to use for ETL pipelines. It is a GUI-based interface in which you can easily create or design your own data pipelines with just a few clicks. As for moving data into modern analytics systems, we are using it with Microsoft Power BI, AWS, and some on-premises solutions, and it is very easy to get data from StreamSets into them. No hardcore coding or special technical expertise is required. It is also a no-code platform in which you can configure your data sources and data output for easy configuration of your data pipeline. This is a very important aspect because if a tool requires code development, we need to hire software developers to get the task done. By using StreamSets, it can be done with a few clicks.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

Azure Data Factory
"From my experience so far, the best feature is the ability to copy data to any environment. We have 100 connects and we can connect them to the system and copy the data from its respective system to any environment. That is the best feature."
"The most valuable features of Azure Data Factory are the flexibility, ability to move data at scale, and the integrations with different Azure components."
"What I like best about Azure Data Factory is that it allows you to create pipelines, specifically ETL pipelines. I also like that Azure Data Factory has connectors and solves most of my company's problems."
"This solution has provided us with an easier, and more efficient way to carry out data migration tasks."
"The user interface is very good. It makes me feel very comfortable when I am using the tool."
"Data Factory's best features are simplicity and flexibility."
"It's cloud-based, allowing multiple users to easily access the solution from the office or remote locations. I like that we can set up the security protocols for IP addresses, like allow lists. It's a pretty user-friendly product as well. The interface and build environment where you create pipelines are easy to use. It's straightforward to manage the digital transformation pipelines we build."
"It is easy to deploy workflows and schedule jobs."

StreamSets
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"The most valuable features are the option of integration with a variety of protocols, languages, and origins."
"StreamSets Transformer is a good feature because it helps you when you are developing applications and when you don't want to write a lot of code. That is the best feature overall."
"The ease of configuration for pipes is amazing. It has a lot of connectors. Mainly, we can do everything with the data in the pipe. I really like the graphical interface too."
"The scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy."
"What I love the most is that StreamSets is very light. It's a containerized application. It's easy to use with Docker. If you are a large organization, it's very easy to use Kubernetes."
"I really appreciate the numerous ready connectors available on both the source and target sides, the support for various media file formats, and the ease of configuring and managing pipelines centrally."
"The most valuable feature is the pipelines because they enable us to pull in and push out data from different sources and to manipulate and clean things up within them."
 

Cons

Azure Data Factory
"On the UI side, they could make it a little more intuitive in terms of how to add the various components. Somebody who has been working with tools like Informatica or DataStage gets very used to how the UI looks and feels."
"Some prebuilt data source or data connection aspects are generic."
"While it has a range of connectors for various systems, such as ERP systems, the support for these connectors can be lacking."
"It's a good idea to take a Microsoft course, because they are really helpful when you start your journey with Data Factory."
"Azure Data Factory should make it cheaper to move data to a data center abroad in case of disasters."
"The product could provide more ways to import and export data."
"The pricing model should be more transparent and available online."
"The thing we missed most was data update, but this is now available as of two weeks ago."

StreamSets
"One area for improvement could be the cloud storage server speed, as we have faced some latency issues here and there."
"Using ETL pipelines is a bit complicated and requires some technical aid."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingest the information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"The software is very good overall. Areas for improvement are the error logging and the version history. I would like to see better, more detailed error logging information."
"In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time."
"I would like to see it integrate with other kinds of platforms, other than Java. We're going to have a lot of applications using .NET and other languages or frameworks. StreamSets is very helpful for the old Java platform but it's hard to integrate with the other platforms and frameworks."
"StreamSets should provide a mechanism to be able to perform data quality assessment when the data is being moved from one source to the target."
 

Pricing and Cost Advice

Azure Data Factory
"I rate the product price as six on a scale of one to ten, where one is low price and ten is high price."
"Our licensing fees are approximately 15,000 ($150 USD) per month."
"Understanding the pricing model for Data Factory is quite complex."
"There's no licensing for Azure Data Factory; they have a consumption payment model based on how often you run the service and how long the service takes to run. The price can be approximately $500 to $1,000 per month, but it depends on the scaling."
"The solution's fees are based on pay-per-minute use plus the amount of data to be processed."
"The price is fair."
"ADF is cheaper compared to AWS."
"Pricing is comparable; it's somewhere in the middle."

StreamSets
"The licensing is expensive, and there are other costs involved too. I know from using the software that you have to buy new features whenever there are new updates, which I don't really like. But initially, it was very good."
"StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
"The pricing is too fixed. It should be based on how much data you need to process. Some businesses are not so big that they process a lot of data."
"We are running the community version right now, which can be used free of charge."
"We use the free version. It's great for a public, free release. Our stance is that the paid support model is too expensive to get into. They should honestly reevaluate that."
"The overall cost is very flexible so it is not a burden for our organization... However, the cost should be improved. For small and mid-size organizations it might be a challenge."
"The pricing is affordable for any business."
"It has a CPU core-based licensing, which works for us and is quite good."
831,265 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Azure Data Factory
Financial Services Firm: 13%
Computer Software Company: 12%
Manufacturing Company: 9%
Healthcare Company: 7%

StreamSets
Financial Services Firm: 18%
Computer Software Company: 11%
Manufacturing Company: 11%
Insurance Company: 7%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business (chart not available)
 

Questions from the Community

How do you select the right cloud ETL tool?
AWS Glue and Azure Data Factory are the best-performing cloud services for ELT.
How does Azure Data Factory compare with Informatica PowerCenter?
Azure Data Factory is flexible, modular, and works well. In terms of cost, it is not too pricey. It offers the stability and reliability I am looking for, good scalability, and is easy to set up an...
How does Azure Data Factory compare with Informatica Cloud Data Integration?
Azure Data Factory is a solid product offering many transformation functions. It has pre-load and post-load transformations, allowing users to apply transformations either in code by using Power Q...
What do you like most about StreamSets?
The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customiz...
What needs improvement with StreamSets?
We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which ...
What is your primary use case for StreamSets?
StreamSets is used for data transformation rather than ETL processes. It focuses on transforming data directly from sources without handling the extraction part of the process. The transformed data...
 


Sample Customers

Azure Data Factory: Adobe, BMW, Coca-Cola, General Electric, Johnson & Johnson, LinkedIn, Mastercard, Nestle, Pfizer, Samsung, Siemens, Toyota, Unilever, Verizon, Walmart, Accenture, American Express, AT&T, Bank of America, Cisco, Deloitte, ExxonMobil, Ford, General Motors, IBM, JPMorgan Chase, Microsoft (Azure Data Factory is developed by Microsoft), Oracle, Procter & Gamble, Salesforce, Shell, Visa
StreamSets: Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Find out what your peers are saying about Azure Data Factory vs. StreamSets and other solutions. Updated: January 2025.