We performed a comparison between Azure Data Factory and StreamSets based on real PeerSpot user reviews.
Find out in this report how the two Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The most valuable features are data transformations."
"The security of the agent that is installed on-premises is very good."
"This solution has provided us with an easier, and more efficient way to carry out data migration tasks."
"I like how you can create your own pipeline in your space and reuse those creations. You can collaborate with other people who want to use your code."
"The tool's most valuable features are its connectors. It has many out-of-the-box connectors. We use ADF for ETL processes. Our main use case involves integrating data from various databases, processing it, and loading it into the target database. ADF plays a crucial role in orchestrating these ETL workflows."
"One of the most valuable features of Azure Data Factory is the drag-and-drop interface. This helps with workflow management because we can just drag any tables or data sources we need. Because of how easy it is to drag and drop, we can deliver things very quickly. It's more customizable through visual effect."
"The workflow automation features in GitLab, particularly its low code/no code approach, are highly beneficial for accelerating development speed. This feature allows for quick creation of pipelines and offers customization options for integration needs, making it versatile for various use cases. GitLab supports a wide range of connectors, catering to a majority of integration needs. Azure Data Factory's virtual enterprise and monitoring capabilities, the visual interface of GitLab makes it user-friendly and easy to teach, facilitating adoption within teams. While the monitoring capabilities are sufficient out of the box, they may not be as comprehensive as dedicated enterprise monitoring tools. GitLab's monitoring features are manageable for production use, with the option to integrate log analytics or create custom dashboards if needed. The data flow feature in Azure Data Factory within GitLab is valuable for data transformation tasks, especially for those who may not have expertise in writing complex code. It simplifies the process of data manipulation and is particularly useful for individuals unfamiliar with Spark coding. While there could be improvements for more flexibility, overall, the data flow feature effectively accomplishes its purpose within GitLab's ecosystem."
"The data factory agent is quite good and programming or defining the value of jobs, processes, and activities is easy."
"The most valuable features are the option of integration with a variety of protocols, languages, and origins."
"In StreamSets, everything is in one place."
"The scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy."
"The ability to have a good bifurcation rate and fewer mistakes is valuable."
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"StreamSets Transformer is a good feature because it helps you when you are developing applications and when you don't want to write a lot of code. That is the best feature overall."
"It's very easy to integrate. It integrates with Snowflake, AWS, Google Cloud, and Azure. It's very helpful for DevOps, DataOps, and data engineering because it provides a comprehensive solution, and it's not complicated."
"For me, the most valuable features in StreamSets have to be the Data Collector and Control Hub, but especially the Data Collector. That feature is very elegant and seamlessly works with numerous source systems."
"The support and the documentation can be improved."
"Azure Data Factory should be cheaper to move data to a data center abroad for calamities in case of disasters."
"This solution is currently only useful for basic data movement and file extractions, which we would like to see developed to handle more complex data transformations."
"Areas for improvement in Azure Data Factory include connectivity and integration. When you use integration runtime, whenever there's a failure, the backup process in Azure Data Factory takes time, so this is another area for improvement."
"We require Azure Data Factory to be able to connect to Google Analytics."
"There's no Oracle connector if you want to do transformation using data flow activity, so Azure Data Factory needs more connectors for data flow transformation."
"You cannot use a custom data delimiter, which means that you have problems receiving data in certain formats."
"The pricing model should be more transparent and available online."
"The documentation is inadequate and has room for improvement because the technical support does not regularly update their documentation or the knowledge base."
"Sometimes, it is not clear at first how to set up nodes. A site with an explanation of how each node works would be very helpful."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"I would like to see it integrate with other kinds of platforms, other than Java. We're going to have a lot of applications using .NET and other languages or frameworks. StreamSets is very helpful for the old Java platform but it's hard to integrate with the other platforms and frameworks."
"I would like to see further improvement in the UI. In addition, upgrades are not automatic and they should be automated. Currently, we have to manually upgrade versions."
"Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"Visualization and monitoring need to be improved and refined."
Azure Data Factory is ranked 1st in Data Integration with 81 reviews while StreamSets is ranked 8th in Data Integration with 24 reviews. Azure Data Factory is rated 8.0, while StreamSets is rated 8.4. The top reviewer of Azure Data Factory writes "The data factory agent is quite good but pricing needs to be more transparent". On the other hand, the top reviewer of StreamSets writes "We no longer need to hire highly skilled data engineers to create and monitor data pipelines". Azure Data Factory is most compared with Informatica PowerCenter, Informatica Cloud Data Integration, Alteryx Designer, Snowflake and IBM InfoSphere DataStage, whereas StreamSets is most compared with Fivetran, Informatica PowerCenter, SSIS, IBM InfoSphere DataStage and webMethods.io Integration.
We monitor all Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.