Matillion ETL vs StreamSets comparison

 

Comparison Buyer's Guide

Categories and Ranking

Matillion ETL
Average Rating: 8.4
Number of Reviews: 26
Ranking in other categories: Cloud Data Integration (5th)

StreamSets
Average Rating: 8.4
Reviews Sentiment: 7.5
Number of Reviews: 24
Ranking in other categories: Data Integration (9th)
 

Featured Reviews

AntonHaupt - PeerSpot reviewer
Efficient data integration and transformation with seamless cloud-native integration
In our small business unit, we currently have around four users, with two of them utilizing Matillion within our organization. Considering our growing needs, we're contemplating transitioning to an enterprise SaaS solution where we would share the same instance. Currently, each user is billed individually, but consolidating to a shared instance seems more efficient. Scalability is excellent when using the SaaS solution, easily reaching a rating of ten out of ten. Each data pipeline request is encapsulated within a Docker container and spun off, allowing for instant scalability. Overall, I would rate it a nine out of ten in terms of performance and scalability.
Reyansh Kumar - PeerSpot reviewer
We no longer need to hire highly skilled data engineers to create and monitor data pipelines
The things I like about StreamSets are its overall user interface, efficiency, and product features, which are all good. Also, the scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy. You just need to configure the data sources, the paths and their configurations, and you are ready to go. It is very efficient and very easy to use for ETL pipelines. It is a GUI-based interface in which you can easily create or design your own data pipelines with just a few clicks. As for moving data into modern analytics systems, we are using it with Microsoft Power BI, AWS, and some on-premises solutions, and it is very easy to get data from StreamSets into them. No hardcore coding or special technical expertise is required. It is also a no-code platform in which you can configure your data sources and data output for easy configuration of your data pipeline. This is a very important aspect because if a tool requires code development, we need to hire software developers to get the task done. By using StreamSets, it can be done with a few clicks.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The new version with the Productivity Cloud is very simple. It's easy to use, navigate, and understand."
"The tool's middle-dimensional structure significantly simplifies obtaining the right data at the appropriate level. This feature makes deploying our applications easier since we utilize a single source without publishing data from various sources."
"The product is quite stable and can handle complex data integration tasks well."
"The most valuable feature of Matillion ETL is the ETL. The solution is open-source which provides advantages, such as good performance and high efficiency. Additionally, it supports three data types which eliminates predefining the data, and we can write script models in Python."
"It takes less than five minutes to set up and delivers results. It is much quicker than traditional ETL technologies."
"The most valuable feature of Matillion ETL is its ease of use. If you have had some experience with other solutions, such as Snowflake, the use of this solution will be simple."
"The loading of data is the most valuable feature of Matillion ETL."
"The simplicity of this tool is nice. It has a good graphical user interface. You can also do a lot of generic stuff in the tool. There is good connectivity to a cloud database, such as Snowflake, and you can have a lot of Snowflake functionality in the tool."
"The ease of configuration for pipes is amazing. It has a lot of connectors. Mainly, we can do everything with the data in the pipe. I really like the graphical interface too."
"StreamSets Transformer is a good feature because it helps you when you are developing applications and when you don't want to write a lot of code. That is the best feature overall."
"I really appreciate the numerous ready connectors available on both the source and target sides, the support for various media file formats, and the ease of configuring and managing pipelines centrally."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"Important features include that it comprises lots of functionality to connect data from various sources through connector availability, scheduling pipelines at any time, and integration with third-party and security solutions for encryption."
"The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows."
"The most valuable features are the option of integration with a variety of protocols, languages, and origins."
"The most valuable feature is the pipelines because they enable us to pull in and push out data from different sources and to manipulate and clean things up within them."
 

Cons

"I am looking forward to seeing the expansion of the source range for their data loader product."
"One of the features that's in development is data privacy in the cloud, along with further SAP integration for connectivity to SAP systems."
"Matillion’s on-premises capabilities don’t allow you to build something customized."
"There are certain functions that are available in other ETL tools which are still not present in Matillion ETL. It would be good to have more features."
"The current version is a bit more limited because it's on a virtual machine, and everything executes on that one virtual machine."
"While the UI is good, it could be improved in its efficiency and made easier to use."
"Ideally, I would like it to integrate with Secrets Manager as well as the AWS."
"Our main challenge currently is that Matillion runs on an EC2 instance, limiting us to running only two processes simultaneously at the entry level."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"I would like to see further improvement in the UI. In addition, upgrades are not automatic and they should be automated. Currently, we have to manually upgrade versions."
"The execution engine could be improved. When I was at their session, they were using some obscure platform to run. There is a controller, which controls what happens on that, but you should be able to easily do this at any of the cloud services, such as Google Cloud. You shouldn't have any issues in terms of how to run it with their online development platform or design platform, basically their execution engine. There are issues with that."
"One area for improvement could be the cloud storage server speed, as we have faced some latency issues here and there."
"The data collector in StreamSets has to be designed properly. For example, a simple database configuration with MySQL DB requires the MySQL Connector to be installed."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingest the information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"Using ETL pipelines is a bit complicated and requires some technical aid."
 

Pricing and Cost Advice

"Matillion ETL is expensive."
"The cost of the solution is high and could be reduced."
"The product must improve its pricing."
"The price of Matillion ETL is reasonable."
"It was procured through the AWS Marketplace because it keeps things simple. They offer retail-like checkout and bill through your existing Amazon Web Services account."
"Its price depends on what you expect. You pay on a monthly basis, but there is a possibility to have special contracts depending on the installation."
"The AWS pricing and licensing are a cost-effective solution for data integration needs."
"It is not necessarily a cheap solution. However, it's reasonably priced, especially with the smaller machines that we run it on."
"StreamSets is an expensive solution."
"We are running the community version right now, which can be used free of charge."
"The licensing is expensive, and there are other costs involved too. I know from using the software that you have to buy new features whenever there are new updates, which I don't really like. But initially, it was very good."
"The pricing is affordable for any business."
"The overall cost is very flexible so it is not a burden for our organization... However, the cost should be improved. For small and mid-size organizations it might be a challenge."
"It's not so favorable for small companies."
"I believe the pricing is not equitable."
"StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
 

Top Industries

By visitors reading reviews
Matillion ETL
Financial Services Firm: 16%
Computer Software Company: 14%
Manufacturing Company: 9%
Government: 6%

StreamSets
Financial Services Firm: 17%
Computer Software Company: 13%
Manufacturing Company: 8%
Insurance Company: 6%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What do you like most about Matillion ETL?
The new version with the Productivity Cloud is very simple. It's easy to use, navigate, and understand.
What is your experience regarding pricing and costs for Matillion ETL?
The solution's pricing is not based on the licensing cost but on the running hours when the Matillion instance is up and running. Its pricing model is different from the traditional pricing models ...
What needs improvement with Matillion ETL?
Depending on the use case, the solution's pricing could be improved. Matillion ETL should include more enhanced capabilities for extracting data from the SAP systems.
What do you like most about StreamSets?
The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customiz...
What needs improvement with StreamSets?
We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which ...
What is your primary use case for StreamSets?
StreamSets is used for data transformation rather than ETL processes. It focuses on transforming data directly from sources without handling the extraction part of the process. The transformed data...
 

Also Known As

Matillion ETL: Matillion ETL for Redshift, Matillion ETL for Snowflake, Matillion ETL for BigQuery
StreamSets: No data available
 

Sample Customers

Matillion ETL: Thrive Market, MarketBot, PWC, Axtria, Field Nation, GE, Superdry, Quantcast, Lightbox, EDF Energy, Finn Air, IPRO, Twist, Penn National Gaming Inc
StreamSets: Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Find out what your peers are saying about Matillion ETL vs. StreamSets and other solutions. Updated: October 2024.
816,406 professionals have used our research since 2012.