What is our primary use case?
It is a good tool for us. All of our company's implementations are built on AWS Glue, and we use it to run all of our ETL processes. So far we have collected roughly five terabytes of information from the internet. We process all of this data on our cloud platform and normalize it. We first land the data in a data lake that we host on AWS. After that, we use AWS Glue to transform the collected information and load the normalized result into a data warehouse.
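The flow described above (raw data in the lake, a transform step, normalized rows in the warehouse) can be sketched roughly as follows. This is a minimal plain-Python illustration, not the actual Glue job: the field names (`url`, `title`) and the functions `normalize_record` and `run_etl` are hypothetical, and in-memory lists stand in for the data lake and the data warehouse.

```python
from typing import Optional


def normalize_record(raw: dict) -> Optional[dict]:
    """Normalize one raw collected record; return None if it is unusable."""
    url = (raw.get("url") or "").strip().lower()
    if not url:
        return None  # drop records that have no source URL
    # Collapse runs of whitespace in the title into single spaces.
    title = " ".join((raw.get("title") or "").split())
    return {"url": url, "title": title}


def run_etl(raw_records):
    """Extract, transform, load: returns the rows destined for the warehouse."""
    seen = set()
    warehouse_rows = []
    for raw in raw_records:
        row = normalize_record(raw)
        if row is None or row["url"] in seen:
            continue  # skip unusable records and duplicates
        seen.add(row["url"])
        warehouse_rows.append(row)
    return warehouse_rows


# Hypothetical sample standing in for files in the data lake.
lake = [
    {"url": " HTTPS://Example.com/a ", "title": "  Hello   world "},
    {"url": "https://example.com/a", "title": "Hello world"},
    {"url": None, "title": "no url"},
]
rows = run_etl(lake)
```

In a real Glue job the same normalize-and-deduplicate logic would typically run on Spark against data stored in S3 rather than on Python lists.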
How has it helped my organization?
It has reduced the time needed to implement a new ETL process by 30%. We have also seen a big improvement in the data science area.
What is most valuable?
The ease of integration with S3 and the ability to use Jupyter Notebook inside the pipeline are the most valuable features.
What needs improvement?
The crucial problem with AWS Glue is that it only works with AWS. It is not a vendor-agnostic tool like Pentaho. With PowerCenter, we can work with Google and other vendors, but with AWS Glue, we can only use AWS.
For how long have I used the solution?
I have been using this solution for two years.
What do I think about the stability of the solution?
In terms of stability, we had some problems in the past, but it is okay now. AWS provides an SLA, and the integration between the tools is good.
What do I think about the scalability of the solution?
Scalability is a very strong point of this solution compared to other solutions like PowerCenter and Pentaho. With Pentaho, you need to install a lot of machines, but with AWS Glue, you just need to work out how many instances you need. You put this information in a form and click okay, and your processes are scaled.
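The same choice that is made in the console form can also be expressed in code. The sketch below only builds the parameters that would be passed to the boto3 Glue client's `create_job` call; it does not contact AWS. The helper `build_glue_job_params`, the job name, the role ARN, the script path, and the chosen Glue version and worker type are all illustrative assumptions.

```python
def build_glue_job_params(name, role_arn, script_s3_path, workers, worker_type="G.1X"):
    """Build keyword arguments for a Glue Spark job with a given worker count."""
    if workers < 1:
        raise ValueError("workers must be a positive integer")
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_s3_path,
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",  # assumed version, adjust as needed
        "WorkerType": worker_type,
        "NumberOfWorkers": workers,  # the scaling knob described above
    }


# Hypothetical values, for illustration only.
params = build_glue_job_params(
    name="normalize-web-data",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_s3_path="s3://example-bucket/scripts/normalize.py",
    workers=10,
)

# With AWS credentials configured, the job could then be created with:
#   import boto3
#   boto3.client("glue").create_job(**params)
```

Scaling the job up or down then means changing the `workers` argument rather than installing machines.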
We have 35 users of this solution, and they are engineers, DevOps, and data scientists. We have a lot of plans to increase the usage of AWS Glue in 2021.
How are customer service and technical support?
In the first year of using it, we had a lot of problems with the solution. Our team found about five bugs, if I remember correctly. Our experience with AWS support was very good. The team in the US helped us resolve the problems and fix the bugs. We are AWS partners.
Which solution did I use previously and why did I switch?
Before AWS Glue, we worked with Talend, PowerCenter, and Pentaho. In the case of PowerCenter, the biggest problem for us was the plugins because they were too expensive. That was the negative point of PowerCenter.
In the case of Talend, the problem was that in Brazil, we didn't have professionals with the skills to work with it. In addition, we had to use the command-line interface, which was a terrible thing because it took more time than other solutions.
In the case of Pentaho, we had the same problem as Talend. We didn't have a lot of professionals. Of course, we have some courses to train people in Pentaho. We work with the biggest companies in Brazil, and we need professionals every day, but we don't have professionals with experience in Pentaho.
How was the initial setup?
The initial setup process is very easy. You just fill in some information in the forms and click a few buttons, and it is complete. Provisioning new infrastructure with AWS Glue takes from 10 minutes to an hour.
What about the implementation team?
We have all the professionals inside the company.
What's my experience with pricing, setup cost, and licensing?
Its price is good. We pay as we go, based on usage, which is a good thing for us because it makes the cost of the tool simple to forecast. It also helps the company's financial planning, and it is simple for our clients.
In my opinion, it is one of the best tools on the market for ETL processes because you pay only for what you use, which separates it from other big tools such as PowerCenter, Pentaho Data Integration, and Talend.
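To illustrate why usage-based pricing is simple to forecast, here is a small sketch of the arithmetic. It assumes billing is proportional to DPU-hours; the price of 0.44 USD per DPU-hour and the workload figures are assumptions for illustration only, so check the current AWS pricing for your region. The function `estimate_monthly_cost` is hypothetical.

```python
ASSUMED_PRICE_PER_DPU_HOUR = 0.44  # USD, assumption; verify against current AWS pricing


def estimate_monthly_cost(dpus, hours_per_run, runs_per_month,
                          price_per_dpu_hour=ASSUMED_PRICE_PER_DPU_HOUR):
    """Estimate monthly cost as DPUs x hours per run x runs x price per DPU-hour."""
    dpu_hours = dpus * hours_per_run * runs_per_month
    return round(dpu_hours * price_per_dpu_hour, 2)


# Hypothetical workload: 10 DPUs, a 30-minute job, run once a day for 30 days.
monthly_cost = estimate_monthly_cost(dpus=10, hours_per_run=0.5, runs_per_month=30)
print(f"Estimated monthly cost: {monthly_cost} USD")
```

Because the estimate is a simple product, doubling the number of runs or the number of DPUs doubles the forecast.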
What other advice do I have?
I would rate AWS Glue a seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner