Sr Associate at Cognizant
A stable and easy-to-use solution that can be used for data analytics
Pros and Cons
- "AWS Glue is a stable and easy-to-use solution."
- "The solution’s stability could be improved."
What is our primary use case?
We use AWS Glue for data analytics.
What is most valuable?
AWS Glue is a stable and easy-to-use solution.
What needs improvement?
The solution’s stability could be improved.
For how long have I used the solution?
I have been using AWS Glue for the last three years.
Buyer's Guide
AWS Glue
January 2025
Learn what your peers think about AWS Glue. Get advice and tips from experienced pros sharing their opinions. Updated: January 2025.
831,265 professionals have used our research since 2012.
What do I think about the stability of the solution?
I rate AWS Glue a seven out of ten for stability.
What do I think about the scalability of the solution?
AWS Glue is a very scalable solution, and you can connect multiple databases.
How are customer service and support?
AWS Glue's technical support is very good.
What's my experience with pricing, setup cost, and licensing?
AWS Glue is not a licensed solution. It follows a pay-as-you-go model, in which your usage is billed monthly.
What other advice do I have?
Currently, there are many ETL tools in the marketplace. Compared to other ETL tools, AWS Glue is a low-cost and serverless solution.
Overall, I rate AWS Glue a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Project Manager at Softway
It has a real-time backup feature and records and backs up information every single moment, but its cost is high, and setting it up is complex
Pros and Cons
- "What I like best about AWS Glue is its real-time data backup feature. Last week, there was a production push, and what used to take almost ten days to send out around fifty-six thousand emails now takes only two hours."
- "Cost-wise, AWS Glue is expensive, so that's an area for improvement. The process for setting up the solution was also complex, which is another area for improvement."
What is our primary use case?
We're using GPU 0.2 in ten verticals and wanted to use AWS Glue only for one purpose: to optimize Amazon Redshift.
We have millions of records that we have to back up. Previously, we did it once every six months, but the client's data has become very interactive, and we need continuous, real-time data exchange. Almost one million records come and go every second. The client wanted to keep all data because they're using it for analytics and wanted to back up the data every second without delay. We tried to optimize Amazon Redshift and found out about AWS Glue, which comes with massive costs, but the client is willing to pay.
What is most valuable?
What I like best about AWS Glue is its real-time data backup feature. Last week, there was a production push, and what used to take almost ten days to send out around fifty-six thousand emails now takes only two hours.
I also like that the data backup in AWS Glue is spontaneous, and data is recorded and backed up every single moment.
What needs improvement?
AWS Glue had some issues that required optimization, particularly in the number of workers you deploy, and that's where cost comes in. Cost-wise, AWS Glue is expensive, so that's an area for improvement. My company made some modifications, which turned out to be successful, so overall, the solution works fine.
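To make the worker-count point concrete, here is a minimal sketch of how a job's cost could scale with the number of workers deployed. It assumes billing is proportional to DPU-hours. The rate constant, the DPUs-per-worker default, and the function name `estimate_job_cost` are illustrative assumptions rather than figures from this review, so check the current AWS Glue pricing page before relying on them.

```python
# Rough cost model: workers x DPUs per worker x hours x rate per DPU-hour.
# ASSUMPTION: the rate below is an illustrative placeholder, not a quoted price.
ASSUMED_RATE_PER_DPU_HOUR = 0.44


def estimate_job_cost(workers, hours, dpus_per_worker=1.0,
                      rate=ASSUMED_RATE_PER_DPU_HOUR):
    """Return an estimated cost in dollars for one job run."""
    if workers < 0 or hours < 0:
        raise ValueError("workers and hours must be non-negative")
    return workers * dpus_per_worker * hours * rate


if __name__ == "__main__":
    # Doubling the worker count doubles the estimate for the same runtime.
    for workers in (10, 20, 40):
        cost = estimate_job_cost(workers, hours=2.0)
        print(f"{workers} workers for 2 hours: about ${cost:.2f}")
```

In practice, adding workers may also shorten the runtime, so the real trade-off is not strictly linear. The sketch only shows why the worker setting is the first thing to tune when cost is the concern.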
Even though there is a backup, you need to know what's happening. You need to understand why there's a failure. AWS Glue doesn't provide the information, so my company uses its logs. The development team also doesn't have specific answers because the team is still playing around with the process, which means the company is still trying to figure out other areas for improvement in AWS Glue.
The process for setting up the solution was also complex, which is another area for improvement.
AWS should provide help during migration and assist its users. Otherwise, it's a nightmare.
For how long have I used the solution?
I've been using AWS Glue for one and a half months.
What do I think about the stability of the solution?
AWS Glue is stable, but stability depends on how many workers you deploy and the work that you do.
What do I think about the scalability of the solution?
AWS Glue is highly scalable. It can scale to almost one billion records per second.
How are customer service and support?
We did make some good friends in AWS, so they gave us technical support for AWS Glue for free. They were also new and were trying to evolve, so they provided us with free support, but they'll be charging other clients for the support moving forward.
How was the initial setup?
The setup for AWS Glue is highly complex. The company started with R&D four months ago and only completed the deployment last week.
My company used one and a half FTE resources for the deployment.
The deployment process for AWS Glue was normal and involved CI/CD, but it was mainly the backend dev ops engineers who did it. I'm more of a project manager, so I'm not involved in technical items. It's more of me helping the engineers with the R&D.
What's my experience with pricing, setup cost, and licensing?
AWS Glue is a high-priced solution that bills the client $150,000 to $250,000 annually. That's just the starting price because it's a small data sample, but if it hits over three hundred million users, the cost will probably go up almost thirty times more.
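As a rough worked example of that projection, the sketch below scales the quoted range by a flat multiplier. Reading "almost thirty times more" as exactly 30x is an assumption; the review gives no exact formula, and the function name `project_range` is hypothetical.

```python
# Annual range quoted in the review, in US dollars.
CURRENT_LOW, CURRENT_HIGH = 150_000, 250_000
# ASSUMPTION: "almost thirty times more" is treated as a flat 30x multiplier.
ASSUMED_MULTIPLIER = 30


def project_range(low, high, multiplier):
    """Scale a (low, high) annual cost range by a constant multiplier."""
    return low * multiplier, high * multiplier


if __name__ == "__main__":
    projected_low, projected_high = project_range(
        CURRENT_LOW, CURRENT_HIGH, ASSUMED_MULTIPLIER
    )
    print(f"${projected_low:,} to ${projected_high:,} per year")
```

Under that assumption, the projected bill lands between $4.5 million and $7.5 million per year.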
What other advice do I have?
I'm using the latest version of AWS Glue.
I'm not the end-user, as I work for a company that implements AWS Glue for clients.
My company has one client using AWS Glue, but that client has three hundred million users.
I recommend AWS Glue to others because it's an excellent solution. However, it lacks documentation. There's only a little documentation available. Even certified AWS practitioners struggle with the lack of documentation for AWS Glue. You'll find complicated processes or features, such as time series tables. Even where there's documentation, implementing the solution requires a lot of trial and error, and revamping becomes a nightmare if you're using the old infrastructure.
My rating for AWS Glue is seven out of ten because of the complexity of the deployment and the lack of information and documentation, which meant my company had to do some R&D. If AWS had complete documentation, or had sent more than one person to assist my company, it could have saved more time.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC
Straightforward, easy to set up, and needs little to no training
Pros and Cons
- "It's fairly straightforward as a product; it's not very complicated."
- "The mapping area and the use of the data catalog from Glue could be better."
What is our primary use case?
We use the solution for the usual kinds of transformations that previously required ETL tools. Our purposes are mostly transformation-related, such as transforming data from source to target. We are also replacing the usual ETL tools with Glue.
How has it helped my organization?
The solution has helped the organization mostly with migration. When we migrate from mostly on-premises to the cloud, Glue replaces a more expensive item that's also available in the cloud. Lots of programmers can understand Glue because they can write scripts in Python and PySpark, and there are quite a few programmers that know how to program in those languages. With the previous components that did similar things on-premise, you needed specialized knowledge.
What is most valuable?
The fact that we can use PySpark to program them is great.
It's fairly straightforward as a product; it's not very complicated.
The fact that Amazon offers it, that it's a quick way of getting things done, and that it doesn't require a lot of training are very positive attributes.
The initial setup is straightforward.
It can scale.
What needs improvement?
The mapping area and the use of the data catalog from Glue could be better. I would say those two are the main things we'd like to see improvements on.
The solution needs support for big data.
As I understand it, Glue is based on Lambdas and Lambdas have some limitations as far as running them continuously. Sometimes they get dropped, and they have to be reinitialized.
For how long have I used the solution?
I've been using the solution for about two years.
What do I think about the stability of the solution?
So far, the solution has been stable. For everything that we've done, we didn't run into any problems, so it was fairly stable for us.
What do I think about the scalability of the solution?
The solution can scale. I don't know exactly at what scale. I was using it for a medium-scale type solution. We never tested it with extremely large scales, so I don't know how expansive it can be. That said, since it's in the cloud, if you need more scale, it would be easy enough to add additional Glue components. At least, in theory, it should be fairly scalable.
Indirectly, we have hundreds of people on the solution.
At this time, I'm not sure if any clients plan to increase usage.
How are customer service and support?
Since the solution is so easy to use, we have not needed the help of technical support.
Which solution did I use previously and why did I switch?
We've previously used many different solutions, including Informatica, DataStage from IBM, and SSIS.
How was the initial setup?
The solution is easy to set up. It's not overly complex.
At a minimum, a company would need one to two people to handle the deployment. If it is a larger scale of deployment, they might need more personnel.
What about the implementation team?
As a consultant, I assist with the implementation.
What was our ROI?
Likely, a company would receive an ROI as it's cheaper than the alternatives.
What's my experience with pricing, setup cost, and licensing?
I was not involved in the cost negotiation process.
Which other solutions did I evaluate?
Our clients typically have chosen Glue from the start. I was not involved in the evaluation process.
What other advice do I have?
We are using one of the latest versions of the solution. It's about two years old.
Depending on the number of data sources, the variety of data sources, and the variety of targets they will have, I might recommend the solution. What they have and plan to do will dictate whether Glue is a good solution or whether they would require something more sophisticated - such as Databricks. For example, if you have big data, then Databricks is probably a better solution to do ETL.
I'd rate the solution seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Associate Consultant at a tech vendor with 10,001+ employees
An extremely user-friendly and stable tool requiring an easy initial setup
Pros and Cons
- "The solution is highly user-friendly, and its features are easy to use. The new addition of AWS Glue Data Catalog is also very beneficial, making the tool even more helpful for its users."
- "The solution could be cheaper. The price of the solution is an area that needs improvement."
What is our primary use case?
Currently, we are utilizing AWS Glue for various ETL workloads, specifically in the life sciences domain. Our primary objective is to acquire data from various sources. Then, we store it in Redshift. This is where the complete use case of AWS Glue comes into the picture.
What is most valuable?
The solution is highly user-friendly, and its features are easy to use. The new addition of AWS Glue Data Catalog is also very beneficial, making the tool even more helpful for its users.
What needs improvement?
The solution could be cheaper. The price of the solution is an area that needs improvement.
For how long have I used the solution?
I have been using AWS Glue in my organization for a year. I am an end-user and a customer of the solution.
What do I think about the stability of the solution?
It is a stable solution. We have not faced any issues in the past year, so it's pretty stable. Stability-wise, I rate it a ten out of ten.
What do I think about the scalability of the solution?
The solution has proven to be scalable, and from my experience in the data engineering domain, I rate it an eight out of ten. It is worth noting that I may not be the most qualified person to provide a rating since I mostly manage and work on data-related tasks. Currently, approximately 20-25 people in our company use the solution.
How are customer service and support?
I had no experience with the technical support team of AWS Glue.
Which solution did I use previously and why did I switch?
Previously, I used Azure Data Factory, but I did not find it very helpful. It was a bit complex and not that user-friendly, and I am much more comfortable with AWS services than with Azure services.
How was the initial setup?
The initial setup of the solution is straightforward, and I find it easy to implement. I rate the setup process a nine on a scale of one to ten, where ten is the easiest. As for the deployment process, we usually request our platform team to handle it, and they are quite efficient in deploying and managing the infrastructure. Although I am not directly involved in the deployment process, my understanding is that it can be completed in just a few hours with the help of two to three team members. Our platform team consists of data engineers, architects, and platform engineers who cater to the needs of various projects and products within the AWS ecosystem. Fortunately, the solution does not require any maintenance.
What's my experience with pricing, setup cost, and licensing?
Price-wise, the solution is adequate, and we have no issues with it. We believe that the cost is justified given the number of users and the features it provides. Overall, it can be considered an average-priced tool. I would rate the solution a six or seven on a scale of one to ten, with ten being very expensive. Specifically, I rate its pricing a six out of ten.
Which other solutions did I evaluate?
Before choosing AWS Glue, I evaluated Azure Data Factory.
What other advice do I have?
I would tell those planning to use AWS Glue to try it. I rate the overall solution a ten out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
AWS DATA ENGINEER at Coforge Growth Agency
Intuitive with a good user interface and ETL integration capabilities
Pros and Cons
- "The two features I find most valuable in AWS Glue are its user interface and ease of use."
- "Beginners need additional support as it currently lacks some features required for complex transformations, often necessitating custom Python coding."
What is our primary use case?
I work as a data engineer, where dealing with the ETL process is essential. We use AWS Glue as our primary ETL tool to serve our organization's needs. I have implemented several Glue jobs that are still in production.
How has it helped my organization?
AWS Glue has enabled us to perform ETL processes efficiently, with ease of use for AWS cloud users, providing a serverless service that eliminates the need for infrastructure maintenance.
What is most valuable?
The two features I find most valuable in AWS Glue are its user interface and ease of use. The user interface is intuitive, and navigating through the Glue console is seamless.
Additionally, its ability to integrate with other AWS services is excellent, providing flawless coordination with services such as SNS, S3, and Lambda.
What needs improvement?
I see scope for improvement in the drag-and-drop feature of AWS Glue. Beginners need additional support as it currently lacks some features required for complex transformations, often necessitating custom Python coding.
For how long have I used the solution?
I have been using Glue for more than five years now.
What do I think about the stability of the solution?
Overall, the stability of AWS Glue is excellent. I would rate it a nine out of ten. Some network-related issues may arise. That said, they are rare and do not affect its functionality significantly.
What do I think about the scalability of the solution?
Regarding scalability, AWS Glue is nearly perfect. I would rate it a nine out of ten, although there is always room for improvement.
How are customer service and support?
AWS customer service is great, but there is room for improvement. The issue I face is inconsistency: I end up dealing with different customer service representatives for the same issue, which disrupts the personal touch.
How would you rate customer service and support?
Neutral
What's my experience with pricing, setup cost, and licensing?
On an organizational level, the pricing of AWS Glue does not pose a concern. It is in line with other ETL tools in the market. However, AWS Glue's cost to free-tier users is an issue because it is not entirely free, even for trial purposes.
What other advice do I have?
I advise potential users to adopt AWS Glue primarily due to its user-friendly interface, extensive documentation, and seamless integration with other AWS services, making it ideal for data engineers.
I'd rate the solution nine out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer:
Last updated: Oct 29, 2024
Data Engineer at Scania
Provides good scalability and has an easy setup process
Pros and Cons
- "The product has a valuable feature for data catalog."
- "The product is expensive for data streaming. This area needs improvement."
What is our primary use case?
We use AWS Glue for ETL batch processing purposes.
What is most valuable?
The product has a valuable feature for data catalog.
What needs improvement?
The product is expensive for data streaming compared to EMR. This area needs improvement.
For how long have I used the solution?
We have been using AWS Glue for one and a half years.
What do I think about the stability of the solution?
I rate the product's stability a ten out of ten.
What do I think about the scalability of the solution?
We have five to six AWS Glue users. I rate its scalability a nine out of ten.
Which solution did I use previously and why did I switch?
We have used Cloudera before. We switched to AWS Glue for better pricing, scalability, and innovation.
How was the initial setup?
The initial setup is easy. I rate the process an eight or nine out of ten. It could be deployed on-premises and on the cloud as well. We have a team of five executives to carry out the implementation.
What's my experience with pricing, setup cost, and licensing?
It is an expensive product. I rate its pricing a nine out of ten.
What other advice do I have?
I rate AWS Glue a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Owner at a tech services company with 51-200 employees
Capable of handling real-time but ETL interface could be more user-friendly
Pros and Cons
- "I also like that you can add custom libraries like JAR files and use them. So, the ability to use a fast processing engine and embed basic jobs easily are significant advantages."
- "One area that could be improved is the ETL view. The drag-and-drop interface is not as user-friendly as some other ETL tools."
What is our primary use case?
One common use case is migrating data from one system to another. So, mostly migrating data and data engineering, getting real-time or near-real-time data using Lambda functions and migrating big data from on-prem to the cloud for historical data before starting a project.
What is most valuable?
If you have the Fund Manager, you could use a fast processing engine, which is crucial for performance.
I also like that you can add custom libraries like JAR files and use them. So, the ability to use a fast processing engine and embed basic jobs easily are significant advantages.
What needs improvement?
One area that could be improved is the ETL view. The drag-and-drop interface is not as user-friendly as some other ETL tools.
Additionally, AWS Glue can sometimes be slow, especially when processing large datasets. Also, I couldn't directly use bucketed data. With Elastic Glue, you had to convert your data frames into the correct format before connecting them using the drag-and-drop interface. That's something I didn't like because the conversion process wasn't straightforward.
In future releases, I would like to see a feature that could trigger a Glue pipeline through an API or something similar.
For how long have I used the solution?
I have experience with AWS Glue. I have about one year of experience in a professional setting, but I have also done some personal work with this solution.
How are customer service and support?
Support was good, but I was working with a big client, so that might have influenced the experience. The response time was fast; we heard back from them within a day.
How would you rate customer service and support?
Positive
How was the initial setup?
I would rate my experience with the initial setup an eight out of ten, where one is difficult and ten is easy.
The initial setup is not very complex. You can customize parameters like minimum and maximum for your needs. For me, it wasn't complex to deploy the solution.
What other advice do I have?
I'd rate it around six out of ten compared to other tools like Databricks.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
CEO at Quark Technologies SRL
Highly scalable, reliable, and beneficial pay-as-you-go pricing model
Pros and Cons
- "AWS Glue is a good solution for developers, who can write code in different languages and other software."
- "The interface for AWS Glue could improve; it does not show a lot of detail. You can write code in PySpark or Scala, which is a big advantage, but it is only easy to use for a developer. It will be difficult for new users to enter the cloud environment."
What is our primary use case?
My colleagues work with Spark, PySpark, and Scala as programming languages for writing complex aggregations. They have a repository in order to have a general view of all the sources and jobs on the platform and AWS Glue is very helpful.
What is most valuable?
AWS Glue is a good solution for developers, who can write code in different languages and other software.
What needs improvement?
The interface for AWS Glue could improve; it does not show a lot of detail. You can write code in PySpark or Scala, which is a big advantage, but it is only easy to use for a developer. It will be difficult for new users to enter the cloud environment.
Business users who want to run their own graphs will not be able to use features such as running Spark code inside AWS Glue, because that will be too complex for them.
For how long have I used the solution?
I have been using AWS Glue for approximately four years.
What do I think about the stability of the solution?
AWS Glue is a highly stable solution. We didn't have bugs in production.
The solution works well with Spark, which is a good framework for large volumes of data. It operates very well.
I rate the stability of AWS Glue a ten out of ten.
What do I think about the scalability of the solution?
The scalability of AWS Glue is great. It was used for enterprise customers. We worked a lot with AWS Glue for international companies.
We have approximately 10 people using AWS Glue in my company.
How are customer service and support?
I have had to use the support for AWS Glue. The response time could improve.
I rate the support from AWS Glue a nine out of ten.
How would you rate customer service and support?
Positive
How was the initial setup?
The initial setup of AWS Glue is very simple.
What's my experience with pricing, setup cost, and licensing?
AWS Glue uses a pay-as-you-go approach which is helpful. The price of the overall solution is low and is a great advantage.
Which other solutions did I evaluate?
If I can compare AWS Glue to other solutions, it has the advantage of the cloud, which assures availability and scalability, and the pay-as-you-go is beneficial. This is why many companies are moving from their traditional ETL tools to the cloud because the costs will be reduced dramatically.
What other advice do I have?
I would recommend this solution to others.
I rate AWS Glue a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner