What is our primary use case?
We're using GPU 0.2 in ten verticals and wanted to use AWS Glue only for one purpose: to optimize Amazon Redshift.
We have millions of records that we have to back up. Previously, we did it once every six months, but the client's data has become highly interactive, and we now need continuous, real-time data exchange. Almost one million records come and go every second. The client wanted to keep all data because they're using it for analytics and wanted to back up the data every second without delay. We tried to optimize Amazon Redshift and found out about AWS Glue, which comes with massive costs, but the client is willing to pay.
What is most valuable?
What I like best about AWS Glue is its real-time data backup feature. Last week, there was a production push, and sending out around fifty-six thousand emails, which used to take almost ten days, now takes only two hours.
I also like that the data backup in AWS Glue is continuous: data is recorded and backed up every single moment.
What needs improvement?
AWS Glue had some issues that required optimization, particularly in the number of workers you deploy, and that's where cost comes in. AWS Glue is expensive, so that's an area for improvement. My company made some modifications, which turned out to be successful, so overall, the solution works fine.
Even though there is a backup, you need to know what's happening. You need to understand why there's a failure. AWS Glue doesn't provide the information, so my company uses its logs. The development team also doesn't have specific answers because the team is still playing around with the process, which means the company is still trying to figure out other areas for improvement in AWS Glue.
The process for setting up the solution was also complex, which is another area for improvement.
AWS should provide help during migration and assist its users. Otherwise, it's a nightmare.
For how long have I used the solution?
I've been using AWS Glue for one and a half months.
What do I think about the stability of the solution?
AWS Glue is stable, but stability depends on how many workers you deploy and the work that you do.
What do I think about the scalability of the solution?
AWS Glue is highly scalable. It can scale to almost one billion records per second.
How are customer service and support?
We did make some good friends in AWS, so they gave us technical support for AWS Glue for free. They were also new and were trying to evolve, so they provided us with free support, but they'll be charging other clients for the support moving forward.
How was the initial setup?
The setup for AWS Glue is highly complex. The company started with R&D four months ago and only completed the deployment last week.
My company used one and a half FTE resources for the deployment.
The deployment process for AWS Glue was normal and involved CI/CD, but it was mainly the backend DevOps engineers who did it. I'm more of a project manager, so I'm not involved in technical items. My role is more about helping the engineers with the R&D.
What's my experience with pricing, setup cost, and licensing?
AWS Glue is a high-priced solution that costs the client $150,000 to $250,000 annually. That's just the starting price because it's a small data sample, but if it hits over three hundred million users, the cost will probably go up almost thirtyfold.
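To put that projection in numbers, here is a rough back-of-the-envelope sketch. It only multiplies the annual range quoted above by the "almost thirty times" figure; the function name is hypothetical, the flat multiplier is an assumption, and this is not AWS's pricing model.

```python
def project_annual_cost(low, high, multiplier):
    """Scale an annual cost range (in USD) by a flat multiplier.

    Assumption: cost grows by one flat factor; real billing may not.
    """
    return low * multiplier, high * multiplier


# Figures taken from the review: $150,000 to $250,000 per year, about 30x growth.
low, high = project_annual_cost(150_000, 250_000, 30)
print(f"${low:,} to ${high:,} per year")  # $4,500,000 to $7,500,000 per year
```

Under those assumptions, the bill would land somewhere around $4.5 million to $7.5 million a year at full scale.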
What other advice do I have?
I'm using the latest version of AWS Glue.
I'm not the end-user, as I work for a company that implements AWS Glue for clients.
My company has one client using AWS Glue, but that client has three hundred million users.
I recommend AWS Glue to others because it's an excellent solution. However, it lacks documentation. There's only a little documentation available. Even certified AWS practitioners struggle with the lack of documentation for AWS Glue. You'll find complicated processes or features, such as time series tables. Even where there's documentation, implementing the solution requires a lot of trial and error, and revamping becomes a nightmare if you're using the old infrastructure.
My rating for AWS Glue is seven out of ten because of the complexity of the deployment and the lack of information and documentation, which meant my company had to do some R&D. If AWS had complete documentation, or had sent more than one person to assist my company, it could have saved more time.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer