What is our primary use case?
I use it for log monitoring and log capture. That's the main purpose.
How has it helped my organization?
The alerting and notification functions help with operational response. For example, if you have a critical server, you can set an alarm that raises a notification when CPU utilization exceeds 90%. You can also link an action to that notification rather than only notifying someone.
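As a rough illustration of the 90% CPU example, here is a minimal sketch that builds the arguments for the CloudWatch `put_metric_alarm` call in boto3. The alarm name, the five-minute period, the two evaluation periods, and the helper names `build_cpu_alarm` and `create_alarm` are my own assumptions for illustration, not settings from any particular environment.

```python
from typing import Any, Dict, List


def build_cpu_alarm(instance_id: str, actions: List[str],
                    threshold: float = 90.0) -> Dict[str, Any]:
    """Return keyword arguments for CloudWatch put_metric_alarm.

    The name, period, and evaluation periods are illustrative assumptions.
    """
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,               # seconds per datapoint (assumed)
        "EvaluationPeriods": 2,      # consecutive breaches before alarming (assumed)
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": actions,     # e.g. an SNS topic ARN
    }


def create_alarm(params: Dict[str, Any]) -> None:
    """Send the alarm definition to CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

The `AlarmActions` list is where the action gets linked: passing an SNS topic ARN there is what turns the alarm into a notification that other things can react to.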
You can automate actions or use AWS functionalities like auto-scaling, where you can configure the metrics to add more nodes if the threshold is exceeded. AWS has become very rich in functionality over time, and the same piece of functionality can solve different problems.
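One way to express the auto-scaling idea is a target tracking policy, which is a slightly different mechanism from a fixed threshold: the Auto Scaling group adds or removes nodes to hold a metric near a target. The sketch below builds the arguments for the boto3 `put_scaling_policy` call; the policy name, the 60% target, and the helper names are assumptions for illustration.

```python
from typing import Any, Dict


def build_target_tracking_policy(group_name: str,
                                 target_cpu: float = 60.0) -> Dict[str, Any]:
    """Return keyword arguments for the Auto Scaling put_scaling_policy call.

    The policy name and default target value are illustrative assumptions.
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Average CPU across the instances in the group
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
        },
    }


def apply_policy(params: Dict[str, Any]) -> None:
    """Attach the policy to the group (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    boto3.client("autoscaling").put_scaling_policy(**params)
```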
CloudWatch automation works through notifications linked to Lambda functions, which can then perform any automation you like.
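To make the notification-to-Lambda link concrete, here is a minimal sketch of a Lambda handler subscribed to the SNS topic that an alarm publishes to. It reads the alarm message from the SNS record and acts only when the alarm enters the `ALARM` state. The `remediate` function is a hypothetical placeholder for whatever automation you actually want to run.

```python
import json
from typing import Any, Dict, List


def remediate(alarm_name: str) -> str:
    """Hypothetical placeholder for the real automation step."""
    return f"remediation requested for {alarm_name}"


def lambda_handler(event: Dict[str, Any], context: Any) -> List[str]:
    """Handle SNS records that carry CloudWatch alarm notifications."""
    results = []
    for record in event.get("Records", []):
        # SNS delivers the alarm details as a JSON string in the Message field.
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            results.append(remediate(message["AlarmName"]))
    return results
```

Keeping the remediation step in its own function makes it easy to swap in a restart, a scale-out, or a ticket without touching the parsing.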
What is most valuable?
CloudWatch covers a lot of ground. It gives you metrics and logs. You can set up alerts and notifications on the metrics and build a lot of automation on top of those notifications.
You can also create dashboards, although AWS charges an additional fee for them. Alternatively, we can build a separate dashboard with a tool like Grafana.
The use cases are different for different people, but three things everyone uses are:
- Metrics monitoring
- Log monitoring, and
- Alerting and notification.
What needs improvement?
CloudWatch is extremely helpful for automation and other things for experienced users, but it's not very intuitive. It doesn't have drag-and-drop functionality to trigger actions from notifications. It's very capable, but it requires expert knowledge to leverage that capability.
So, for new or basic users, it's a bit complex to understand how it works. Basic users can see what is happening in the console, but building automation and other things requires expert knowledge.
Improvement in log patterns and metrics:
There is a lot to improve here. For example, there are log patterns and metric patterns, but anomaly detection was missing for a long time; I think the latest version added it. Users can build it themselves: they can stream the logs, build log analytics, and do anomaly detection, but it is not very straightforward. It's a very powerful tool, but you need expert knowledge to do those things.
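To give a sense of what "building it yourself" can involve, here is a minimal sketch of anomaly detection on a streamed metric using a rolling z-score. This is a simple stand-in technique, not the algorithm CloudWatch uses; the window size, the threshold, and the function name are assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import List


def find_anomalies(values: List[float], window: int = 5,
                   z_threshold: float = 3.0) -> List[int]:
    """Return indexes whose value deviates strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies


# Example: errors per minute with one obvious spike at index 6.
errors_per_minute = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
print(find_anomalies(errors_per_minute))
```

Even a small version like this needs decisions about window size and thresholds, which is the kind of expertise I mean.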
Being a region-specific service imposes certain limits:
Also, CloudWatch is a very region-specific service. People struggle with that when, for example, an application is distributed across different geographies like Europe, India, and the USA. There is no straightforward way to aggregate CloudWatch logs in one place for workloads in every region. You need a third-party tool to get a holistic view of your application. CloudWatch has the data, but you need advanced skills to aggregate it and create insights.
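As an illustration of the do-it-yourself route, here is a minimal sketch that pulls log events from the same log group in several regions with boto3 and merges them into one timeline. The function names and the idea of tagging each event with a `region` key are my own assumptions; a real setup would also need to handle volume, cost, and permissions.

```python
import heapq
from typing import Any, Dict, List


def merge_by_timestamp(
        per_region: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Merge per-region event lists into one list ordered by timestamp."""
    tagged = [
        # Sort each region's events and tag them with the region they came from.
        [dict(event, region=region)
         for event in sorted(events, key=lambda e: e["timestamp"])]
        for region, events in per_region.items()
    ]
    return list(heapq.merge(*tagged, key=lambda e: e["timestamp"]))


def fetch_region(region: str, log_group: str,
                 start_ms: int, end_ms: int) -> List[Dict[str, Any]]:
    """Fetch events from one region (needs boto3 and credentials)."""
    import boto3  # imported lazily so the merge logic works without AWS access

    client = boto3.client("logs", region_name=region)
    paginator = client.get_paginator("filter_log_events")
    events: List[Dict[str, Any]] = []
    for page in paginator.paginate(logGroupName=log_group,
                                   startTime=start_ms, endTime=end_ms):
        events.extend(page["events"])
    return events
```

A typical use would call `fetch_region` once per region and pass the results to `merge_by_timestamp` as a dictionary keyed by region name.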
Suggestions for additional functionality for CloudWatch:
CloudWatch needs better built-in analytics capability. CloudWatch has a lot of data, but it's difficult to gain insights unless you stream it to a third-party log analytics tool. If there were built-in capabilities to analyze the data, that would be great.
For how long have I used the solution?
I have been using Amazon CloudWatch for ten years. We work with a lot of AWS services, like CloudWatch, SageMaker, and even their generative AI solution, Bedrock.
What do I think about the stability of the solution?
I would rate the stability a ten out of ten. There are no issues with stability.
What do I think about the scalability of the solution?
I would rate the scalability a ten out of ten.
There are around 50 end users, and all the ops people use CloudWatch.
How are customer service and support?
AWS provides multiple support levels, including basic, business, and enterprise support. Your experience can differ depending on your support level.
My company uses business support. Quality-wise, it's okay most of the time, but sometimes, you get people with less experience. Response time could be improved because, for production applications, you want immediate support. Sometimes, it can take up to an hour to get someone on chat support.
Which solution did I use previously and why did I switch?
Azure also has something like CloudWatch. All cloud vendors have something similar.
How was the initial setup?
The initial setup is very simple.
CloudWatch also has an API, so I can send logs to CloudWatch by calling the API. But natively, it only integrates with AWS services.
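For sending logs through the API, a minimal sketch with boto3 looks like the following. It builds the arguments for `put_log_events` and keeps the events in chronological order. The helper names are assumptions, and the log group and stream are assumed to exist already.

```python
from typing import Any, Dict, List, Tuple


def build_put_log_events(log_group: str, log_stream: str,
                         entries: List[Tuple[int, str]]) -> Dict[str, Any]:
    """Return keyword arguments for CloudWatch Logs put_log_events.

    Each entry is (timestamp in milliseconds since the epoch, message).
    """
    events = [
        {"timestamp": ts, "message": msg}
        for ts, msg in sorted(entries, key=lambda entry: entry[0])
    ]
    return {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "logEvents": events,
    }


def send_logs(params: Dict[str, Any]) -> None:
    """Send the batch to CloudWatch Logs (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    boto3.client("logs").put_log_events(**params)
```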
Integration capabilities:
AWS has built different frameworks, like the Simple Notification Service (SNS). From SNS, I can fire a Lambda function or integrate with the event bus, Amazon EventBridge. Compared with its peers, AWS probably offers one of the best integration capabilities.
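As one example of the EventBridge side, here is a minimal sketch that builds the arguments for the boto3 `put_rule` call with an event pattern matching CloudWatch alarms that move into the `ALARM` state. The rule name and helper names are assumptions for illustration; a target such as a Lambda function would still need to be attached to the rule separately.

```python
import json
from typing import Any, Dict


def build_alarm_rule(rule_name: str) -> Dict[str, Any]:
    """Return keyword arguments for the EventBridge put_rule call."""
    pattern = {
        "source": ["aws.cloudwatch"],
        "detail-type": ["CloudWatch Alarm State Change"],
        # Only match transitions into the ALARM state.
        "detail": {"state": {"value": ["ALARM"]}},
    }
    return {
        "Name": rule_name,
        "EventPattern": json.dumps(pattern),  # put_rule expects a JSON string
        "State": "ENABLED",
    }


def create_rule(params: Dict[str, Any]) -> None:
    """Create or update the rule (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    boto3.client("events").put_rule(**params)
```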
What's my experience with pricing, setup cost, and licensing?
I would rate the pricing in the middle: a five out of ten, where one is a high price and ten is a low price.
What other advice do I have?
Everyone needs it. There is no other way to capture central logs, so it's for everyone, irrespective of the size of the company.
I would recommend it. Overall, I would rate it an eight out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner