We primarily use the solution for charting application metrics.
We use it for all our application metrics, host metrics, and monitors with a PagerDuty integration.
We integrate our application logs. It is great to be able to tie our metrics and our traces together.
We use the APM module with traces. It is great to be able to link APM, logs, and metrics in one go, as it dramatically shortens our troubleshooting and root cause analysis (RCA).
We are loving the tool; it is great to have all those insights in one place.
I hope that they keep making my life and our engineers' lives easier.
The solution improved our organization with:
- Data-driven decision making
- Dashboards we can share with our customer success team
- Dashboards we can share with our sales engineers
- Help during incidents
- Help with preventing incidents
- Integration with PagerDuty
The most valuable aspects of the solution include:
- Charting application metrics
- Help with the business, prioritization, software design, and infrastructure design
The pricing model hurts and sometimes forces us to work around the tool.
On top of application performance metrics, it would be great to have host performance insights that suggest changes to make better use of a cluster, such as: "You are over-provisioning this host" or "Based on historical data, you will need to scale up in X days."
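To illustrate the second kind of suggestion, a forecast like this could be as simple as fitting a straight line to historical usage and extrapolating. The sketch below is only our assumption of how such a hint might be computed, not something Datadog provides; the function name `days_until_threshold`, the sample CPU figures, and the 80% threshold are all hypothetical.

```python
from typing import Optional, Sequence


def days_until_threshold(daily_usage: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate how many days remain until usage reaches `threshold`.

    Fits an ordinary least-squares line to one sample per day and
    extrapolates it. Returns None when usage is flat or falling, and
    0.0 when the fitted value for the latest day already meets the threshold.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two daily samples")

    # Days are indexed 0..n-1, so their mean is (n - 1) / 2.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n

    # Least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no upward trend, so no scale-up date to report

    intercept = mean_y - slope * mean_x
    fitted_today = intercept + slope * (n - 1)
    return max(0.0, (threshold - fitted_today) / slope)


# Hypothetical average CPU utilization (%) for one host over five days.
cpu_percent = [50.0, 52.0, 54.0, 56.0, 58.0]
print(days_until_threshold(cpu_percent, threshold=80.0))  # 11.0
```

A straight-line fit ignores seasonality and traffic spikes, so a real recommendation would likely need a more robust model. It is still enough to show the kind of "scale up in X days" hint we have in mind.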
A module for extracting data from Datadog, so that we can use it in our own systems, would also be helpful.
I've used the solution for six or more years.
We previously used New Relic, which was a great tool. That said, Datadog is a more complete solution.
The pricing should be less of a surprise. They should allow us to cap costs, which would lead to less frustration.
We need better documentation on the pricing.
It might be helpful if they added a pricing simulator.