What is our primary use case?
We use Datadog as our main monitoring platform across all environments, including production, staging, and development.
It plays a crucial role in monitoring infrastructure performance, aggregating logs, and running synthetic and browser tests. Datadog helps us track critical system metrics like CPU, memory, and network traffic, allowing us to detect issues in real time.
Its log management and alerting features enable quick incident response, while synthetic monitoring ensures optimal user experience and service uptime by proactively identifying performance issues. We rely on browser tests to simulate real-world user interactions, ensuring that key workflows in our applications perform smoothly.
Overall, Datadog allows us to maintain a high level of reliability and performance across our systems.
How has it helped my organization?
Datadog has significantly improved our ability to consolidate information into one central platform. Before implementing Datadog, our data and monitoring metrics were scattered across various systems and tools, making it difficult to get a unified view of our infrastructure and application health. This fragmented approach often led to inefficiencies, as we had to switch between different systems to gather relevant information, delaying our response to incidents.
With Datadog, all our critical data (logs, metrics, and monitoring) is now integrated in one place. We can easily correlate events, analyze performance, and quickly diagnose issues, which has greatly improved both operational efficiency and incident management.
What is most valuable?
The most valuable features we’ve found in Datadog are logging, API monitoring, infrastructure monitoring, and browser tests.
Logging allows us to collect and centralize logs from across all our services, making it easier to troubleshoot issues and gain insights into application performance.
API monitoring is crucial for ensuring the reliability and performance of our API endpoints, allowing us to detect issues proactively.
Infrastructure monitoring gives us real-time visibility into our servers, containers, and cloud resources, helping us optimize performance and reduce downtime.
Lastly, browser tests simulate real user interactions, ensuring that our web applications deliver a seamless experience by detecting any potential performance or functionality issues before they impact users.
Together, these features provide a comprehensive monitoring solution, making Datadog an essential tool for maintaining system reliability and performance.
What needs improvement?
One area where Datadog could be improved is its pricing structure, which can make it cost-prohibitive to adopt new features. As we continue to scale, the costs of enabling more advanced monitoring capabilities, like additional integrations or longer data retention, add up quickly. This makes it hard for teams to justify the expense, especially when trying out new features that could enhance monitoring and performance analysis.
Another improvement would be better cost transparency within the product’s GUI. Currently, it can be difficult to track how specific features or services contribute to overall costs. If Datadog provided more detailed, real-time pricing insights directly within the interface, such as a breakdown of how much each feature or integration costs, it would help users manage budgets more effectively and avoid unexpected charges. A built-in budgeting tool or cost alerting system would also be useful, allowing organizations to make more informed decisions about which features to activate without the fear of overextending their budget.
Adding these features would give customers a clearer understanding of how to optimize their usage without overspending, making the platform even more accessible for teams that are cost-conscious but still want to take advantage of the full range of Datadog’s powerful capabilities.
For how long have I used the solution?
I've used the solution for five years.
What do I think about the stability of the solution?
The solution is very stable.
What do I think about the scalability of the solution?
It's very scalable. It can handle pretty much anything you throw at it.
How are customer service and support?
Overall, support is very good, and the support team is responsive.
Which solution did I use previously and why did I switch?
We previously used a range of open-source and in-house tools. We switched to get everything in one place.
How was the initial setup?
It was easy to understand and to implement. Datadog offers great documentation.
What about the implementation team?
We implemented the solution in-house.
What's my experience with pricing, setup cost, and licensing?
Watch costs closely when using the platform. Set up alerting for unusual log volumes, and apply rate limiting where possible.
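To make the advice about log volumes and rate limiting concrete, here is a minimal sketch in plain Python. It is not our actual setup and it does not use any Datadog API. The names `is_unusual_volume` and `TokenBucket`, the three-standard-deviation threshold, and the sample numbers are all assumptions chosen for illustration. The first function flags a log count that sits far above a recent baseline, and the class caps how many log lines a service may emit per second.

```python
from statistics import mean, stdev


def is_unusual_volume(history, current, threshold=3.0):
    """Return True when `current` is more than `threshold` standard
    deviations above the mean of `history` (e.g. hourly log counts).

    Hypothetical helper: the threshold of 3.0 is an assumed default.
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: anything above the baseline is unusual.
        return current > baseline
    return (current - baseline) / spread > threshold


class TokenBucket:
    """Hypothetical limiter: allow `rate` log lines per second on average,
    with bursts of up to `capacity` lines."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.last = 0.0                # timestamp of the previous call

    def allow(self, now):
        """Return True if a log line may be emitted at time `now` (seconds)."""
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice, a check like `is_unusual_volume` would feed whatever alerting you already have, and a limiter like `TokenBucket` would sit in front of the logger in the noisiest services, with `now` taken from a monotonic clock. Dropping log lines is a trade-off, so it is worth starting with generous limits.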
Which other solutions did I evaluate?
What other advice do I have?
It's a great product, although a bit expensive.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.