We used Datadog to capture the telemetry of our AWS fleet of around 1,200 servers.
Senior Director with 10,001+ employees
A good solution for infrastructure, but not for application-level monitoring
Pros and Cons
- "Datadog's ability to group and visualize the servers and the data makes it relatively easy for the root cause analysis."
- "Datadog lacks a deeper application-level insight. Their competitors had eclipsed them in offering ET functionality that was important to us. That's why we stopped using it and switched to New Relic. Datadog's price is also high."
What is our primary use case?
What is most valuable?
Datadog's ability to group and visualize the servers and the data makes it relatively easy for the root cause analysis.
What needs improvement?
Datadog lacks a deeper application-level insight. Their competitors had eclipsed them in offering ET functionality that was important to us. That's why we stopped using it and switched to New Relic.
Datadog's price is also high.
For how long have I used the solution?
I have been using Datadog for about three years.
Buyer's Guide
Datadog
December 2024
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
825,399 professionals have used our research since 2012.
What do I think about the stability of the solution?
Stability really wasn't ever an issue. We didn't have any outages specific to Datadog where we couldn't get reports or insights to information. We were more concerned about the stability of our own systems and applications.
What do I think about the scalability of the solution?
There was no issue with scaling as such. It didn't scale well only from the cost perspective.
How are customer service and support?
Fortunately, because of the stability of the solution, we never had reasons to deal with technical support. Most of our interaction was with their product management, which was focused on the feature capability and ultimately pricing.
What's my experience with pricing, setup cost, and licensing?
It didn't scale well from the cost perspective. We had a custom package deal.
Which other solutions did I evaluate?
We switched from Datadog to New Relic because it offered ET functionality. Datadog was traditionally born out of monitoring infrastructure. Over the years, they have improved their ability to give you insights at the application layer and to be considered under APM. New Relic really started at the application layer and has worked its way down.
Ultimately, we were able to accept New Relic because coming from an operations team, infrastructure was more important. As our application became more complex, our application developers needed better insight. Because there is a significant overlap in the Venn diagram between Datadog and New Relic, we felt that the needs of the infrastructure team and the applications team could be met with New Relic and its expansion in providing a sort of lightweight security.
What other advice do I have?
Datadog started off at the infrastructure level, and New Relic started off at the application level. Both of them were expanding not only into each other's space but also into the SIEM space.
There are a lot of options out there. For folks like me, it becomes a costly proposition because, at the end of the day, we're talking about logs and events that get pushed out. I have to push out some to Datadog and some to the security event manager. Then you start to think why can't you just push them to one place and let a product do that. That's where these products are trying to grow. They're not quite there yet because the SIEM space is pretty mature. An enterprise like ours needs something fully focused and dedicated. Startups can live with New Relic that has a security capability or Datadog.
I would advise you to really understand the value that you're trying to go after. Make sure that you're not trying to solve all problems that you have from the observability perspective with Datadog because that will erode the value you get out of this solution.
Make sure that you are going to use Datadog for infrastructure, and it is going to be great. If you start adding other kinds of stuff to it, you'll probably start losing some of that value. Especially, if you want to go for application-level monitoring, you may be a bit disappointed.
I would rate this solution a six out of ten. I'm a very price-conscious kind of purchaser.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Principal. Performance Engineering at Invitation Homes
A go-to tool for analyzing, understanding, and investigating application performance
Pros and Cons
- "Log analytics give us a powerful mechanism for error tracking, research, and analysis."
- "Network device and performance monitoring could be improved, as we've faced some limitations in this area."
What is our primary use case?
The solution is used for full stack enterprise performance monitoring for our primarily cloud-based stack on AWS. We have implemented monitoring coverage using RUM for critical apps and websites and utilize APM (integrated with RUM) for full stack traceability.
We use Datadog as our primary log repository for all apps and platforms, and the advanced log analytics enable accurate log-based monitoring/alerting and investigations.
Additionally, we use some advanced RUM capabilities and metrics to track and optimize client-side user experience. We track SLOs for our critical apps and platforms using Datadog.
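As a rough illustration of what tracking an SLO involves, the sketch below computes availability and the remaining error budget for one window. It is not taken from the review or from Datadog: the function name `error_budget`, the 99% target, and the request counts are all assumptions chosen for the example.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize an availability SLO over one window (all inputs are illustrative)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # Failures the target tolerates in this window, e.g. 1% of requests at 99%.
    allowed_failures = (1.0 - slo_target) * total_requests
    availability = 1.0 - failed_requests / total_requests
    return {
        "availability": availability,
        "allowed_failures": allowed_failures,
        "budget_remaining": allowed_failures - failed_requests,
        "slo_met": availability >= slo_target,
    }


if __name__ == "__main__":
    # Hypothetical window: 1,000 requests, 5 failures, 99% target.
    print(error_budget(0.99, 1000, 5))
```

A negative `budget_remaining` means the window has already consumed more failures than the target allows.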
How has it helped my organization?
We now have full-stack observability, which allows us to better understand application behavior, quickly alert users about issues, and proactively manage application performance.
We've seen value by implementing observability coordinated across multiple applications, allowing us to track things like customer shopping and orders across multiple applications and services.
For critical application launches, we've built dashboards that can track user activity and confirm users are able to successfully utilize new features, tracking user activities in real-time in a war-room situation.
Datadog is our go-to tool for analyzing, understanding, and investigating application performance and behavior.
What is most valuable?
APM accurately tracks our service performance across our ecosystem. RUM gives us client-side performance and user experience visibility, and the rate of new features implemented in the Digital Experience area recently has been high. Log analytics give us a powerful mechanism for error tracking, research, and analysis.
Custom metrics that we've created allow us to track KPIs in real-time on dashboards. All of these have proven valuable in our organization. Additionally, Datadog product support teams are responsive and have provided timely support when needed.
What needs improvement?
Agent remote configuration should be provided/improved and streamlined, allowing for config changes/upgrades to be performed via the portal instead of at the host.
Cost tracking via the admin portal is a bit lacking, even though it has gotten better. I'm looking for usage trends (that drive cost) across time and better visibility or notifications about on-demand charges.
Network device and performance monitoring could be improved, as we've faced some limitations in this area.
The Datadog usage-based cost model, while giving us better transparency, is difficult to follow at times and is constantly evolving.
For how long have I used the solution?
I've used the solution for three years.
How are customer service and support?
Support has been responsive and helpful.
How would you rate customer service and support?
Positive
What's my experience with pricing, setup cost, and licensing?
Pricing is straightforward. That said, it's sometimes difficult to estimate usage volumes.
Which other solutions did I evaluate?
We evaluated Datadog and New Relic in detail and chose Datadog for its straightforward and competitive pricing model, its full coverage of the monitoring features we wanted, and its easy-to-use UI.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Oct 2, 2024
Technical Support Engineer at Cybage Software
Helps to set up alerts and thresholds to monitor real-time metrics
Pros and Cons
- "Integrating Datadog with other platforms has made our monitoring processes a bit easier. It's not super simple, but it's manageable."
- "For three to four months, we have been experiencing real-time delays. For example, if we're monitoring incoming traffic, the real-time status should be displayed up to a certain point. However, due to delays or issues with Datadog, the real-time data might only be updated at an earlier time. We are experiencing consistent delays in data updates from Datadog, with the most recent data often being delayed by about an hour. This issue has been ongoing for the past four months."
What is our primary use case?
Datadog is mainly used to set up alerts and thresholds to monitor real-time metrics and checks.
What is most valuable?
Integrating Datadog with other platforms has made our monitoring processes a bit easier. It's not super simple, but it's manageable.
What needs improvement?
For three to four months, we have been experiencing real-time delays. For example, if we're monitoring incoming traffic, the real-time status should be displayed up to a certain point. However, due to delays or issues with Datadog, the real-time data might only be updated at an earlier time. We are experiencing consistent delays in data updates from Datadog, with the most recent data often being delayed by about an hour. This issue has been ongoing for the past four months.
For how long have I used the solution?
I have been using the product for a year.
What do I think about the scalability of the solution?
My company has 50 users for Datadog.
How was the initial setup?
The tool's deployment is difficult and time-consuming.
What's my experience with pricing, setup cost, and licensing?
The tool is open-source.
What other advice do I have?
If you're thinking about using Datadog for the first time, I suggest getting some basic training in data operations. It'll help you navigate Datadog more easily.
Learning it for the first time is not overly difficult, but it's also not very easy.
I would rate the tool a seven out of ten. While it's a useful tool, we've experienced some issues that haven't been resolved yet. Additionally, setting up dashboards and utilizing all the features requires some training.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Apr 19, 2024
System Engineer at Raymond James
A stable and scalable infrastructure monitoring solution
Pros and Cons
- "Datadog has flexibility."
- "The product needs to have a more enterprise approach to configuration."
What is most valuable?
Datadog has flexibility.
What needs improvement?
The product needs to have a more enterprise approach to configuration.
For how long have I used the solution?
We use the tool to monitor our whole infrastructure. CPU, memory, and disk space are the types of things we use it for.
What do I think about the stability of the solution?
It is a stable solution.
What do I think about the scalability of the solution?
It is a scalable solution.
How are customer service and support?
The technical support team is good and responsive.
How would you rate customer service and support?
Positive
How was the initial setup?
The initial setup is not very easy and the deployment took eight months. It took quite a few teams to get it all accomplished. I rate it a six out of ten.
What other advice do I have?
I rate the solution eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Architect at a comms service provider with 10,001+ employees
Good for monitoring and following metrics with a helpful flame graph
Pros and Cons
- "Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests across our microservice environment with 170 services."
- "I often have issues with the UI in my browser."
What is our primary use case?
We use the solution primarily for distributed tracing, service insight and observability, metrics, and monitoring. We create custom metrics from outbound service calls to trace the availability of back-office systems.
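A custom metric of this kind can be sketched as a counter emitted per outbound call and tagged by target system and outcome. The example below is a minimal sketch, not the reviewer's implementation: the metric name `backoffice.call.count`, the tag names, and the helper functions are hypothetical. It builds the datagram by hand in the DogStatsD text format (`name:value|type|#tags`) and assumes an agent listening on the default UDP port 8125.

```python
import socket


def format_dogstatsd(name, value, metric_type="c", tags=None):
    """Build a DogStatsD datagram: <name>:<value>|<type>|#<tag>,<tag>."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def record_outbound_call(system, ok, send=None):
    """Count one outbound call, tagged by target system and outcome."""
    status = "success" if ok else "failure"
    datagram = format_dogstatsd(
        "backoffice.call.count",  # hypothetical metric name
        1,
        "c",  # counter
        [f"system:{system}", f"status:{status}"],  # hypothetical tag names
    )
    if send is not None:
        send(datagram)
    return datagram


def udp_sender(host="127.0.0.1", port=8125):
    """Return a function that sends datagrams to a local agent over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(datagram):
        sock.sendto(datagram.encode("utf-8"), (host, port))

    return send


if __name__ == "__main__":
    # Without a sender the datagram is only built, which is handy for inspection.
    print(record_outbound_call("billing", ok=True))
```

With counts tagged this way, availability per back-office system can be derived as the share of `status:success` calls for each `system` tag.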
We use the flame graph to get insights into our GraphQL implementation. It helps highlight how resolvers work.
However, it's lacking in tracing which GraphQL queries are run, and we use custom spans for that.
How has it helped my organization?
Prior, the team only had Instana, and few people used it. The main barriers to entry were the access (since it was not integrated into our SSO) and the user experience, which made it hard to follow. We had an on-prem version, and it wasn't the snappiest. The APM has made observability and tracing more accessible to developers.
What is most valuable?
Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests. In our microservice environment with 170 services, there are complex transactions over the course of a single user request, since we essentially operate as a middle layer for the 90 back-office systems we integrate with.
What needs improvement?
I often have issues with the UI in my browser. I tend to have a lot of tabs open, yet have issues with it not responding or not showing data. A couple of times, pasting the URL into an incognito window shows the data that's there.
For how long have I used the solution?
I've used the solution for two years.
How was the initial setup?
The initial setup was complex and required a bit of tweaking to get everything configured correctly and into our pipelines.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
ITOPS and SRE Manager at Ticket
Good observability, available on the cloud, and capable of scaling
Pros and Cons
- "The observability on offer is the most useful aspect of the product."
- "The FinOps needs improvement."
What is our primary use case?
We primarily use the solution for observability.
How has it helped my organization?
The solution has helped with our POV phase.
What is most valuable?
The observability on offer is the most useful aspect of the product.
What needs improvement?
The FinOps needs improvement.
What do I think about the stability of the solution?
The stability is good.
What do I think about the scalability of the solution?
The scalability is good.
Which solution did I use previously and why did I switch?
We previously used AppDynamics and Dynatrace.
Which other solutions did I evaluate?
We also evaluated AppDynamics and Dynatrace.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Operations Manager at TodayTix
Good dashboards, easy troubleshooting, and integrations
Pros and Cons
- "The dashboards are super convenient to us for a more zoomed out view of what is going on with each integration that we utilize."
- "There could be more easily identifiable documentation on how to find different things on the platform."
What is our primary use case?
We utilize Datadog mainly to monitor our API integrations and all of the inventory that comes in from our API partners. Each event has its own ID, so we can trace all activity related to each event and troubleshoot where needed.
How has it helped my organization?
Datadog gives non-dev teams insights as to what all is happening with a particular event as well as flags any errors so that we can troubleshoot more efficiently.
What is most valuable?
The dashboards are super convenient to us for a more zoomed out view of what is going on with each integration that we utilize.
What needs improvement?
There could be more easily identifiable documentation on how to find different things on the platform. It can be overwhelming at first glance, and it's hard to find appropriate documentation on the site to lead you to where you need to be.
For how long have I used the solution?
I've used the solution for about 1.5 years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Sep 30, 2024
Software Engineering Manager at a hospitality company with 1,001-5,000 employees
Easy to implement with great passive and active monitoring
Pros and Cons
- "It is easy to implement and scale applications with standardized visibility, monitoring and alerting"
- "Datadog is so feature-rich that it is often hard to onboard new folks and tough to decide where to invest time."
What is our primary use case?
We primarily use the solution for application monitoring (APM, logs, metrics, alerts).
It's useful for active monitoring (static monitors, threshold monitors). We get a lot of value out of anomaly detection as well. SLOs and monitoring of SLOs have been another value add.
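The difference between a static threshold monitor and anomaly detection can be shown with a small sketch. This is not Datadog's algorithm: it swaps in a plain z-score test against recent history, and the function names, the limit of 150, and the sample values are assumptions made for the example.

```python
from statistics import mean, stdev


def threshold_breached(value, limit):
    """Static monitor: fire only when the value exceeds a fixed limit."""
    return value > limit


def is_anomaly(history, value, z_limit=3.0):
    """Anomaly check: fire when the value is far from the recent mean.

    Uses a simple z-score over the history, which is an illustrative
    stand-in and not the method any particular product uses.
    """
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit


if __name__ == "__main__":
    recent = [100, 102, 98, 101, 99]  # hypothetical request latencies in ms
    spike = 120
    print(threshold_breached(spike, 150), is_anomaly(recent, spike))
```

In this sample the spike stays under the fixed limit, so the static monitor is silent, while the z-score check flags it because it sits far outside the recent range.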
In terms of metrics, the out-of-the-box infrastructure metrics that come with the Datadog agent installation are great. We have made use of both the custom metrics implementation as well as the log-based metrics which are extremely convenient.
We also leverage Datadog for use of RUM and want to explore session replay.
How has it helped my organization?
It is easy to implement and scale applications with standardized visibility, monitoring, and alerting.
We get a lot of value out of passive and active monitoring. While different teams across our organization have used different services (metrics, logs, APM, RUM), almost all teams have been able to use the dashboards to report and track high-level metrics and active monitoring.
Active monitoring (static monitors, threshold monitors) is great. We get a lot of value out of anomaly detection as well. SLOs and monitoring of SLOs have been another value add for our organization.
What is most valuable?
The APM and tracing provide visibility and the ability to get right to root cause issues, while letting us quickly deploy new services without much need for custom instrumentation.
The active monitoring (static monitors, threshold monitors) has been very helpful. We get a lot of value out of anomaly detection. SLOs and monitoring of SLOs have been extremely valuable.
The metrics and out-of-the-box infrastructure metrics that come with the Datadog agent installation are quite helpful to the organization. We have made use of both the custom metric implementation as well as the log-based metrics which are extremely convenient.
What needs improvement?
Datadog is so feature-rich that it is often hard to onboard new folks and tough to decide where to invest time.
The APM is a perfect example of this. This feature alone has so much (profiling, tracing, span summary, flame graphs). I would love to see more of the insight and automation-focused features, such as the log patterns, where I can spend time more efficiently.
The cost of Datadog at scale can get very expensive very quickly. I would like to see a better usage/cost dashboard with breakdowns like the AWS cost explorer.
For how long have I used the solution?
I've used the solution for three years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.