Reviewer 76 - PeerSpot reviewer
Vice President of SaaS Infrastructure at a tech services company with 51-200 employees
User
Enhances efficiency with robust alerting and visualization tools
Pros and Cons
  • "The real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance."
  • "The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs."

What is our primary use case?

Our primary use case for Datadog is to monitor and manage our fully cloud-native infrastructure. We use Datadog to gain real-time visibility into our cloud environments, ensuring that all our services are running smoothly and efficiently.

The platform’s extensive integration capabilities allow us to seamlessly track performance metrics across various cloud services, containers, and microservices. 

With Datadog’s robust alerting and visualization tools, we can proactively identify and resolve issues, minimizing downtime and optimizing our system’s performance. This has been crucial in maintaining the reliability and scalability of our cloud-native applications.

How has it helped my organization?

Datadog has significantly enhanced our organization’s operational efficiency and reliability. By providing real-time visibility into our cloud-native infrastructure, Datadog enables us to monitor performance metrics, detect anomalies, and resolve issues swiftly. 

The platform’s robust alerting system ensures that potential problems are addressed before they impact our services, reducing downtime and improving overall system stability. Additionally, Datadog’s comprehensive dashboards and reporting tools have streamlined our troubleshooting processes and facilitated better decision-making.

What is most valuable?

The most valuable feature of Datadog for our organization has been its real-time monitoring capabilities. This feature provides us with instant visibility into our cloud-native infrastructure, allowing us to track performance metrics and detect anomalies as they occur. The ability to monitor our systems in real-time means we can quickly identify and address issues before they escalate, minimizing downtime and ensuring the reliability of our services. 

Additionally, the real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance.

What needs improvement?

While Datadog has been instrumental in enhancing our operational efficiency, there are areas where it could be improved. 

One area is the user interface, which could be more intuitive and user-friendly, especially for new users. 

Additionally, the pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs. 

For future releases, it would be beneficial to include more advanced machine learning capabilities for predictive analytics, helping us anticipate issues before they occur. 

More third-party tools would also be valuable additions.

Buyer's Guide
Datadog
March 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: March 2025.
839,422 professionals have used our research since 2012.

For how long have I used the solution?

I've used the solution for six years.

What do I think about the stability of the solution?

Datadog has proven to be a highly stable solution for our monitoring needs. Throughout our usage, we have experienced minimal downtime and consistent performance, even during peak traffic periods. The platform’s reliability ensures that we can continuously monitor our cloud-native infrastructure without interruptions, which is crucial for maintaining the health and performance of our services.

What do I think about the scalability of the solution?

Datadog’s scalability has been impressive and instrumental in supporting our growing cloud-native infrastructure. The platform effortlessly handles increased workloads and scales alongside our expanding services without compromising performance. Its ability to integrate with a wide range of cloud services and technologies ensures that as we grow, Datadog continues to provide comprehensive monitoring and insights.

How are customer service and support?

Our experience with Datadog’s customer service and support has been exceptional. The support team is highly responsive and knowledgeable, providing timely assistance whenever we’ve encountered issues or had questions. 

Their proactive approach to offering solutions and guidance has been invaluable in helping us maximize the platform’s capabilities.

How would you rate customer service and support?

Positive

How was the initial setup?

The setup is straightforward.

What about the implementation team?

We handled the setup in-house.

What's my experience with pricing, setup cost, and licensing?

The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs.

What other advice do I have?

One area is the user interface, which could be more intuitive and user-friendly, especially for new users.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Hoon Kang - PeerSpot reviewer
Full Stack Engineer at K HEALTH, INC
User
Top 20
Good alerting and issue detection for many valuable features
Pros and Cons
  • "Thanks to frequent concurrent deployments, the Datadog alert monitors allow us to quickly detect issues if anything occurs."
  • "The monitors can be improved."

What is our primary use case?

Our company has a microservice architecture, with different teams in charge of different services. It is also a startup, which means that we have to build and move very fast. Before we were using Datadog properly, things often broke without much information about where in our system the failure happened. This was quite a big time sink: teams were unfamiliar with other teams' code, so they needed help from those teams to debug, which slowed our building down a lot. Implementing Datadog traces fixed this.

What is most valuable?

Datadog has many features, but the most valuable have become our primary uses.

Also, thanks to frequent concurrent deployments, the Datadog alert monitors allow us to quickly detect issues if anything occurs.

What needs improvement?

The monitors can be improved. The chart in the monitors only goes back a couple of hours, which feels clunky. The monitors could also provide more information, such as traces. We have many alerts connected to different notification systems, such as Slack and Opsgenie.

When the on-call engineer receives notifications fired by the alerts, we are taken to the monitors. Yet we often have to open many different tabs to see logs, traces, and other information that is not accessible from the monitors. I think it would make every on-call engineer's life easier if the monitor had more data.

For how long have I used the solution?

We've used the solution for three years.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2002326 - PeerSpot reviewer
API Developer at a tech services company with 501-1,000 employees
Real User
Good monitoring, logging, and alert features
Pros and Cons
  • "Thanks to the logs, we manage to make better reports through Jira and also to trace the request with more facility than we would be able to do otherwise."
  • "When the logs are too big, and Datadog splits them, the JSON format breaks and it is not so useful for us."

What is our primary use case?

We use the solution for monitoring, logging, and alerts. 

Thanks to Datadog, we report errors using the logger integrated into our services, which is crucial since we only do unit tests. The infrastructure team handles the monitoring part, so I can't give more insights about that. I am an API developer, so I use Datadog mainly for logging.

The alerts are connected to Microsoft Teams in a specific channel, and we pay a lot of attention to it, and we usually create tickets based on these alerts.

How has it helped my organization?

Thanks to the logs, we manage to make better reports through Jira and also to trace the request with more facility than we would be able to do otherwise. 

Since there are many teams in my company, the fact that we can share the trace of an error, for example, together with all the information about the log, we are able to save a lot of time when it comes to communication between everyone.

What is most valuable?

The most valuable feature for me so far is logging. We do not do integration tests, so we rely a lot on tracing all the requests and we report errors to different teams in the company together with logs that we take from Datadog.

Since I am an API developer, I do not use the other features much. Also, I have been in the company for only four months. I have only worked with monitors and alerts.

I value tracing the request and being able to tell other teams which component, service, or line of code has an issue.

What needs improvement?

Since I have only been in the organization for four months, I only worked with the log, alerts, and monitoring. I do not have so many insights to share about what can be improved.

I am not an expert user, and not even an intermediate user yet. Rather, I am a beginner.

That said, when the logs are too big and Datadog splits them, the JSON format breaks and the logs are not so useful for us.
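
One possible workaround on the application side is to keep each structured log line under a size limit before it is shipped, so that it never needs to be split. The following is a minimal Python sketch under stated assumptions: the function name `bounded_json_line`, the `truncated` flag, and the `max_bytes` value are hypothetical, and the real size limit should be taken from the Datadog documentation rather than from this example.

```python
import json


def bounded_json_line(record: dict, max_bytes: int, field: str = "message") -> str:
    """Serialize `record` as one JSON line, shrinking `field` until it fits.

    `max_bytes` is a placeholder supplied by the caller, not a documented limit.
    """
    line = json.dumps(record, ensure_ascii=False)
    if len(line.encode("utf-8")) <= max_bytes:
        return line  # already small enough, emit unchanged

    trimmed = dict(record)
    trimmed["truncated"] = True  # hypothetical marker so readers know data was cut
    text = str(trimmed.get(field, ""))

    # Halve the oversized field until the whole line fits within the limit.
    while text:
        trimmed[field] = text
        line = json.dumps(trimmed, ensure_ascii=False)
        if len(line.encode("utf-8")) <= max_bytes:
            return line
        text = text[: len(text) // 2]

    # Field fully removed; the line may still exceed the limit if other
    # fields are large, but it is always valid JSON.
    trimmed[field] = ""
    return json.dumps(trimmed, ensure_ascii=False)


if __name__ == "__main__":
    big = {"level": "error", "message": "x" * 1000}
    print(bounded_json_line(big, max_bytes=200))
```

The trade-off is that part of the message is lost, but every emitted line remains parseable JSON, which keeps the log searchable and attachable to a ticket.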

For how long have I used the solution?

I've used the solution for four months.

Which solution did I use previously and why did I switch?

I did not previously use a different solution.

What's my experience with pricing, setup cost, and licensing?

As an API developer, I have no idea about costs, but I am curious and will find out.

Which other solutions did I evaluate?

I did not evaluate other options previously.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer1318287 - PeerSpot reviewer
IT Test Manager at a transportation company with 10,001+ employees
Real User
Very good documentation provided along with regular new features
Pros and Cons
  • "Datadog is constantly adding new features."
  • "Lacks some flexibility in the customization."

What is our primary use case?

Our primary use case is log management and we also use the solution for monitoring the application and underlying infrastructure. I'm an IT test manager. 

What is most valuable?

I appreciate that they are constantly adding new features, some of which we haven't yet had a chance to implement. 

What needs improvement?

I'd like to see more flexibility in the customization. A few settings need to be changed, but we are unable to make those changes as users or as the administrator. The tagging needed to interconnect the different parts of the monitoring is a bit tricky and takes time to work out.

For how long have I used the solution?

I've been using this solution for 18 months. 

What do I think about the stability of the solution?

The stability is good. 

What do I think about the scalability of the solution?

I would say that the amount that we are monitoring is not that large and we've never had any scalability issues. We have around 50 users in our department. 

How are customer service and support?

The availability or accessibility to customer service is not always good, although they generally provide solutions once you do manage to get hold of them. 

Which solution did I use previously and why did I switch?

We have previously used different tools for different parts of the monitoring. We changed to AWS when we moved to the cloud. We also found that the effort in maintaining Grafana and Prometheus and keeping it up to date was taking too much time.

How was the initial setup?

The initial setup was straightforward, we used a service provider and they also maintain our operation in general.

What's my experience with pricing, setup cost, and licensing?

We have a four-year contract with Datadog, and the solution is pay-as-you-use. 

What other advice do I have?

I would suggest using the documentation, which is quite good. It's best to start with existing integrations, and then do the customization step-by-step.

I rate this solution eight out of 10. 

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Software Engineer at Sony Corporation of America
Real User
It is very easy to use and configure. It has a nice UI.
Pros and Cons
  • "If we have a large load for users using our basic Datadog, it will immediately fire off an alert notifying us either something's wrong or not."
  • "It has a nice UI."
  • "We have asked technical support questions, and sometimes they don't get back to us right away. Or when they do, it is not the right answer."

What is our primary use case?

If our app is up and running, we use it to monitor how many credits the app is using up on each node. We also monitor services by how long each call is taking with the help of EC2s off of application.

How has it helped my organization?

If we have a large load for users using our basic Datadog, it will immediately fire off an alert notifying us either something's wrong or not. It provides us insights on our calls to other services, such as how long each call is taking and what is the whole stack trace.

What is most valuable?

  • It is very easy to use.
  • It is easy to configure.
  • It has a nice UI.
  • Datadog provides everything that we need.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

Stability is great. It has not come down. It is always up.

We do not put a lot of stress on it. We use it for monitoring our app, and it's a pretty great product.

What do I think about the scalability of the solution?

We have an application in AWS running four nodes. It is not too large. Our user base is about 2000 users.

How are customer service and technical support?

We have asked technical support questions, and sometimes they don't get back to us right away. Or when they do, it is not the right answer. 

Which solution did I use previously and why did I switch?

Before Datadog, we had APM monitoring, which is something similar, but it wasn't as nice to use or as easy to configure.

How was the initial setup?

It is easy to configure. You load the Datadog agent into the EC2 instance, then you just follow it. 

Which other solutions did I evaluate?

I did not participate in the evaluation of the other products.

What other advice do I have?

If you are monitoring the metrics and insights in your application, and need help monitoring, then this is a great application to look into. The app is always available. It has a clean UI and provides the metrics that you will need. It is a good product.

Right now, we are only using it on this one application.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer0962486 - PeerSpot reviewer
Head of Product Design at hackajob
User
Good alerts and detailed data but needs UI improvements
Pros and Cons
  • "Session recordings have been the most valuable to me as it helps me gain insights into user behaviour at scale."
  • "In terms of UI, everything is very small, which makes it quite difficult to navigate at times."

What is our primary use case?

I work in product design, and although we use Datadog for monitoring, etc, my use case is different as I mostly review and watch session recordings from users to gain insight into user feedback.

We watch multiple sessions per week to understand how users are using our product. From this data, we are able to home in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.

How has it helped my organization?

Using Datadog has allowed us to review detailed interactions of users at a scale that leads us to make informed data-driven UX improvements as mentioned above.

Being able to pinpoint specific users via filtering is also very useful as it means when we have direct feedback from a specific user, we can follow up by watching their session back. 

The engineering team's use case for Datadog is alerting, which is also very useful for us as it gives us visibility into how stable our platform is through various lenses.

What is most valuable?

Session recordings have been the most valuable to me as it helps me gain insights into user behaviour at scale. By capturing real-time interactions, such as clicks, scrolls, and navigation paths, we can identify patterns and trends across a large user base. This helps us pinpoint usability issues, optimize the user experience, and improve the overall experience for our users. Analyzing these recordings enables us to make data-driven decisions that enhance both functionality and user satisfaction.

What needs improvement?

I'd like the ability to see more in-depth actions on user sessions, such as where specific problems occur. Rather than having to watch numerous session recordings to understand where this happens, I'd like to get alerts or notifications about specific areas where users are struggling, such as rage clicks.

In terms of UI, everything is very small, which makes it quite difficult to navigate at times, especially in terms of accessibility, so I'd love for there to be more attention on this.

For how long have I used the solution?

I've used the solution for over one year.

Which solution did I use previously and why did I switch?

We did not evaluate other options. 

What's my experience with pricing, setup cost, and licensing?

I wasn't part of the decision-making process during licensing.

Which other solutions did I evaluate?

I wasn't part of the decision-making process during the evaluation stage.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2002896 - PeerSpot reviewer
VP at a financial services firm with 10,001+ employees
Real User
Good monitoring, dashboards, and flame graphs
Pros and Cons
  • "The most valuable aspect is the APM which can monitor the metrics and latencies."
  • "The correlation between the logs and the metrics needs improvement as, in most cases, we might use another logging tool (that is cheaper in cost) which we then have to link together."

What is our primary use case?

The product is used for APM solutions for the metrics and traces for the REST API requests and service maps to understand the upstream and downstream services.

We are creating dashboards and widgets to monitor the status. We are creating alerts and monitors as well. We integrated the alerts and ticketing system in our organization with SNOW and Netcool.

We are using Kubernetes, AWS, and infrastructure metrics. We are using Kafka and Aurora Postgres logs as well, and we are using HTTP status codes to identify the error types.
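
As an illustration of identifying error types from HTTP status codes, here is a minimal Python sketch. The category names and the helper functions `error_type` and `summarize` are hypothetical and are not part of Datadog; they only show the general idea of bucketing responses by status class.

```python
from collections import Counter


def error_type(status: int) -> str:
    """Map an HTTP status code to a coarse error category."""
    if 400 <= status <= 499:
        return "client_error"  # the caller sent a bad or unauthorized request
    if 500 <= status <= 599:
        return "server_error"  # the service itself failed
    return "no_error"  # 1xx, 2xx, and 3xx responses


def summarize(statuses):
    """Count how many responses fall into each category."""
    return Counter(error_type(s) for s in statuses)


if __name__ == "__main__":
    print(summarize([200, 404, 500, 503, 301]))
```

Separating client errors from server errors in this way makes it easier to decide whether an alert should go to the team that owns the service or to the team that calls it.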

How has it helped my organization?

So far, the solution works very well and solves most of the problems we have. Currently, we are trying to integrate the trace ID into Datadog and correlate the logs and metrics. However, Datadog does not support the Spring-generated trace IDs, and they are not shown in the Datadog UI. It works in reverse: Datadog injects the Datadog-specific trace ID into the application logs, and those logs can then be read in other tools, for example, CloudWatch and Splunk.
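
The general pattern of injecting a trace ID into application logs can be sketched in plain Python. This is an illustration under stated assumptions, not the Datadog tracer's actual mechanism: the context variable `current_trace_id`, the `trace_id` field name, and the logger name `orders-api` are all hypothetical, and only the standard library is used.

```python
import contextvars
import io
import json
import logging

# Hypothetical holder for the active trace ID; a real tracer manages this itself.
current_trace_id = contextvars.ContextVar("current_trace_id", default="0")


class TraceIdFilter(logging.Filter):
    """Attach the active trace ID to every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only enrich it


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object including the trace ID."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", "0"),
        })


def build_logger(stream):
    logger = logging.getLogger("orders-api")  # hypothetical service name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(TraceIdFilter())
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    out = io.StringIO()
    log = build_logger(out)
    current_trace_id.set("abc123")
    log.info("payment failed")
    print(out.getvalue())
```

Because the ID travels inside the log line itself, any tool that stores the logs can be searched by that ID, which is what makes linking a separate logging tool back to the traces possible.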

What is most valuable?

The most valuable aspect is the APM which can monitor the metrics and latencies. There's a low error rate, and any alerts can be tagged to the service requests and sent via email to the required DLs. 

We can create incidents as well in our internal tools, like SNOW and Netcool.

The monitoring enables different dimensions of metrics to monitor the services and infrastructure. 

We have cloud infrastructure monitoring for Kubernetes nodes, pods, containers, and ingress metrics.

Alerts are sent to an email in case of any issues. The metrics are used to create alerts.

The solution offers good dashboards, service maps, traces and flame graphs, HTTP status codes, power packs, service catalogs, and profiling.

While the logs module is not activated, we are using all other modules.

What needs improvement?

The correlation between the logs and the metrics needs improvement as, in most cases, we might use another logging tool (that is cheaper in cost) which we then have to link together.

They can improve the SSO logging as well. Currently, we are logging in every two to three days by sending the login link explicitly.

For how long have I used the solution?

I've been using the solution for two years. 

What do I think about the stability of the solution?

The stability is awesome. 

What do I think about the scalability of the solution?

We are expanding beyond observability right now.

How are customer service and support?

They offer pretty awesome customer support.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We did not previously use a different solution.

How was the initial setup?

The initial setup was easy.

What about the implementation team?

We implemented the solution with the help of a vendor team.

What was our ROI?

I'd rate the ROI ten out of ten.

What's my experience with pricing, setup cost, and licensing?

I would recommend Datadog to others.

Which other solutions did I evaluate?

We also evaluated ECE and Splunk.

What other advice do I have?

The solution has a great support model.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2000448 - PeerSpot reviewer
Senior Manager at a manufacturing company with 10,001+ employees
Real User
Great network monitoring, testing, and integration tools
Pros and Cons
  • "The visibility into our network has allowed for quick diagnosis of failures, identification of underutilized or over-utilized resources, and allowed for cloud cost optimization opportunities."
  • "I would love to see more metrics or analytics in IoT devices."

What is our primary use case?

This solution is for physical device monitoring across breweries, including PLCs, HMI Cameras, RFID panels, scales, etc. We want to gain visibility into these devices to influence predictive maintenance and unscheduled downtime. We want to monitor physical devices across the zone from a control tower perspective for end users and support teams alike. Understanding more about the performance of the devices and mechanical components will allow us to schedule downtime to fix imminent catastrophic failures and prevent unplanned downtime and lost revenue.

How has it helped my organization?

Previously, we had no visibility into the architectural layout of our infrastructure. The UI of Datadog has allowed for increased visibility and access to broken or underperforming resources or critical pieces of infrastructure. Beyond this, it has allowed us to identify areas where we can optimize cost in our cloud infrastructure.

What is most valuable?

The most valuable features I have found are network monitoring, testing, and integration tools. The visibility into our network has allowed for quick diagnosis of failures, identification of underutilized or over-utilized resources, and allowed for cloud cost optimization opportunities. The ability to correlate metrics has proven useful in determining downstream or upstream issues influencing the device, machine, or database having issues.

What needs improvement?

I would love to see more metrics or analytics in IoT devices. 

For how long have I used the solution?

I've been using the solution for approximately two years.

What do I think about the stability of the solution?

I have never experienced an issue or outage.

What do I think about the scalability of the solution?

The solution is very scalable and developed in a fashion that provides the ability to scale easily.

How are customer service and support?

Customer service has been outstanding. They have been timely and knowledgeable with all of my questions.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used a different product for the total stack solution.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

We handled the setup process in-house.

What was our ROI?

I'm unsure whether we've seen an ROI.

Which other solutions did I evaluate?

We did evaluate SolarWinds.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.