reviewer2002896 - PeerSpot reviewer
VP at a financial services firm with 10,001+ employees
Real User
Oct 31, 2022
Good monitoring, dashboards, and flame graphs
Pros and Cons
  • "The most valuable aspect is the APM which can monitor the metrics and latencies."
  • "So far, the solution works very well and solves most of the problems we have."
  • "The correlation between the logs and the metrics needs improvement as, in most cases, we might use another logging tool (that is cheaper in cost) which we then have to link together."

What is our primary use case?

The product is used for APM solutions for the metrics and traces for the REST API requests and service maps to understand the upstream and downstream services.

We are creating dashboards and widgets to monitor the status. We are creating alerts and monitors as well. We integrated the alerts and ticketing system in our organization with SNOW and Netcool.

We are using Kubernetes, AWS, and infrastructure metrics. We are using Kafka and Aurora Postgres logs as well, and we are using HTTP status codes to identify the error types.

How has it helped my organization?

So far, the solution works very well and solves most of the problems we have. Currently, we are trying to integrate the trace ID into Datadog and correlate the logs and metrics. However, Datadog does not support Spring-generated trace IDs, and they are not shown in the Datadog UI. It works in reverse: Datadog injects the DD-specific trace ID into the application logs, and those logs can then be sent to other tools, for example, CloudWatch and Splunk.
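
To illustrate the reverse direction described above, here is a minimal sketch of grouping application logs by an injected trace ID once they have landed in another tool. It assumes the logs are JSON lines and that the injected field is named `dd.trace_id`; the sample records and the helper name `group_logs_by_trace` are hypothetical and not taken from the reviewer's setup.

```python
import json
from collections import defaultdict


def group_logs_by_trace(lines):
    """Group JSON log lines by their injected trace ID (hypothetical helper)."""
    grouped = defaultdict(list)
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        # Assumption: the tracer wrote the trace ID under "dd.trace_id".
        trace_id = record.get("dd.trace_id")
        if trace_id is not None:
            grouped[str(trace_id)].append(record.get("message", ""))
    return dict(grouped)


# Made-up sample log lines for illustration only.
sample_logs = [
    '{"dd.trace_id": "111", "dd.span_id": "1", "message": "GET /orders started"}',
    '{"dd.trace_id": "222", "dd.span_id": "7", "message": "GET /users started"}',
    '{"dd.trace_id": "111", "dd.span_id": "2", "message": "GET /orders returned 500"}',
    "plain text line without a trace ID",
]

correlated = group_logs_by_trace(sample_logs)
for trace_id, messages in correlated.items():
    print(trace_id, messages)
```

With the trace ID present in every log line, the same lookup could be done in whichever logging tool holds the logs by searching for that ID.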

What is most valuable?

The most valuable aspect is the APM which can monitor the metrics and latencies. There's a low error rate, and any alerts can be tagged to the service requests and sent via email to the required DLs. 

We can create incidents as well in our internal tools, like SNOW and Netcool.

The monitoring enables different dimensions of metrics to monitor the services and infrastructure. 

We have cloud infrastructure monitoring for Kubernetes nodes, pods, containers, and ingress metrics.

Alerts are sent to an email in case of any issues. The metrics are used to create alerts.

The solution offers good dashboards, service maps, traces and flame graphs, HTTP status codes, power packs, service catalogs, and profiling.

While the logs module is not activated, we are using all other modules.

What needs improvement?

The correlation between the logs and the metrics needs improvement as, in most cases, we might use another logging tool (that is cheaper in cost) which we then have to link together.

They can improve the SSO login as well. Currently, we have to log in every two to three days using a login link that is sent explicitly.

Buyer's Guide
Datadog
April 2026
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: April 2026.
886,468 professionals have used our research since 2012.

For how long have I used the solution?

I've been using the solution for two years. 

What do I think about the stability of the solution?

The stability is awesome. 

What do I think about the scalability of the solution?

We are expanding beyond observability right now.

How are customer service and support?

They offer pretty awesome customer support.

Which solution did I use previously and why did I switch?

We did not previously use a different solution.

How was the initial setup?

The initial setup was easy.

What about the implementation team?

We implemented the solution with the help of a vendor team.

What was our ROI?

I'd rate the ROI ten out of ten.

What's my experience with pricing, setup cost, and licensing?

I would recommend Datadog to others.

Which other solutions did I evaluate?

We also evaluated ECE and Splunk.

What other advice do I have?

The solution has a great support model.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2003214 - PeerSpot reviewer
Sr. Director of Software Engineering at a tech consulting company with 1,001-5,000 employees
Real User
Oct 31, 2022
Helpful support, good incident management, and helps triage faster
Pros and Cons
  • "The RUM solution has improved our ability to triage faster and hand more capabilities to our customer support."
  • "The pricing is a bit confusing."

What is our primary use case?

The RUM is implemented for customer support session replays to quickly route, triage, and troubleshoot support issues which can be sent to our engineering teams directly. 

Customer Support will log in directly after receiving a customer request and work on the issue. Engineers will utilize the replay along with RUM to pinpoint the issue combined with APM and Infra trace to be able to look for signals to find the direct cause of the customer impact. 

Incident management will be utilized to open a Jira ticket for engineering, and it integrates with ITSM systems and on-call as needed.

How has it helped my organization?

The RUM solution has improved our ability to triage faster and hand more capabilities to our customer support.

The RUM is implemented for customer support. It can quickly route, triage, and troubleshoot support issues that are sent to our engineering teams. 

Customer support can log in and start troubleshooting after receiving a customer request. The replay and RUM help pinpoint the issue. This functionality is combined with APM and Infra trace to be able to look for the cause of the issue. Incident management is leveraged to open a Jira ticket for engineering, and it can integrate with ITSM systems and on-call as needed.

What is most valuable?

RUM with session replay combined with a future use case to support synthetics will help to identify issues earlier in our process. We have not rolled this out yet but plan for it as a future use case for our customer support process. This, combined with integrated automation for incident management, will drive down our MTTR and time spent working through tickets. Overall, we are hoping to use this to look at our data and perfection rate over time in a BI-like way to reduce our customer support headcount by saving on time spent.

What needs improvement?

I would like to see retention options greater than 30 days for session replay. I'd also like to see forwarding options for retention to custom solutions, and a greater ability to event and export data from the tooling overall to BI/DW solutions for long-term reporting and to see trends as needed.

For how long have I used the solution?

I've used the solution for about nine months.

What do I think about the stability of the solution?

So far, stability has been great.

What do I think about the scalability of the solution?

I'd like to see more bells and whistles added over time. Widgets are coming soon to help with RUM.

How are customer service and support?

Support is very good. They are responsive and gave us the help we need.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We have used New Relic, but not for RUM. We went with Datadog to potentially switch the entire platform into an all-in-one solution that makes sense for a company of our size.

How was the initial setup?

We started on the beta, and the documentation was lagging behind. We also needed direct instructions and links from the customer support/account representative that were not immediately available by searching online.

What about the implementation team?

We implemented the solution ourselves.

What was our ROI?

Ideally, this will inform our strategy to not increase our customer support headcount as significantly into 2023 and beyond.

What's my experience with pricing, setup cost, and licensing?

The pricing is a bit confusing. However, the RUM session replay, in general, is very inexpensive compared to whole solutions.

Which other solutions did I evaluate?

We looked into LogRocket and New Relic.

What other advice do I have?

I'd advise other users to try it out.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2003286 - PeerSpot reviewer
Software engineer at a marketing services firm with 501-1,000 employees
Real User
Oct 31, 2022
Helps catch bugs, easy for non-technical users, and useful for tracking issues
Pros and Cons
  • "This spectrum of solutions has allowed us to track down bugs faster and more rapidly, which allows us to limit revenue lost during downtime."
  • "Datadog could make their use cases more visible either through their docs or tutorial videos."

What is our primary use case?

We use metrics to track the metrics of our application. We use logging to log any errors or erroneous application behavior as well as successful behavior. We use events to log successful steps in our pipeline or failed steps in our deployment. We use a combination of all these features to diagnose bugs. 

It makes it much more efficient to look at all the data in one place. This speeds up our development speed so that we can be agile.

How has it helped my organization?

This spectrum of solutions has allowed us to track down bugs faster and more rapidly, which allows us to limit revenue lost during downtime. 

It also allows us to accurately record and project current and future revenue by measuring the application's metrics. This way, my team can accurately and rapidly create reports for upper management that are easy to read and understand. 

Datadog is also easy to read by non-technical personnel. This way, if there are any erroneous readings, everybody has a chance to find them.

What is most valuable?

We use metrics to track the metrics of our application. We use logging to log any errors or erroneous application behavior as well as successful behavior. We use events to log successful steps in our pipeline or failed steps in our deployment. 

We use a combination of all these features to diagnose bugs. It makes it much more efficient to look at all the data in one place. This speeds up our development speed so that we can be agile.

These features are the features that I use the most since it is incredibly difficult to track down intermittent bugs if I were to look directly under the hood in a CLI.

What needs improvement?

Datadog could make their use cases more visible either through their docs or tutorial videos. There are different implementations of certain features that we utilize to customize Datadog functionality and in that way, we sometimes get results that are not conducive to what Datadog thinks their features' use cases are.

For how long have I used the solution?

I've used the solution for at least one year.

Which solution did I use previously and why did I switch?

We have only used Datadog. We did not previously use a different product.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2003355 - PeerSpot reviewer
DevOps Engineer at a printing company with 51-200 employees
Real User
Oct 31, 2022
Great visibility, good logs, and a helpful dashboard
Pros and Cons
  • "For us to have visibility into our app stack and the hardware we run has been highly beneficial."
  • "Our ROI with Datadog has been very high."
  • "I want to applaud the efforts in making the UI extremely usable and approachable. My suggestion would be to take another look at how the menu structure is put together, however. Even after using the platform mostly every day for months, I still find myself trying to find a service or feature in the menus."
  • "My suggestion would be to take another look at how the menu structure is put together, however. Even after using the platform mostly every day for months, I still find myself trying to find a service or feature in the menus."

What is our primary use case?

Log aggregation was a key component for us since we have a fairly old-school app running on VMs on bare metal. We previously didn't have much insight into our logs unless we manually tunneled into each server.

The solution is reducing manual labor in troubleshooting problems in our environments server by server.

We also needed to monitor our Java app and MySQL database to understand their problems so that we could take action and resolve them.

Our use cases have since expanded to encompass all aspects of monitoring.

How has it helped my organization?

Before Datadog, all we had to go on was the gut reaction of the old guard on our team. While useful, the reactions and inherent knowledge only benefited a few folks.

Datadog has allowed us to create comprehensive dashboards and proactively send out alerts. We used the knowledge of people very versed with our products to help set up the platform and have since benefited from that.

The operative word here is visibility, and we've seen a huge improvement in that.

What is most valuable?

Seeing log trends and patterns and aggregate search was a huge first step for us. We then began using other features of the Datadog platform by enabling APM. After that, we did other integrations.

For us to have visibility into our app stack and the hardware we run has been highly beneficial.

We leverage APM, log management, and at least ten other integrations. Our DB, web servers, network, storage, and other areas are now monitored and hooked up to dashboards.

Dashboarding has also proven useful when information is going to be viewed by anyone in the organization.

What needs improvement?

Our experience has been overwhelmingly positive so far. That said, there is one area that could benefit from some polish. For example, I want to applaud the efforts in making the UI extremely usable and approachable. My suggestion would be to take another look at how the menu structure is put together, however. Even after using the platform mostly every day for months, I still find myself trying to find a service or feature in the menus.

For how long have I used the solution?

I've used the solution for around six or eight months. We've had the Datadog agents deployed on our various environments.

What do I think about the stability of the solution?

So far, we have not had any issues with stability. It should be very stable and easy to update.

What do I think about the scalability of the solution?

The solution is currently deployed on a limited scale. That said, we see the potential and benefits of deploying this in a cloud scenario.

How are customer service and support?

Customer service and the support teams have been very responsive when we need them. They are very professional.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

This was our first solution in this space.

How was the initial setup?

The initial setup steps with the agent are only confusing when using the config files for the first time. The main file includes a lot that you can specify elsewhere and it's not readily apparent which one to use until you dig in more.

What about the implementation team?

We did an in-house implementation.

What was our ROI?

Our ROI with Datadog has been very high. It's given us the ability to see how we're performing, which we didn't have before.

What's my experience with pricing, setup cost, and licensing?

Ensure you have your ingestion pipelines dialed in, or you'll likely spend more than you were expecting.

Which other solutions did I evaluate?

We evaluated free and open-source options. Ultimately, however, we decided that we didn't have the manpower as a small company to maintain them.

What other advice do I have?

There is nothing that the documentation cannot help with; it's very good.

Which deployment model are you using for this solution?

Private Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2004213 - PeerSpot reviewer
Software engineer at a construction company with 1,001-5,000 employees
Real User
Oct 31, 2022
Good monitoring, custom tracking, and customizable dashboards
Pros and Cons
  • "The solution has helped our organization with custom events to track specific cases."
  • "Knowing the entry point helps us choose which part of the program should be improved next, and it also helps us with collecting important data about the overall usage of each module within our application."
  • "We need to learn more about the session replay feature inside of DD."

What is our primary use case?

We use the solution for monitoring time spent on views and events triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events and multiple entry points in the same part of the application.

Knowing the entry point helps us choose which part of the program should be improved next. It also helps us with collecting important data about the overall usage of each module within our application. 

How has it helped my organization?

The solution has helped our organization with custom events to track specific cases.

It's helped with monitoring time spent on views and events triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events as well as multiple entry points into the same part of the application.

Knowing the entry point helps us choose which part of the program should be improved. It's collecting important data about the overall usage of each module within our application. 

What is most valuable?

The most valuable feature is the custom events to track specific cases.

Monitoring time spent on views and events can be triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events and multiple entry points in the same part of the application.

Knowing the entry point helps us decide on improvements. We can collect important data about the overall usage of each module within our application. 

What needs improvement?

We look forward to the next features from Datadog. We need to learn more about the session replay feature inside of DD.

For how long have I used the solution?

I've used the solution for two years.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2004210 - PeerSpot reviewer
Cloud Specialist at a financial services firm with 501-1,000 employees
Real User
Oct 31, 2022
Centralized with good observability and many modules
Pros and Cons
  • "The most valuable aspect is for us to have everything in one place."
  • "The most valuable aspect for us is to have everything in the same place."
  • "We need a lot of modules since we collect all data logs from all operating systems."

What is our primary use case?

We collect all data logs from all operating systems, such as Windows, Linux, VMware, and bare metal data centers. We also automate the installation of the agent on servers.

Now we are starting a POC to analyze the APM module. In the future, the next step is to do a POC of the security modules.

The final idea is to have a unique portal for observability. This will make troubleshooting easy at levels 1 and 2.

How has it helped my organization?

We are looking into a lot of modules. We collect all data logs from all operating systems, including Windows, Linux, VMware, and bare metal data centers. We also automate the installation of the agent on servers.

We're developing POCs for APM and security modules. We'll also have a unique portal for observability. This will make it easy to troubleshoot. 

The most valuable aspect is for us to have everything in one place.

What is most valuable?

We're investigating many modules. We collect all data logs from all operating systems (Windows, Linux, VMware, and bare metal data centers). We also automate the installation of the agent on servers.

We're doing POCs in APM and security. 

Soon, we'll have a unique portal for observability. This will make troubleshooting easy at levels 1 and 2. 

The most valuable aspect for us is to have everything in the same place.

What needs improvement?

We need a lot of modules since we collect all data logs from all operating systems. 

The most important module for us is log management. The second is the security module. The third one is the APM.

For how long have I used the solution?

We've used the solution for one year.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2000463 - PeerSpot reviewer
Technical Lead at a wholesaler/distributor with 1,001-5,000 employees
Real User
Oct 31, 2022
Great dashboards, easy to tweak, and showcases helpful metrics
Pros and Cons
  • "The ease of correcting these dashboards and widgets when needed is amazing."
  • "Using Datadog metrics has helped the organization a lot in many manners."
  • "The parallel editing of the dashboards should not cause users to lose the work of another person."
  • "The only issue I face is when more than one person is editing these dashboards simultaneously, one or the other person sometimes loses his/her work."

What is our primary use case?

We use Datadog for observability and monitoring primarily. Various cross-functional teams have built various dashboards, including Developers, QA, DevOps, and SRE. 

There are also some dashboards created for senior leadership to keep tabs on day-to-day activities like cost, scale, issues, etc.

Also, we've set up monitors and alarms that kick off when any metrics go beyond the threshold. With Slack and PagerDuty integration, correct team members get alerted and react to solve the issue based on various runbooks.

How has it helped my organization?

Using Datadog metrics has helped the organization a lot in many manners. With one centralized monitoring place, it's a lot less effort to keep track of the system and applications' health. 

Using this also helps teams be proactive in dealing with any issues before they get escalated by customers. 

Lastly, having so many integrations makes the DevOps and SRE's lives a lot easier when automating the detection and resolution of any issues hidden in the system or applications. Overall, it has helped a lot.

What is most valuable?

My favorite feature is creating dashboards as that empowers me to sleep calmly at night and not to keep watch on critical system metrics. Be it DB metrics or computer-related metrics, it's always easy to view them. 

The ease of correcting these dashboards and widgets when needed is amazing. 

The only issue I face is when more than one person is editing these dashboards simultaneously, one or the other person sometimes loses his/her work. That said, they will resolve that soon. With the variety of widgets, it's so easy to plot the data in a timely manner, and that makes monitoring a lot easier.

What needs improvement?

The solution can be improved in a few areas. 

The parallel editing of the dashboards should not cause users to lose the work of another person. 

Secondly, we would like to see more demos of tools that are in beta when they go live. I am sure they will help us a lot.

For how long have I used the solution?

I've been using the solution for slightly over two years.

What do I think about the stability of the solution?

I find the solution to be very stable.

What do I think about the scalability of the solution?

I totally love it. It is scalable. 

Which solution did I use previously and why did I switch?

We previously used Sumo Logic.

How was the initial setup?

The initial setup is not so difficult.

What about the implementation team?

We implemented the solution in-house.

What was our ROI?

The ROI is very fair so far.

What's my experience with pricing, setup cost, and licensing?

I can't recommend the licensing.

Which other solutions did I evaluate?

I was not involved in any pre-evaluation process.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2000472 - PeerSpot reviewer
Security Engineering Manager at a computer software company with 11-50 employees
Real User
Oct 31, 2022
Democratizes observability, great log searchability, and intuitive UI
Pros and Cons
  • "I find the greatest feature is being able to search across logs from various microservices."
  • "The greatest impact it has had is on the ability to democratize observability and put monitoring into the hands of the people."
  • "One area where I was really looking for improvement was the CSPM product line. I had really wanted to have team-level visibility for findings, since the team managing the resources has much more context and ability to resolve the issue, as the service owner. However, this has been added to the announcement in a recent keynote."
  • "One area where I was really looking for improvement was the CSPM product line."

What is our primary use case?

I use the solution to manage security-related logs and metrics, as well as create detection rules for security events. I am a security engineer, so one area of interest is the CSPM product, giving us the ability to look at findings across the cloud environment. 

The great part about the Datadog security products is that they incorporate the context of the resources/hosts where the security event is found. This allows us to see exactly what is running on a host that we see as a security alert.

How has it helped my organization?

The greatest impact it has had is on the ability to democratize observability and put monitoring into the hands of the people. Teams can quickly get the information they need, without needing a bunch of training, since the UI is super intuitive and easy for beginners. This helps reduce time to resolution during incidents and gives context to developers quickly and easily. Context is really important since seconds matter when the ship is down, and you don't know why.

What is most valuable?

I find the greatest feature is being able to search across logs from various microservices. As a member of the security team, I find that I often need visibility into other teams' services in order to get a good picture of our security posture.

I also am a fan of the ability to easily create monitors and get alerts into Slack quickly, without too much overhead. For example, I often need to create monitors where I am not too sure where the baseline lies. Having the ability to create anomaly monitors makes this process much more straightforward. Anomaly monitors are great for a security team.
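
The idea of alerting without a known baseline can be sketched in a few lines. The example below is a simplified rolling mean and standard deviation check, not Datadog's anomaly detection algorithm; the function name `find_anomalies`, the window size, the threshold, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indexes whose value deviates strongly from the preceding window.

    Simplified illustration only: each point is compared against the mean and
    standard deviation of the `window` points before it.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Made-up counts of failed logins per minute.
failed_logins = [10, 12, 11, 9, 10, 11, 10, 48, 11, 10]
flagged = find_anomalies(failed_logins)
print(flagged)
```

No fixed threshold such as "more than 40 failed logins" is needed here, because the expected range is derived from recent history. That is the property that makes this style of monitor useful when the baseline is unknown.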

What needs improvement?

One area where I was really looking for improvement was the CSPM product line. I had really wanted to have team-level visibility for findings, since the team managing the resources has much more context and ability to resolve the issue, as the service owner. However, this has been added to the announcement in a recent keynote. 

For how long have I used the solution?

Personally, I've used it my entire time employed here, more than three years.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user