it_user147573 - PeerSpot reviewer
CTO with 51-200 employees
Vendor
We can build dashboards as fast as we roll out new systems, which can be fast.
Pros and Cons
  • "The most valuable features have been sharable dashboards, TimeBoards, the dogstatsd API, Slack integration, the event logging API, CloudTrail events, tags, alerts, and anomaly detection, plus EBS Volume Snapshot Age, which they added upon request."
  • "More granular control over dashboard sharing. Timeboard sharing."

How has it helped my organization?

We can build dashboards as fast as we roll out new systems, which can be fast.

We use standard and custom metrics for every new system we roll out for 360 degree visibility into our systems.

What is most valuable?

The most valuable features have been sharable dashboards, TimeBoards, the dogstatsd API, Slack integration, the event logging API, CloudTrail events, tags, alerts, and anomaly detection, plus EBS Volume Snapshot Age, which they added upon request. We used the PagerDuty integration for a while as well.
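For readers unfamiliar with the dogstatsd API mentioned above, here is a minimal sketch of how a custom metric could be submitted to a local agent over UDP. It assumes the standard DogStatsD datagram layout (`name:value|type|@rate|#tags`) and the default agent port 8125. The metric name and tags are hypothetical examples, not values taken from this review.

```python
import socket


def format_metric(name, value, metric_type="g", tags=None, sample_rate=None):
    """Build a DogStatsD datagram: name:value|type[|@rate][|#tag1,tag2]."""
    parts = [f"{name}:{value}", metric_type]
    if sample_rate is not None:
        parts.append(f"@{sample_rate}")
    if tags:
        parts.append("#" + ",".join(tags))
    return "|".join(parts)


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (fire and forget; assumes a local agent)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical gauge: age of an EBS snapshot in hours, tagged by volume.
    line = format_metric(
        "ebs.snapshot.age_hours", 36, "g", tags=["volume:vol-example", "env:prod"]
    )
    print(line)
    send_metric(line)
```

In practice an official client library would usually handle buffering and sampling; the raw datagram is shown only to make the wire format visible.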

What needs improvement?

More granular control over dashboard sharing. Timeboard sharing.


What do I think about the stability of the solution?

There are infrequent hiccups, which have been decreasing over the time we have used it.

Buyer's Guide
Datadog
March 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: March 2025.
839,422 professionals have used our research since 2012.

What do I think about the scalability of the solution?

No.

How are customer service and support?

Customer Service:

Never seen better. Questions are usually answered almost immediately, even on weekends, and in-stream with your event stream.

Technical Support:

High.

Overall they have always had an amazing team, and quality has been maintained as the company has grown.

Which solution did I use previously and why did I switch?

Complementary to other tools we used.

How was the initial setup?

Setup is generally easy. They provide a large number of integrations; some are more complex than others, which is to be expected.

What about the implementation team?

In house implementation.

What was our ROI?

We didn’t calculate explicitly, but as we used the product to track down underutilized instances, it more than paid for itself in the first month.
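As an illustration of the kind of check described above, the sketch below flags instances whose average CPU utilization falls under a threshold. The instance names, the sample values, and the 10% threshold are all assumptions made for the example; the review does not say which metric or cutoff was actually used.

```python
from statistics import mean


def underutilized(cpu_by_instance, threshold_pct=10.0):
    """Return instance names whose mean CPU % is below threshold_pct.

    Instances with no samples are skipped rather than flagged.
    """
    return sorted(
        name
        for name, samples in cpu_by_instance.items()
        if samples and mean(samples) < threshold_pct
    )


if __name__ == "__main__":
    # Hypothetical CPU utilization samples (percent) per instance.
    samples = {
        "web-1": [55.0, 60.0],
        "batch-7": [2.0, 4.0],
        "cache-2": [8.0, 9.5],
    }
    print(underutilized(samples))
```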

What's my experience with pricing, setup cost, and licensing?

Pricing overall in this segment has standardized in the last several years.

Which other solutions did I evaluate?

A few, including Zabbix and Icinga.

What other advice do I have?

One of the fastest and most flexible tools we have used in this area.

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
reviewer2045043 - PeerSpot reviewer
Software Engineer at a comms service provider with 5,001-10,000 employees
Real User
Great monitors and APM with helpful Terraform support
Pros and Cons
  • "APM is great and has provided low-effort out-of-the-box observability for various services."
  • "Delta traces on the Golang profiler are extremely expensive concerning memory utilization."

What is our primary use case?

We primarily use the product for tracing, metrics, and alarms in various deployment environments.

How has it helped my organization?

The product has provided our company with improved observability, which has helped make the incident response more targeted and quicker.

What is most valuable?

APM is great and has provided low-effort out-of-the-box observability for various services. 

Monitors are helpful, and definitions are simple. 

Terraform support is nice as it allows us to create homogenous monitoring environments in various deployment environments with little additional effort. It also facilitates version control of monitor definitions, etc. 

The Golang profiler is generally good with the exception of delta profiles; it has provided helpful observability into Heap Allocations which has helped us reduce GC overhead.

What needs improvement?

Delta traces on the Golang profiler are extremely expensive concerning memory utilization. In a Kubernetes environment where we would like to set per-pod memory allocations as low as possible, the overhead of that profiler feature is prohibitive. In one case, our pods (which were provisioned to target 250 MB and max at 500 MB memory) got stuck in a crash loop due to out-of-memory, which was caused entirely by the delta profiles feature of the profiler.

Multistep Datadog synthetics lack the feature of basic arithmetic. For our use case, performing basic arithmetic on the output of previous steps to produce input for subsequent steps would be extremely useful.

For how long have I used the solution?

I've used the solution for nine months.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2045022 - PeerSpot reviewer
Software Engineer at a financial services firm with 501-1,000 employees
Real User
Great UI and documentation but needs to offer K8s deployment monitoring in real-time
Pros and Cons
  • "The installation step is pretty straightforward."
  • "I'm not sure if Datadog can monitor K8s deployments in real-time. For instance, being able to see a deployment step by step visually. This would be helpful if there were any incidents during the deployment."

What is our primary use case?

We use Datadog to monitor our Kubernetes clusters. 

We have 3 different clusters for different parts of the SDLC. We run the Datadog agent DaemonSet as well as the Datadog cluster agent. Our services have the APM installed by default. 

To create monitors, we use Terraform. This is provided out-of-the-box for our service owners. 

We run our K8s clusters on EKS; therefore, we also make use of some of the AWS monitoring capabilities that can be integrated into Datadog. 

We are hugely reliant on Datadog for all aspects of our system.

How has it helped my organization?

With Datadog, we were able to gain observability in our system. 

The installation step is pretty straightforward. 

It's easy to use by non-DevOps users. For instance, our engineers do not interact with K8s often; therefore, it is hard for them to debug. However, with Datadog, they are able to view their containers and deployments with a single click. 

We also heavily use the tags to help us identify who the service owners are. This is super useful when we need to track owners for patching or pick up new features we implemented.
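A minimal sketch of the owner lookup described above, assuming tags follow the usual `key:value` form and that ownership is recorded under a tag key named `owner`. The key name, the service names, and the tag values are hypothetical.

```python
from collections import defaultdict


def services_by_owner(tags_by_service, key="owner"):
    """Group service names by the value of their `key:value` owner tag.

    Services without a matching tag are collected under "unowned".
    """
    owners = defaultdict(list)
    for service, tags in tags_by_service.items():
        owner = "unowned"
        for tag in tags:
            tag_key, sep, tag_value = tag.partition(":")
            if sep and tag_key == key:
                owner = tag_value
                break
        owners[owner].append(service)
    return {owner: sorted(names) for owner, names in owners.items()}


if __name__ == "__main__":
    # Hypothetical services and tags.
    inventory = {
        "checkout": ["env:prod", "owner:payments"],
        "ledger": ["owner:payments"],
        "search": ["env:prod"],
    }
    print(services_by_owner(inventory))
```

Collecting untagged services under a separate bucket makes gaps in ownership visible, which matters when the goal is to find someone to patch a service.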

What is most valuable?

The APM and K8s monitoring are the most valuable aspects of the solution. The K8s monitoring allows all customers to view their infra, even if they do not use K8s daily. They can just click on a few tabs to get all of the information they need. 

It is also very easy to install on our system. APM has helped debug applications on our system as well. We were able to view why a service has suddenly shut down.

We also use Datadog for SLOs/SLAs. We check the live endpoint of services to ensure they are still up and running.
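To make the SLO arithmetic concrete, the sketch below turns a series of pass/fail endpoint checks into an availability percentage and compares it with a target. The check results and the 99.5% target are assumptions for illustration only; the review does not state the actual targets.

```python
def availability_pct(results):
    """Percentage of successful checks in a list of booleans."""
    if not results:
        raise ValueError("no check results to evaluate")
    successes = sum(1 for ok in results if ok)
    return 100.0 * successes / len(results)


def meets_slo(results, target_pct):
    """True when measured availability is at or above the target."""
    return availability_pct(results) >= target_pct


if __name__ == "__main__":
    # Hypothetical window: 1000 endpoint checks, 2 of which failed.
    checks = [True] * 998 + [False] * 2
    print(availability_pct(checks), meets_slo(checks, 99.5))
```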

What needs improvement?

There is not much that needs to be improved. 

The UI is super user-friendly. The deployment process is easy. We enjoy using the integrations with Slack and PagerDuty. 

Customer support is awesome from our experience. There is a lot of documentation for us to be able to use if we need to. 

I'm not sure if Datadog can monitor K8s deployments in real-time. For instance, being able to see a deployment step by step visually. This would be helpful if there were any incidents during the deployment. 

In general, Datadog is a great solution.

For how long have I used the solution?

I've used Datadog since I joined my company about a year ago.

What do I think about the stability of the solution?

We haven't had issues with the stability.

What do I think about the scalability of the solution?

The scalability is really great.

How are customer service and support?

We've had no issues with the product or support. 

How was the initial setup?

The initial setup is super simple, and the documentation was helpful.

What about the implementation team?

We managed the initial setup process in-house.

What was our ROI?

We've witnessed ROI in our DevOps.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2045010 - PeerSpot reviewer
Lead Application Developer at a retailer with 10,001+ employees
Real User
Good logging and APM and useful for troubleshooting
Pros and Cons
  • "The most valuable aspect of the solution is the APM."
  • "The logging could be improved in the future."

What is our primary use case?

We primarily use the solution for monitoring and log analysis.

How has it helped my organization?

Datadog shows all the logs for the services, and it is very useful for troubleshooting.

What is most valuable?

The most valuable aspect of the solution is the APM.

The logging capabilities are quite useful. 

What needs improvement?

The logging could be improved in the future. 

For how long have I used the solution?

I've used the solution for four years.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2004192 - PeerSpot reviewer
Lead Support Engineer at a tech vendor with 11-50 employees
Real User
Good centralization of data with good integration but can be overwhelming at first
Pros and Cons
  • "The integration with AWS is key as well, as our software is currently bound to AWS."
  • "The ability to find what you are looking for when starting out could be improved."

What is our primary use case?

Our use case is mainly deploying into our applications for monitoring/logging observability. We currently have our microservices feed into an actuator that exists in each instance of our application that extends to a local and central Grafana for client and internal visibility. The application we use is Grafana.

Logging captures application and system logs that are ported to each application instance for querying.

Whenever anything occurs that is considered unhealthy from a range of health checks, we have notification rules configured internally and externally for a prompt response time.

How has it helped my organization?

We have become a more confident, knowledgeable, and capable team now that everything is being ported into a centralized format. Beforehand, knowledge was isolated to individuals. Uncertainty about what the information represented and where it was located led to a lack of confidence. Having everything in one place rules out that confusion and allows us to respond better to issues.

It also allows for personal growth as our team is learning the application from the ground up, and each person is enhancing their own skills.

What is most valuable?

The valuable features include the following: 

  • We are currently utilizing a decentralized distributed framework for our deployment, including our monitoring/logging observability capabilities. Centralizing them, contingent on our company privacy guidelines, will be a big help in tracking and responding to issues that come up, and the log management tools that were demonstrated give us the means to understand where those issues originate.
  • The ability to fiddle around and manipulate how logs are outputted.
  • The ability to track AWS Lambda functions, CloudFormation, and CloudWatch allows someone who is not savvy to dip their toe into understanding their own product.
  • The integration with AWS is key as well, as our software is currently bound to AWS.

What needs improvement?

The ability to find what you are looking for when starting out could be improved. It was a bit overwhelming trying to figure out what is the best solution. It led to many prototypes or time spent just perusing documentation. If we were able to select bundles or template use cases, we would hit the ground running quicker.

For how long have I used the solution?

I've used the solution for one year.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2003937 - PeerSpot reviewer
Cloud Engineer at a tech services company with 10,001+ employees
Real User
Intuitive with high availability and good integrations
Pros and Cons
  • "The network map is crucial in identifying bottlenecks and determining what needs more attention."
  • "To be very fair, I haven't had enough experience with Datadog to pick out improvements."

What is our primary use case?

We are using the solution for scaling up the website for market data applications. EC2 and Datadog have enabled high-level monitoring of underlying infra and services.

The Datadog profiler comes in handy to pinpoint issues with resource utilization during peak hours, and traces/log management helps narrow down the root cause.

The network map is crucial in identifying bottlenecks and determining what needs more attention.

Host map helps identify problematic hardware and devise ways to counter issues that arise during scaling, and deploying solutions on the cloud.

How has it helped my organization?

While my team is relatively new to Datadog, I already see immense value in switching over to Datadog as the primary APM and NPM tool.

The arsenal of features it offers is bound to come in clutch when facing production issues, when finding out what went wrong is crucial.

The network map has helped to figure out the golden signals and optimize the infrastructure.

The synthetics have helped ensure that our high-availability architecture functions as intended.

What is most valuable?

The network map is useful. The ability to see the data flow along the entire network path, across all the applications, is highly valuable, as the data from this service helps identify network bottlenecks, non-performant applications, and bad endpoints.

This is especially crucial for a high-availability website aimed at market data applications where low latency is crucial.

The host map gives a clear picture of the entire infrastructure, and the ability to switch between logs, metrics, and traces is very handy when it comes to debugging issues on the fly.

I love the ability to install the integrations and agents quickly. This is a well-made product.

What needs improvement?

To be very fair, I haven't had enough experience with Datadog to pick out improvements.

My involvement with Datadog has largely been positive. I love the simplicity and intuitiveness it offers - even for nontechnical folks who just might be starting out with developing technical chops in their domain.

For how long have I used the solution?

I've used the solution for three years.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2003286 - PeerSpot reviewer
Software engineer at a marketing services firm with 501-1,000 employees
Real User
Helps catch bugs, easy for non-technical users, and useful for tracking issues
Pros and Cons
  • "This spectrum of solutions has allowed us to track down bugs more rapidly, which allows us to limit revenue lost during downtime."
  • "Datadog could make their use cases more visible either through their docs or tutorial videos."

What is our primary use case?

We use metrics to track the metrics of our application. We use logging to log any errors or erroneous application behavior as well as successful behavior. We use events to log successful steps in our pipeline or failed steps in our deployment. We use a combination of all these features to diagnose bugs. 

It is much more efficient to look at all the data in one place. This speeds up our development so that we can be agile.

How has it helped my organization?

This spectrum of solutions has allowed us to track down bugs more rapidly, which allows us to limit revenue lost during downtime. 

It also allows us to accurately record and project current and future revenue by measuring the application's metrics. This way, my team can accurately and rapidly create reports for upper management that are easy to read and understand. 

Datadog is also easy to read by non-technical personnel. This way, if there are any erroneous readings, everybody has a chance to find them.

What is most valuable?

We use metrics to track the metrics of our application. We use logging to log any errors or erroneous application behavior as well as successful behavior. We use events to log successful steps in our pipeline or failed steps in our deployment. 

We use a combination of all these features to diagnose bugs. It is much more efficient to look at all the data in one place. This speeds up our development so that we can be agile.

These features are the features that I use the most since it is incredibly difficult to track down intermittent bugs if I were to look directly under the hood in a CLI.

What needs improvement?

Datadog could make their use cases more visible either through their docs or tutorial videos. There are different implementations of certain features that we utilize to customize Datadog functionality and in that way, we sometimes get results that are not conducive to what Datadog thinks their features' use cases are.

For how long have I used the solution?

I've used the solution for at least one year.

Which solution did I use previously and why did I switch?

We have only used Datadog. We did not previously use a different product.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2004213 - PeerSpot reviewer
Software engineer at a construction company with 1,001-5,000 employees
Real User
Good monitoring, custom tracking, and customizable dashboards
Pros and Cons
  • "The solution has helped our organization with custom events to track specific cases."
  • "We need to learn more about the Session Replay feature inside of Datadog."

What is our primary use case?

We use the solution for monitoring time spent on views and events triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events and multiple entry points in the same part of the application.

Knowing the entry point helps us choose which part of the program should be improved next. It also helps us with collecting important data about the overall usage of each module within our application. 

How has it helped my organization?

The solution has helped our organization with custom events to track specific cases.

It's helped with monitoring time spent on views and events triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events as well as multiple entry points into the same part of the application.

Knowing the entry point helps us choose which part of the program should be improved. It's collecting important data about the overall usage of each module within our application. 

What is most valuable?

The most valuable feature is the custom events to track specific cases.

We can monitor time spent on views and events triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events and multiple entry points in the same part of the application.

Knowing the entry point helps us decide on improvements. We can collect important data about the overall usage of each module within our application. 

What needs improvement?

We look forward to the next features from Datadog. We need to learn more about the Session Replay feature inside of Datadog.

For how long have I used the solution?

I've used the solution for two years.

Disclosure: I am a real user, and this review is based on my own experience and opinions.