reviewer1996521 - PeerSpot reviewer
Engineering Manager at Indeed.com
User
Transparent, easy to use, and integrates well with Slack
Pros and Cons
  • "Datadog's seamless integration with Slack and PagerDuty helped us to receive alerts right to the most common notification methods we use (our mobile devices and Slack)."
  • "I would like better navigability across pages."

What is our primary use case?

I primarily use the solution to learn, watch and monitor business and engineering metrics in the production and QA environments of my team. 

We create monitors on key business metrics and observe regressions and anomalies.

Less often, I leverage the events ability in Datadog to get notified about significant activities happening in my teams' deployments.

We learn about Datadog monitor alerts through Slack and often attempt to create SLOs using Terraform.

We use APM for observability.

Most recently, I learned about Watchdog alerts, which I plan to look into closely.

How has it helped my organization?

Datadog made it easy for me to watch, and add monitors on, any metric emitted by any team at my organization.

Datadog APM immensely improved our ability to understand the reasons behind production issues. Its ability to navigate across services seamlessly to understand the time spent at each critical stage of a production request is helpful. This, combined with Datadog's long-standing ability to show business metrics alongside, helped us get more powerful insights much more quickly.

Datadog's seamless integration with Slack and PagerDuty helped us to receive alerts right to the most common notification methods we use (our mobile devices and Slack).

What is most valuable?

The most valuable aspects include:

  • The ability to monitor any team's metric in my company (transparency)
  • The ability to create/clone dashboards for myself (ease of use)
  • Its integration with Slack (it is very powerful)
  • The ability to add monitors on any metric emitted by any team at my organization
  • (Through Datadog APM) the ability to understand the reasons behind production issues. Its ability to navigate across services seamlessly in order to understand the time spent at each critical stage of a production request is key. This, combined with Datadog's long-standing ability to show business metrics alongside, helped me get more powerful insights much more quickly.
  • (Through integrations like Slack and PagerDuty) the ability to receive alerts right to the most common notification method we use (our mobile devices and Slack), which saves a lot of time and helps us maintain focus. 

What needs improvement?

I would like better navigability across pages. The UI/UX is powerful, yet less intuitive. A lot of times, I somehow navigate across buttons and pages, and I end up forgetting how to get back to a particular view that was more insightful. 

Particularly as Datadog starts offering more platform capabilities like APM, Watchdog, Shift left initiatives like instrumentation, continuous testing, intelligent test runner, and Synthetic and real user monitoring, the UI can become more and more clunky, giving users a very frustrating experience. 

Buyer's Guide
Datadog
January 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: January 2025.
825,609 professionals have used our research since 2012.

For how long have I used the solution?

I've used the solution for five to six years.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2000448 - PeerSpot reviewer
Senior Manager at a manufacturing company with 10,001+ employees
Real User
Great network monitoring, testing, and integration tools
Pros and Cons
  • "The visibility into our network has allowed for quick diagnosis of failures, identification of underutilized or over-utilized resources, and allowed for cloud cost optimization opportunities."
  • "I would love to see more metrics or analytics in IoT devices."

What is our primary use case?

This solution is for physical device monitoring across breweries, including PLCs, HMI Cameras, RFID panels, scales, etc. We want to gain visibility into these devices to influence predictive maintenance and unscheduled downtime. We want to monitor physical devices across the zone from a control tower perspective for end users and support teams alike. Understanding more about the performance of the devices and mechanical components will allow us to schedule downtime to fix imminent catastrophic failures and prevent unplanned downtime and lost revenue.

How has it helped my organization?

Previously, we had no visibility into the architectural layout of our infrastructure. The UI of Datadog has allowed for increased visibility and access to broken or underperforming resources or critical pieces of infrastructure. Beyond this, it has allowed us to identify areas where we can optimize cost in our cloud infrastructure.

What is most valuable?

The most valuable features I have found are network monitoring, testing, and integration tools. The visibility into our network has allowed for quick diagnosis of failures, identification of underutilized or over-utilized resources, and allowed for cloud cost optimization opportunities. The ability to correlate metrics has proven useful in determining downstream or upstream issues influencing the device, machine, or database having issues.

What needs improvement?

I would love to see more metrics or analytics in IoT devices. 

For how long have I used the solution?

I've been using the solution for approximately two years.

What do I think about the stability of the solution?

I have never experienced an issue or outage.

What do I think about the scalability of the solution?

The solution is very scalable and developed in a fashion that provides the ability to scale easily.

How are customer service and support?

Customer service has been outstanding. They have been timely and knowledgeable with all of my questions.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used a different product for the total stack solution.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

We handled the setup process in-house.

What was our ROI?

I'm unsure whether we've seen an ROI.

Which other solutions did I evaluate?

We did evaluate SolarWinds.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2045034 - PeerSpot reviewer
Sr. Manager - DevOps at an aerospace/defense firm with 10,001+ employees
Real User
Excellent RUM, session replay, and APM
Pros and Cons
  • "The solution has helped our organization gain improved visibility."
  • "The product needs a better Datadog agent installation."

What is our primary use case?

We primarily use the solution for logging and APM, and for real user metrics.

How has it helped my organization?

The solution has helped our organization gain improved visibility.

What is most valuable?

The most useful aspects of the solution include RUM, session replay, and APM.

What needs improvement?

The product needs a better Datadog agent installation.

For how long have I used the solution?

I've used the solution for one year.

Which solution did I use previously and why did I switch?

We previously used AppDynamics.

Which other solutions did I evaluate?

Before choosing Datadog, we looked at Splunk.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
James Baird - PeerSpot reviewer
Infrastructure Engineer at a tech services company with 11-50 employees
Real User
Easy to use, simple to set up, and allows for easy visibility
Pros and Cons
  • "Datadog has so far been a breeze to use and set up."
  • "One thing we have run into is that it is so easy to add monitoring that we turn on things without really understanding the costs."

What is our primary use case?

We currently use it for log aggregation and SIEM. We send logs from our AWS account (particularly our CloudTrail and S3 logs) and use them to give us security signals.

This has helped with our SOC2 certification process and has given us a window into our processes and the security holes in our system. 

We are also considering using the APM features to help with our development effort. We want to be able to profile all of our code and see what is going on with it.

How has it helped my organization?

It has allowed us to see into our systems with ease. We are a very small startup (fewer than 30 people, and most of them are in sales and marketing).

When it comes to managing systems, we just don't have time to do everything. However, Datadog has allowed us to do much more with fewer people and still sift through our data with ease. 

We hope to start using the APM feature set to extend this to our dev teams as well.

What is most valuable?

The ease of use is the primary aspect. At previous jobs, I used the ELK stack and Splunk for log management. Both were useful, yet they required a lot of manual effort to set up (and a lot of continuing effort to tweak). A simple monitoring solution turned into a full-time job! However, Datadog has so far been a breeze to use and set up. It looks at what I am sending it and figures out what it is almost by magic. Even the manual configuration makes sense and gives very fast and thorough results.

What needs improvement?

One thing we have run into is that it is so easy to add monitoring that we turn on things without really understanding the costs. 

I would like a way to show a continuous indication of what my setup will cost on a daily or weekly basis.

For how long have I used the solution?

I've used the solution for six months.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer1318287 - PeerSpot reviewer
IT Test Manager at a transportation company with 10,001+ employees
Real User
Very good documentation provided along with regular new features
Pros and Cons
  • "Datadog is constantly adding new features."
  • "Lacks some flexibility in the customization."

What is our primary use case?

Our primary use case is log management and we also use the solution for monitoring the application and underlying infrastructure. I'm an IT test manager. 

What is most valuable?

I appreciate that they are constantly adding new features, some of which we haven't yet had a chance to implement. 

What needs improvement?

I'd like to see more flexibility in customization. A few settings need to be changed, but we are unable to make those changes as users or as administrators. The tagging needed to interconnect the different parts of the monitoring is a bit tricky and takes time to work out.

For how long have I used the solution?

I've been using this solution for 18 months. 

What do I think about the stability of the solution?

The stability is good. 

What do I think about the scalability of the solution?

I would say that the amount that we are monitoring is not that large and we've never had any scalability issues. We have around 50 users in our department. 

How are customer service and support?

The availability or accessibility to customer service is not always good, although they generally provide solutions once you do manage to get hold of them. 

Which solution did I use previously and why did I switch?

We have previously used different tools for different parts of the monitoring. We changed to AWS when we moved to the cloud. We also found that the effort in maintaining Grafana and Prometheus and keeping it up to date was taking too much time.

How was the initial setup?

The initial setup was straightforward. We used a service provider, which also maintains our operation in general.

What's my experience with pricing, setup cost, and licensing?

We have a four-year contract with Datadog, and the solution is pay-as-you-use. 

What other advice do I have?

I would suggest using the documentation, which is quite good. It's best to start with existing integrations, and then do the customization step-by-step.

I rate this solution eight out of 10. 

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer1479957 - PeerSpot reviewer
Senior Director of DevOps at Housecall Pro
Real User
Good graphing and dashboards, and it improves visibility for developers
Pros and Cons
  • "Having a wealth of information has helped us investigate outages, and having historical data helps us tune our system."
  • "Datadog has a lot of documentation, but a lot of that documentation assumes you know how the service works, which can lead to confusion."

What is our primary use case?

We primarily use Datadog for the monitoring of EC2 and ECS containers running mostly Rails applications that host a SaaS product. We also monitor Elasticsearch and RDS, and we are working on adding their Application Performance Monitoring solution to monitor our applications directly.

We use Datadog to create dashboards, graphs, and alerts based on interesting metrics. Datadog is the first place we look to check the performance of our system.

We also use their logging platform and it works well. Especially useful is that the logs and metrics are tightly integrated so you can jump between them easily.

How has it helped my organization?

Developers are able to see how code is running in production, which was mostly opaque before we implemented Datadog. We are able to emit custom metrics that are specific to our business, and the built-in metrics have also proven useful. Having a wealth of information has helped us investigate outages, and having historical data helps us tune our system.

DevOps engineers are able to put sensors around our system to proactively detect problems, whereas before, our engineers heard about problems from customers. Logs are easier to find for developers.

What is most valuable?

Metric graphing and Dashboards are the most valuable features because they give us good observability into our system and work well to alert us when interesting things happen. We use this functionality daily.

We value the monitoring capability because alerts are pushed to us, so we do not have to watch graphs continually. The integrations with Slack and PagerDuty let us be interrupted appropriately and keep a running tab on the system without bothering us unnecessarily.

The online process monitoring has been extremely helpful, as it gives engineers the ability to see the live status of all the processes running our systems without them having to log in.

What needs improvement?

Their logging solution is expensive for our use case. They do have the capability to rehydrate old or incomplete logs, and it works, but I would rather not have to think about that operation.

Datadog has a lot of documentation, but a lot of that documentation assumes you know how the service works, which can lead to confusion. On a positive note, they do have lots of documentation; it just needs better curation.

Their APM solution still needs some work, but they are actively developing it. I would also like to see more database-specific application monitoring.

For how long have I used the solution?

I have been using Datadog for five years across two companies.

What do I think about the stability of the solution?

Any issues are addressed and communicated very quickly. I have not had any issues with uptime.

What do I think about the scalability of the solution?

If you do not need 100% of data such as logs, APM traces, etc., this scales well. It does not scale as well if you want 100% of your logs indexed. You should understand any other usage-based bills before using any part of their service as it is very easy to run up a large bill.

The performance of the system scales very well, and host monitoring and APM are relatively cheap.

How are customer service and technical support?

Account support is excellent.

Customer support is good if you get them to go beyond pointing out the right documentation.

Which solution did I use previously and why did I switch?

Previously, I used homebuilt solutions with Nagios and Cacti but found that there was far too much work to understand them and keep them up and fed compared to the value that I got. They also did not integrate well with existing data sources without a lot of effort.

I also previously used Stackdriver and found it too opinionated. I like that Datadog gives you tools to work with certain types of data and make your own graphs, monitors, etc., whereas with Stackdriver, I felt there were only a limited number of ways to accomplish a goal.

How was the initial setup?

The basic setup is easy. A more advanced setup can be tricky because the documentation assumes you know how the system works already. Support is somewhat helpful, but mostly points out the documentation you should already have found.

What about the implementation team?

We implemented in-house.

What's my experience with pricing, setup cost, and licensing?

My advice is to understand what number of hosts and data you want to commit to. Beware that usage-based billing is both a blessing and a curse. It is easy to run up a large bill, so become familiar with the cost of each piece of your bill and use the metrics they supply to estimate and monitor your bill.

I have had good luck with their support team helping us to figure out the correct commit levels. Their account support is excellent in this regard. I have heard their sales team can be aggressive, but I have not experienced it personally.

Which other solutions did I evaluate?

I originally chose Datadog because of my previous experience. We recently considered moving over to New Relic because we liked their APM solution better. However, the pricing of New Relic and our familiarity with Datadog won over. New Relic is a good product but it didn't fit our overall needs as well as Datadog.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Site Reliability Engineer at a computer software company with 201-500 employees
Real User
They have a good ecosystem for their integrations
Pros and Cons
  • "Their interface is probably one of the easiest things to use because it lets non-developers and non-engineers quickly get access to metrics and pull business value out of them. We could put together dashboards and give it to people who are non-technical, then they can see the state of the world."
  • "We have been able to set very specific CPU and memory alerts, at the very base level, then we started to pull real business value, like 99th percentile response rates for our API calls."
  • "It has turned into an operational dashboard. If you feel something is going wrong, you can immediately open up Datadog. It has been our go-to application because we know the answer will be there."
  • "The way data is represented can be limiting. When I first tried it out a long time ago, you could graph a metric and another metric, and they'd overlay, but you couldn't take the ratio between the two."
  • "When I started using it years ago, it had stability problems. I remember, specifically, we ran everything in Docker containers. There were some problems getting it into a Docker container with very specific memory limits."

What is our primary use case?

We use it for custom metrics of our applications and monitoring of our systems.

How has it helped my organization?

My current company didn't have very good monitoring in the past. We had been using basic CPU monitoring. We have been able to set very specific CPU and memory alerts, at the very base level, then we started to pull real business value, like 99th percentile response rates for our API calls. 
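To illustrate why a 99th percentile is more telling than an average for API calls, here is a minimal sketch in Python. It is not Datadog's implementation: the `percentile` helper uses the simple nearest-rank definition, and the sample response times are hypothetical values chosen for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # 1-based rank of the sample to return.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical API response times in milliseconds:
# 98 fast calls and 2 slow ones.
response_times_ms = [20] * 98 + [900, 1500]

mean_ms = sum(response_times_ms) / len(response_times_ms)
p99_ms = percentile(response_times_ms, 99)

print(f"mean: {mean_ms:.1f} ms, p99: {p99_ms} ms")
```

With this data the mean stays under 50 ms while the 99th percentile lands on one of the slow calls, which is the kind of signal an alert on the tail is meant to catch.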

It has turned into an operational dashboard. If you feel something is going wrong, you can immediately open up Datadog. It has been our go-to application because we know the answer will be there.

What is most valuable?

Their interface is probably one of the easiest things to use because it lets non-developers and non-engineers quickly get access to metrics and pull business value out of them. We could put together dashboards and give it to people who are non-technical, then they can see the state of the world. 

They have a very good ecosystem for their integrations. They have a lot of different integrations, and we use a lot of them. We have integrations with Amazon for ECS, RDS, and all of the subsystems of Amazon. We also have Docker and Splunk integrations. The integrations are great because they're definitely vetted and not third-party integrations. They're part of the Datadog ecosystem and seamless.

What needs improvement?

The way data is represented can be limiting. They have added their own little query language that you can use to manipulate things, so you can graph and relate two different metrics together. This is relatively new this year. When I first tried it out a long time ago, you could graph a metric and another metric, and they'd overlay, but you couldn't take the ratio between the two. However, it looks like this is the direction that they're going, and that's a good direction. I think they should continue adding things that way.

I like being able to put the formulas in myself. I don't want the average. I want a rolling average over three minutes, not five minutes. They're getting better at letting the user customize this.
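The difference between a three-minute and a five-minute rolling average can be sketched in a few lines of Python. This is an illustration only, not Datadog's query language: the `rolling_average` helper and the per-minute sample values are hypothetical.

```python
from collections import deque


def rolling_average(samples, window):
    """Return, for each sample, the mean of the last `window` samples
    seen so far (fewer at the start of the series)."""
    recent = deque(maxlen=window)  # old samples fall off automatically
    averages = []
    for sample in samples:
        recent.append(sample)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical per-minute latency values with a one-minute spike.
per_minute = [10, 10, 10, 10, 100, 10, 10, 10, 10, 10]

three_min = rolling_average(per_minute, 3)
five_min = rolling_average(per_minute, 5)

print("3-minute window:", three_min)
print("5-minute window:", five_min)
```

The shorter window shows a taller peak and returns to the baseline sooner, while the longer window flattens the spike and smears it over more minutes. Which one is preferable depends on how quickly you want a graph or monitor to react.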

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

When I started using it years ago, it had stability problems. I remember, specifically, we ran everything in Docker containers. There were some problems getting it into a Docker container with very specific memory limits. We couldn't nail down exactly what limits the application needed. Once we did that, we were good. However, it was tricky to get the limit right in the first place.

What do I think about the scalability of the solution?

It has always scaled for us. Cost scales up too, but that is not necessarily a bad thing. It's reasonable for what they're providing. I haven't had any concerns about scaling.

We use between 100 and 500 servers at any given point in time.

How is customer service and technical support?

For the most part, the technical support is pretty good. Every now and again, you will get stuck with a support rep who could have better training, but in general, they are very good and responsive. They're willing to talk about new features, etc.

How was the initial setup?

The integration and configuration processes have been very smooth because everything is very well-documented. The documentation is phenomenal. 

What was our ROI?

We can see trends a lot more easily than we could without the solution. Management can see the changes being made, whether in performance or in the number of hosts that went down. We recently made improvements to some of our internal APIs, which reduced the number of servers we needed. You could see that the load on the system went down and the number of servers went down, so it was easy to visualize.

What's my experience with pricing, setup cost, and licensing?

Pricing and licensing are reasonable for what they give you. You get the first five hosts free, which is fun to play around with. Then it's about four dollars a month per host, which is very affordable for what you get out of it. We have a lot of hosts that we put a lot of custom metrics into, and every host gives you an allowance for the number of custom metrics. We have not had a problem with it.
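As a rough back-of-the-envelope sketch of the figures quoted above, the Python below estimates a monthly host bill. The numbers are the reviewer's recollection, not current Datadog pricing, and the code assumes that only hosts beyond the free allowance are billed, which the review does not state explicitly. Custom metrics, logs, and other usage-based items are ignored.

```python
# Reviewer's figures, not an official price list.
FREE_HOSTS = 5           # "the first five hosts free"
PRICE_PER_HOST_USD = 4   # "about four dollars a month per host"


def estimated_monthly_cost(hosts):
    """Estimate the monthly host cost in USD.

    Assumption: only hosts beyond the free allowance are billed.
    """
    if hosts < 0:
        raise ValueError("hosts must not be negative")
    billable = max(hosts - FREE_HOSTS, 0)
    return billable * PRICE_PER_HOST_USD


# The reviewer reports running between 100 and 500 servers.
for hosts in (5, 100, 500):
    print(hosts, "hosts ->", estimated_monthly_cost(hosts), "USD/month")
```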

Which other solutions did I evaluate?

My company now is pretty good at looking at alternatives. Also, I evaluated alternative solutions at my last company. 

There are some other competitors. For example, I know one of them started doing metrics and their licensing is very cheap because the metric size is very small and it's per megabyte. They charge you for storage, and it's very small. However, the interface and integrations aren't there.

The other thing is granularity. Datadog gives you one-second granularity for a year. Some of the competitors, by contrast, roll the data up: after about a week you no longer have one-second data, you have five-second data, and after a month you no longer have five-second data, you have one-minute data. Whether the rollup averages the values or takes the maximum, you start to lose the ability to see incidents historically, which is super valuable. If we have an incident that we think we've seen before and want to look back historically, we can zoom right in and see in the database where it peaked.
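The effect of rolling data up can be shown with a small Python sketch. This is a generic illustration rather than any vendor's actual retention scheme: the `roll_up` helper and the one-second CPU readings are hypothetical.

```python
def roll_up(samples, bucket_size, aggregate):
    """Collapse each consecutive group of `bucket_size` samples
    into a single value using `aggregate`."""
    return [
        aggregate(samples[i:i + bucket_size])
        for i in range(0, len(samples), bucket_size)
    ]


def mean(values):
    return sum(values) / len(values)


# Hypothetical one-second CPU readings (percent) over one minute,
# with a two-second spike in the middle.
per_second = [20] * 28 + [95, 98] + [20] * 30

five_second_avg = roll_up(per_second, 5, mean)
one_minute_avg = roll_up(per_second, 60, mean)
one_minute_max = roll_up(per_second, 60, max)

print("peak at 1 s resolution:", max(per_second))
print("peak after 5 s average:", max(five_second_avg))
print("value after 1 min average:", one_minute_avg[0])
print("value after 1 min max:", one_minute_max[0])
```

Averaging shrinks the spike until it is barely distinguishable from the baseline. Taking the maximum keeps the peak value but loses when within the minute it happened and how long it lasted.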

What other advice do I have?

Give Datadog a try. It's the leader in this space. 

I have only used the AWS version of the product.

They have a thing for the color purple, but it is all good.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2045070 - PeerSpot reviewer
Software Engineering Manager at a healthcare company with 501-1,000 employees
Real User
Great CI visibility, logging, and monitoring
Pros and Cons
  • "Datadog helps us detect issues early on and helps in troubleshooting."
  • "We would really like to see more from the Service Catalog."

What is our primary use case?

We mainly use the product to monitor our infrastructure and apps. It is the go-to tool when we want to check that things are running properly. We use Datadog synthetic monitors to ensure our app works across different locations in the United States. 

We also have set up Datadog monitors to send alerts if things stop working as expected. 

We use Continuous Integration Pipeline visibility to make sure our developers are not being blocked by infrastructure and other things that might be out of their control.

How has it helped my organization?

Datadog helps us detect issues early on and helps in troubleshooting. Creating Service Level Objectives and defining monitors is helping us to stay on top of potential issues that might affect our users. 

We take advantage of Application Performance Monitoring to ensure our applications are working as expected, and our users can get the healthcare they need at a price they can afford. 

Synthetic monitoring also helps us in testing our application in different browsers.

What is most valuable?

The most valuable aspects of the solution include: 

CI visibility, which helps us in making sure our CI systems are running efficiently and are not blocking our developers from releasing new software and fixing bugs.

Logs, which help us in debugging issues where we can search for logs and can make sure they are relevant to the issues we are looking at.

APM, which can help us to stay on top of our applications by giving us the confidence that our apps are running.

Monitoring. We use monitoring a lot to ensure we know about potential issues and fix them before they affect our customers.

What needs improvement?

Overall, we really like the quality and relevance of all of the Datadog products that are currently being used. 

The documentation is very well organized and is the go-to place for us to find answers to our questions. 

We would really like to see more from the Service Catalog. It is something that we are interested in. However, some might think it lacks some key features at this time. We will definitely keep our eye out for this and adopt it when all the features are implemented. 

We're really looking forward to all the great things DD will do.

For how long have I used the solution?

I've used the solution for three years.

What do I think about the stability of the solution?

The stability is great.

What do I think about the scalability of the solution?

The scalability is great.

How are customer service and support?

Technical support is great.

What about the implementation team?

We handled the initial setup in-house.

What's my experience with pricing, setup cost, and licensing?

I don't have any insights into pricing.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user