James Baird - PeerSpot reviewer
Infrastructure Engineer at a tech services company with 11-50 employees
Real User
Easy to use, simple to set up, and allows for easy visibility
Pros and Cons
  • "Datadog has so far been a breeze to use and set up."
  • "One thing we have run into is that it is so easy to add monitoring that we turn on things without really understanding the costs."

What is our primary use case?

We currently use it for log aggregation and SIEM. We send logs from our AWS account (particularly our CloudTrail and S3 logs) and use them to give us security signals.

This has helped with our SOC2 certification process and has given us a window into our processes and the security holes in our system. 

We are also considering using the APM features to help with our development effort. We want to be able to profile all of our code and see what is going on with it.

How has it helped my organization?

It has allowed us to see into our systems with ease. We are a very small startup (fewer than 30 people, and most of them are in sales and marketing).

When it comes to managing systems, we just don't have time to do everything. However, Datadog has allowed us to do much more with fewer people and still sift through our data with ease. 

We hope to start using the APM feature set to extend this to our dev teams as well.

What is most valuable?

The ease of use is the primary aspect. I have used, at previous jobs, the ELK stack and Splunk for log management. Both of them were useful, yet required a lot of manual effort to get set up (and a lot of continuing effort to tweak). A simple monitoring solution turned into a full-time job! However, Datadog has so far been a breeze to use and set up. It looks at what I am sending it and figures out what it is almost by magic. Even the manual configuration makes sense and gives very fast and thorough results.

What needs improvement?

One thing we have run into is that it is so easy to add monitoring that we turn on things without really understanding the costs. 

I would like a way to show a continuous indication of what my setup will cost on a daily or weekly basis.

Buyer's Guide
Datadog
November 2024
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
814,763 professionals have used our research since 2012.

For how long have I used the solution?

I've used the solution for six months.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Ramon Snir - PeerSpot reviewer
CTO at a tech vendor with 1-10 employees
Real User
Increases delivery velocity with less manual testing and good integrations
Pros and Cons
  • "Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity."
  • "Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products."

What is our primary use case?

We use Datadog for three main use cases, including:

  • Infrastructure and application monitoring. It ensures that our services are available and performant at all times. This allows us to proactively address incidents and outages without customers contacting us. This includes monitoring of cloud resources (databases, load balancers, CPU usage, etc.), high-level application monitoring (response times, failure rates, etc.), and low-level application monitoring (business-oriented metrics and functional exceptions to customer experience).
  • Analyzing application behavior, especially around performance. We often use Datadog's application performance monitoring on non-production environments to evaluate the impact of newly introduced features and gain confidence in changes.
  • End-to-end regression testing for APIs and browser-based experiences. Datadog's synthetic testing periodically checks that the system behaves exactly as expected. This is often used as a canary to detect issues even before users reach them organically.

How has it helped my organization?

Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity. 

We have seen time after time that the monitors we have carefully created based on all ingested data are detecting issues quickly and accurately. 

This means we allow ourselves to manually test things less frequently. We have also had an easier time investigating application errors and slowness using Datadog's APM and log explorer products which allow us to introspect any part of the system, in its execution context.

What is most valuable?

The most valuable features include:

  • Integrated observability data ingestion. All data that Datadog collects is connected. This allows us to easily connect logs with failed requests, and slow database queries with services and requests.
  • Broad integrations allow us to monitor our entire production environment in a single place, not just cloud resources. Since all parts stream metrics, logs, and events to Datadog, we can have unified dashboards and manage monitors and incidents all from the same page.
  • A high level of configuration. We can configure and modify many parts, from how data is collected from our applications to how Datadog parses and visualizes it. This means that we always get the best experience, and we don't need to find ten different products that do small things well or settle on one product that does everything badly.

What needs improvement?

Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products. 

Older, more mature products tend to be complete (many features, customization, broad integrations, etc.), while newer products will often be at a "just above minimum viable product" phase for a long time, doing what's intended yet missing valuable customizations and integrations.

For how long have I used the solution?

We've used the solution for 12 months.

What do I think about the scalability of the solution?

The solution scales very well on technical aspects, being able to ingest large quantities of data from many services. However, the pricing often doesn't scale naturally, and effort has to be put in to keep ongoing costs at a reasonable amount.

How are customer service and support?

Customer service and support are generally very high-quality. In most cases, they reply very quickly and offer well-researched and relevant responses. This is contrasted with many vendors who take a long time to reply and send links to documentation instead of understanding the problem.

However, we had cases where support took several weeks to reply to a complicated request and sometimes eventually responded that the issue cannot be resolved. These are rare edge-case occurrences.

How would you rate customer service and support?

Positive

How was the initial setup?

A large part of the initial setup was straightforward. We were able to collect about 80% of the relevant and 90% of the meaningful insights from just a couple of hours of connecting the AWS integration and the Datadog APM agent. 

Getting it to 100%, by configuring and customizing things to our unique situation, took about two weeks. Datadog's documentation and support team were extremely helpful during both phases.

What about the implementation team?

We handled the setup in-house.

What was our ROI?

From the number of outages stopped or shortened (which lead to lost revenue from non-renewals) and the number of hours saved on investigations (which correlates to engineering salaries), I estimate the ROI of the implementation time and monthly charges to be between 10x and 20x.

What other advice do I have?

We use the solution as a SaaS deployment.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Software Engineering Manager at a healthcare company with 501-1,000 employees
Real User
Top 20
Great CI visibility, logging, and monitoring
Pros and Cons
  • "Datadog helps us detect issues early on and helps in troubleshooting."
  • "We would really like to see more from the Service Catalog."

What is our primary use case?

We mainly use the product to monitor our infrastructure and apps. It is the go-to tool when we want to check that things are running properly. We use Datadog synthetic monitors to ensure our app works across different locations in the United States. 

We also have set up Datadog monitors to send alerts if things stop working as expected. 

We use Continuous Integration Pipeline visibility to make sure our developers are not being blocked by infrastructure and other things that might be out of their control.

How has it helped my organization?

Datadog helps us detect issues early on and helps in troubleshooting. Creating Service Level Objectives and defining monitors is helping us to stay on top of potential issues that might affect our users. 

We take advantage of Application Performance Monitoring to ensure our applications are working as expected, and our users can get the healthcare they need at a price they can afford. 

Synthetic monitoring also helps us in testing our application in different browsers.

What is most valuable?

The most valuable aspects of the solution include: 

CI visibility, which helps us in making sure our CI systems are running efficiently and are not blocking our developers from releasing new software and fixing bugs.

Logs, which help us in debugging issues where we can search for logs and can make sure they are relevant to the issues we are looking at.

APM, which can help us to stay on top of our applications by giving us the confidence that our apps are running.

Monitoring. We use monitoring a lot to ensure we know about potential issues and fix them before they affect our customers.

What needs improvement?

Overall, we really like the quality and relevance of all of the Datadog products that are currently being used. 

The documentation is very well organized and is the go-to place for us to find answers to our questions. 

We would really like to see more from the Service Catalog. It is something that we are interested in. However, some might think it lacks some key features at this time. We will definitely keep our eye out for this and adopt it when all the features are implemented. 

We're really looking forward to all the great things DD will do.

For how long have I used the solution?

I've used the solution for three years.

What do I think about the stability of the solution?

The stability is great.

What do I think about the scalability of the solution?

The scalability is great.

How are customer service and support?

Technical support is great.

What about the implementation team?

We handled the initial setup in-house.

What's my experience with pricing, setup cost, and licensing?

I don't have any insights into pricing.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Senior Software Engineer at a transportation company with 51-200 employees
Real User
Top 20
Good dashboard, excellent monitoring, and easy to expand
Pros and Cons
  • "Datadog has helped us a ton by allowing us to set up a multitude of easily configurable alarms across our tech stack and infrastructure."
  • "I found the documentation can sometimes be confusing."

What is our primary use case?

We primarily use Datadog for alerts. If we're running out of database connections or CPU credits we want to find out in Slack. Datadog provides nice features for that.
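
As a rough illustration of the kind of alert described above, the sketch below builds two monitor definitions shaped like Datadog metric-alert payloads. The metric names, thresholds, time window, and the Slack handle are assumptions chosen for the example, not details from this review, and the code only constructs the definitions without sending them anywhere.

```python
import json

# Hypothetical values: the Slack channel, metric names, and thresholds are
# assumptions for illustration only.
SLACK_HANDLE = "@slack-ops-alerts"


def build_monitor(name, metric, comparator, threshold, window="last_5m"):
    """Build a dict shaped like a Datadog metric-alert monitor definition."""
    query = f"avg({window}):avg:{metric}{{*}} {comparator} {threshold}"
    return {
        "name": name,
        "type": "metric alert",
        "query": query,
        # The @-handle in the message is what routes the alert to Slack.
        "message": f"{name}. Notify {SLACK_HANDLE}",
        "options": {"thresholds": {"critical": threshold}},
    }


monitors = [
    build_monitor("RDS connections are high", "aws.rds.database_connections", ">", 80),
    build_monitor("EC2 CPU credits are low", "aws.ec2.cpucredit_balance", "<", 20),
]

if __name__ == "__main__":
    print(json.dumps(monitors, indent=2))
```

Keeping definitions like these in code makes the thresholds reviewable, although the same monitors can be created entirely from the Datadog UI.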

Secondarily, we use Datadog for analyzing historical trends and forecasting potential issues.

I'm trying to learn how to add in Continuous Profiler in our primary backend servers and set up Synthetic Tests for monitoring our front end.

Everything is mostly on AWS, and the Datadog integrations help a ton.

How has it helped my organization?

Datadog has helped us a ton by allowing us to set up a multitude of easily configurable alarms across our tech stack and infrastructure. It doesn't matter if it's in AWS Lambda or a Docker container in AWS EC2, Datadog's intuitive interface makes alarms incredibly easy to configure, reducing our resolution time for incidents.

A lot of the value comes from how frictionless the integrations are. Adding in a Datadog agent or flipping a switch on the Datadog UI to start streaming Lambda data makes the product so incredibly appealing for my company.

What is most valuable?

The monitoring feature has been the most valuable.

I really like the dashboard. Monitoring has a straightforward tie-in to business value at my company (e.g., declaring incidents). Things like having a dashboard and APM make my job easier. That said, DevX is a little bit of a harder sell to executives in my company.

The dashboard feature makes it so easy to inspect multiple metrics at once across services. It's truly been a lifesaver when I'm personally trying to understand why performance degradation is happening.

What needs improvement?

I found the documentation can sometimes be confusing. I tried configuring APM for some of our Python containers, and I had to cross-reference multiple blog posts and the official documentation to figure out which Datadog Agent to use, whether I needed a ddtrace trace, which environment variables I should set, etc.

Furthermore, to generate my own traces, I wasn't aware that ddtrace adds its own "monkey patching," which led to headaches with respect to configuring the service for RabbitMQ.
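
For readers unfamiliar with the term, the sketch below shows in general what "monkey patching" means for a tracing library: a method on a client class is replaced at runtime with a wrapper that records timing. It uses only the standard library. `FakeChannel`, `patch_publish`, and `recorded_spans` are hypothetical names invented for the illustration and are not ddtrace's or any RabbitMQ client's actual API.

```python
import functools
import time


class FakeChannel:
    """Stand-in for a message-queue client class (hypothetical)."""

    def basic_publish(self, routing_key, body):
        return f"sent {body!r} to {routing_key}"


recorded_spans = []


def patch_publish(cls):
    """Replace cls.basic_publish with a wrapper that records a timing span."""
    original = cls.basic_publish
    if getattr(original, "_patched", False):
        return  # already patched, so avoid wrapping twice

    @functools.wraps(original)
    def traced(self, routing_key, body):
        start = time.perf_counter()
        try:
            return original(self, routing_key, body)
        finally:
            recorded_spans.append(
                {
                    "name": "publish",
                    "resource": routing_key,
                    "duration": time.perf_counter() - start,
                }
            )

    traced._patched = True
    cls.basic_publish = traced


patch_publish(FakeChannel)
patch_publish(FakeChannel)  # second call is a no-op

result = FakeChannel().basic_publish("orders", "hello")
```

Because the class itself is modified, every caller gets the wrapped behavior without changing its own code, which is convenient but can be surprising when you are also adding spans by hand.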

A more unified and up-to-date documentation suite would be greatly appreciated.

For how long have I used the solution?

I've used the solution for about two years.

What do I think about the stability of the solution?

I don't recall seeing an incident from Datadog in the past couple of years and that's been wonderful.

What do I think about the scalability of the solution?

The solution is incredibly scalable! To be fair, our data throughput to Datadog isn't super huge; however, we have never seen issues as it scaled to handle more of our data.

Which solution did I use previously and why did I switch?

We used to use AWS CloudWatch for a lot of our monitoring needs. That said, the interface felt clunky, confusing, and limited.

What was our ROI?

We don't have hard numbers on ROI. That said, overall, it has been a wonderful addition to our tooling suite.

Which other solutions did I evaluate?

We also looked at Honeycomb and are currently using both in production.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Senior Cloud Engineer at a comms service provider with 10,001+ employees
Real User
Good platform monitoring and great cost and performance optimization
Pros and Cons
  • "The observability pipelines are the most valuable aspect of the solution."
  • "Geo-data is also something very critical that we hope to see in the future."

What is our primary use case?

We use the solution primarily for platform monitoring for the services that are deployed in AWS. It gives a better way to monitor the services, including pods, cost, high availability, etc. This way, observability is ensured and also customer services are uninterrupted. 

Also, we host the data pipelines between the cloud and on-prem, and we use Datadog to ensure better service for them. We report issues based on the metrics reported over it.

How has it helped my organization?

Cost and performance optimization were the major enhancements for our organization. It gives us platform monitoring for the services that are deployed in AWS for a better way to monitor the services (pods, cost, high availability, etc.). With this product, we ensure observability and also keep customer services uninterrupted. We host the data pipelines between the cloud and on-prem. Datadog helps to ensure better services. We find we can report issues based on the metrics reported over it.

What is most valuable?

The observability pipelines are the most valuable aspect of the solution. 

Platform monitoring for the services that are deployed in AWS is helpful. It gives a better way to monitor the services. With Datadog, we ensure observability and maintain uninterrupted customer service. 

We can host the data pipelines between the cloud and the on-prem. Issues are easily reported.

The data streams are good. Data lineage is something that really helped in ensuring tracking of the data and metrics and also the volumes processed.

What needs improvement?

We'd like to see better transformers.

Live chat would be the best way to support us. 

Also, the features that we saw getting launched recently were something we expected and we're glad to see them coming.  

Geo-data is also something very critical that we hope to see in the future.

For how long have I used the solution?

I've used the solution for two or more years. 

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Software Engineer at Spring Health
User
Great dashboards and custom metrics with the ability to parse logs
Pros and Cons
  • "The dashboards are great."
  • "We need more advanced querying against logs."

What is our primary use case?

We share dashboards, set up alerts, and monitor everything that happens in our system. We use it in staging, features, production, and our load test environment. It is exceptionally helpful for making our engineering more data-driven. 

I came from a company that believes we should focus on being telemetry driven. Instilling this in a smaller, less mature engineering organization has been challenging. However, it is much easier while using Datadog.

What is most valuable?

The dashboards are great. They are an easy way to give visibility into what we need to watch with others who are not SMEs.

I enjoy the custom metrics. With this, we can take things that were once logs and then retain them longer.

We are able to parse logs. To be honest, this was only useful due to the fact that we had not yet set up the Datadog agent properly in PHP. Once we did this, the Datadog log parsing was no longer needed.

The ability to pin to a date and time is very helpful. This allows us to pinpoint exactly what was happening.

What needs improvement?

We need more advanced querying against logs. While most issues I have had here can be alleviated by way of sending better-formatted logs, it would be cool to do SQL-type queries against our data.

We need a way to see dashboard metadata. We launched a huge customer, and we saw more people using Datadog than ever across the entire organization, yet had no way to tell.

It would be ideal if we had some way to compare arbitrary date times more easily. We would love to use the Diff Graph command against some hard-coded value, for instance, against some known event.

For how long have I used the solution?

I've used the solution for eight months.

What do I think about the scalability of the solution?

The scalability is great!

Which solution did I use previously and why did I switch?

We previously used New Relic. I was not part of the decision-making team that made the switch.

What was our ROI?

The ROI is the speed at which we can debug live sites. It has been excellent. It's amazing how many incidents we can capture before customers notice.

Which other solutions did I evaluate?

We looked into New Relic and a home-brewed solution as potential other options.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer0962486 - PeerSpot reviewer
Head of Product Design at hackajob
User
Good alerts and detailed data but needs UI improvements
Pros and Cons
  • "Session recordings have been the most valuable to me as they help me gain insights into user behaviour at scale."
  • "In terms of UI, everything is very small, which makes it quite difficult to navigate at times."

What is our primary use case?

I work in product design, and although we use Datadog for monitoring, etc., my use case is different as I mostly review and watch session recordings from users to gain insight into user feedback.

We watch multiple sessions per week to understand how users are using our product. From this data, we are able to hone in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.

How has it helped my organization?

Using Datadog has allowed us to review detailed interactions of users at a scale that leads us to make informed data-driven UX improvements as mentioned above.

Being able to pinpoint specific users via filtering is also very useful as it means when we have direct feedback from a specific user, we can follow up by watching their session back. 

The engineering team's use case for Datadog is alerting, which is also very useful for us as it gives us visibility into how stable our platform is through various lenses.

What is most valuable?

Session recordings have been the most valuable to me as they help me gain insights into user behaviour at scale. By capturing real-time interactions, such as clicks, scrolls, and navigation paths, we can identify patterns and trends across a large user base. This helps us pinpoint usability issues, optimize the user experience, and improve the overall experience for our users. Analyzing these recordings enables us to make data-driven decisions that enhance both functionality and user satisfaction.

What needs improvement?

I'd like the ability to see more in-depth actions on user sessions, such as where there are specific problems. Rather than having to watch numerous session recordings to understand where this happens, I'd like to get alerts/notifications about specific areas that users are struggling with, such as rage clicks.

In terms of UI, everything is very small, which makes it quite difficult to navigate at times, especially in terms of accessibility, so I'd love for there to be more attention on this.

For how long have I used the solution?

I've used the solution for over one year.

Which solution did I use previously and why did I switch?

We did not evaluate other options. 

What's my experience with pricing, setup cost, and licensing?

I wasn't part of the decision-making process during licensing.

Which other solutions did I evaluate?

I wasn't part of the decision-making process during the evaluation stage.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
SecOps Engineer at Ava Labs
User
Helpful support, with centralized pipeline tracking and error logging
Pros and Cons
  • "Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most."
  • "While the documentation is very good, there are areas that need a lot of focus to pick up on the key details."

What is our primary use case?

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. 

How has it helped my organization?

Through the use of Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps, and Datadog ties them all together in cohesive dashboards. 

What is most valuable?

The centralized pipeline tracking and error logging provide a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly. 

Synthetic testing is great, allowing us to catch potential problems before they impact real users. Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders. 

What needs improvement?

While the documentation is very good, there are areas that need a lot of focus to pick up on the key details. In some cases the screenshots don't match the text when updates are made. 

I spent longer than I should have trying to figure out how to correlate logs to traces, mostly related to environment variables.
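
To give a sense of what that correlation involves, here is a minimal sketch, assuming a service that writes JSON logs by hand. `DD_ENV`, `DD_SERVICE`, and `DD_VERSION` are Datadog's unified service tagging variables, and the `dd.*` attribute names follow the convention Datadog uses to link a log line to a trace. The function names, fallback values, and IDs are hypothetical, and in practice the tracing library usually injects these fields for you.

```python
import json
import os


def service_tags(environ=os.environ):
    """Read the unified service tags; the fallbacks are hypothetical."""
    return {
        "dd.env": environ.get("DD_ENV", "dev"),
        "dd.service": environ.get("DD_SERVICE", "example-api"),
        "dd.version": environ.get("DD_VERSION", "0.0.0"),
    }


def format_log(message, trace_id, span_id, environ=os.environ):
    """Return a JSON log line carrying the IDs needed to join it to a trace."""
    record = {
        "message": message,
        "dd.trace_id": str(trace_id),
        "dd.span_id": str(span_id),
    }
    record.update(service_tags(environ))
    return json.dumps(record, sort_keys=True)


# Example with made-up IDs and an explicit environment mapping.
line = format_log(
    "payment failed",
    trace_id=1234,
    span_id=5678,
    environ={"DD_ENV": "staging", "DD_SERVICE": "checkout", "DD_VERSION": "1.4.2"},
)
```

If the environment, service, or version values differ between the logs and the traces, the two will not line up, which is one common reason this setup takes longer than expected.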

For how long have I used the solution?

I've used the solution for about three years.

What do I think about the stability of the solution?

We have been impressed with the uptime.

What do I think about the scalability of the solution?

It's scalable and customizable. 

How are customer service and support?

Support is helpful. They help us tune our committed costs and alert us when we start spending out of the on-demand budget.

Which solution did I use previously and why did I switch?

We used a mix of SolarWinds, UptimeRobot, and GitHub Actions. We switched to find one platform that could give deep app visibility.

How was the initial setup?

Setup is generally simple. .NET Profiling of IIS and aligning logs to traces and profiles was a challenge.

What about the implementation team?

We implemented the solution in-house.

What was our ROI?

There has been significant time saved by the development team in terms of assessing bugs and performance issues.

What's my experience with pricing, setup cost, and licensing?

I'd advise others to set up live trials to assess cost scaling. Small decisions around how monitors are used can have big impacts on cost scaling.

Which other solutions did I evaluate?

New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

What other advice do I have?

We are excited to dig further into the new offerings around LLM and continue to grow our footprint in Datadog. 

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Datadog Report and get advice and tips from experienced pros sharing their opinions.
Updated: November 2024