Kenneth Dozier - PeerSpot reviewer
Associate Software Engineer at H&R Block, Inc.
User
Easy to use with good speed and helpful dashboards
Pros and Cons
  • "Watchdog is a favorite feature among a lot of the devs. It catches things they didn't even know were an issue."
  • "I would like to see the integration between PagerDuty and Datadog improved. The tags in Datadog don't match those in PagerDuty, and we have to make it work."

What is our primary use case?

We are using Datadog to improve our cloud monitoring and observability across our enterprise apps.  We have integrated a lot of different resources into Datadog, like Kubernetes, App Gateways, App Service Environments, App Service Plans, and other Web App resources. 

I will be using the monitoring and observability features of Datadog. Dashboards are used very heavily by teams and SREs. We really have seen that Datadog has already improved both our monitoring and our observability.

How has it helped my organization?

The ease and speed with which you can create a dashboard has been a huge improvement.

The different types of monitors we can create have been huge, too. We can do so many different things with monitors that we couldn't do before with our alerts. 

Being able to click on a trace or log and drill down on it to see what happened has been great.  

Some have found the learning curve a bit steep. That said, they are coming around slowly. There is just a lot of information to learn to navigate.

What is most valuable?

The different types of monitors have been very valuable. We have been able to make our alerts (monitors) more actionable than we were able to previously.  

Watchdog is a favorite feature among a lot of the devs. It catches things they didn't even know were an issue. 

RUM is another feature a lot of us are looking forward to seeing how it can help us improve our customer experience during tax season.  

We hope to enable the code review feature at some point so we can see what code caused the issue.

What needs improvement?

I would like to see the integration between PagerDuty and Datadog improved. The tags in Datadog don't match those in PagerDuty, and we have to make it work. Also, I would like it to be easier to replicate a KQL query in Datadog.

I would like to see the alert communications to email or phones made better so we could hopefully move off PagerDuty and just use Datadog for that. 

There are also a lot of features that we haven't budgeted for yet and I would like for us to be able to use them in the future.

Buyer's Guide
Datadog
December 2024
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
825,399 professionals have used our research since 2012.

For how long have I used the solution?

I've used the solution for about two years.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: H&R Block has recently signed with Datadog.
Sid Nigam - PeerSpot reviewer
Works at RAPDEV LLC
User
Top 20
Unified platform with customizable dashboards and AI-driven insights
Pros and Cons
  • "The infrastructure monitoring capabilities, especially for our Kubernetes clusters, have helped us optimize resource allocation and reduce costs."
  • "We'd like to see more advanced incident management capabilities integrated directly into the platform."

What is our primary use case?

Our primary use case for this solution is comprehensive cloud monitoring across our entire infrastructure and application stack. 

We operate in a multi-cloud environment, utilizing services from AWS, Azure, and Google Cloud Platform. 

Our applications are predominantly containerized and run on Kubernetes clusters. We have a microservices architecture with dozens of services communicating via REST APIs and message queues. 

The solution helps us monitor the performance, availability, and resource utilization of our cloud resources, databases, application servers, and front-end applications. 

It's essential for maintaining high availability, optimizing costs, and ensuring a smooth user experience for our global customer base. We particularly rely on it for real-time monitoring, alerting, and troubleshooting of production issues.

How has it helped my organization?

Datadog has significantly improved our organization by providing us with great visibility across the entire application stack. This enhanced observability has allowed us to detect and resolve issues faster, often before they impact our end-users. 

The unified platform has streamlined our monitoring processes, replacing several disparate tools we previously used. This consolidation has improved team collaboration and reduced context-switching for our DevOps engineers. 

The customizable dashboards have made it easier to share relevant metrics with different stakeholders, from developers to C-level executives. We've seen a marked decrease in our mean time to resolution (MTTR) for incidents, and the historical data has been invaluable for capacity planning and performance optimization. 

Additionally, the AI-driven insights have helped us proactively identify potential issues and optimize our infrastructure costs.

What is most valuable?

We've found the Application Performance Monitoring (APM) feature to be the most valuable, as it provides great visibility on trace-level data. This granular insight allows us to pinpoint performance bottlenecks and optimize our code more effectively. 

The distributed tracing capability has been particularly useful in our microservices environment, helping us understand the flow of requests across different services and identify latency issues. 

Additionally, the log management and analytics features have greatly improved our ability to troubleshoot issues by correlating logs with metrics and traces. 

The infrastructure monitoring capabilities, especially for our Kubernetes clusters, have helped us optimize resource allocation and reduce costs.

What needs improvement?

While Datadog is an excellent monitoring solution, it could be improved by building more features to replace alerting apps like OpsGenie and PagerDuty. Specifically, we'd like to see more advanced incident management capabilities integrated directly into the platform. This could include features like sophisticated on-call scheduling, escalation policies, and incident response workflows. 

Additionally, we'd appreciate more customizable machine learning-driven anomaly detection to help us identify unusual patterns more accurately. Improved support for serverless architectures, particularly for monitoring and tracing AWS Lambda functions, would be beneficial. 

Enhanced security monitoring and threat detection capabilities would also be valuable, potentially reducing our reliance on separate security information and event management (SIEM) tools.

For how long have I used the solution?

I've used the solution for two years.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer9816413 - PeerSpot reviewer
Engineering Manager at Video Blocks
User
Top 20
Easy, more reliable, and transparent monitoring
Pros and Cons
  • "Monitors have also been very valuable when setting up our on-call processes. It makes it easy to set up and adjust alerting to keep our teams aware of anything going wrong."
  • "One thing to improve would be making it easier to see common patterns across traces."

What is our primary use case?

We use the solution to monitor and investigate issues with production services at work. We're periodically reviewing the service catalog view for the various applications and I use it to identify any anomalies with service metrics, any changes in user behavior evident via API calls, and/or spikes in errors.  

We use monitors to trigger alerts for on-call engineers to act upon. The monitors have set thresholds for request latency, error rates, and throughput. 
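As a rough sketch of the threshold idea described above, the following plain-Python example checks observed metrics against limits and reports which ones would alert. It does not use Datadog's API; the metric names, limits, and the `Threshold` and `breached` helpers are all hypothetical and only illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str       # hypothetical metric name
    limit: float      # value at which the alert should trigger
    alert_when: str   # "above" or "below" the limit


def breached(thresholds, observed):
    """Return the names of metrics whose observed value crosses its limit."""
    alerts = []
    for t in thresholds:
        value = observed.get(t.metric)
        if value is None:
            continue  # no data for this metric, so nothing to compare
        too_high = t.alert_when == "above" and value > t.limit
        too_low = t.alert_when == "below" and value < t.limit
        if too_high or too_low:
            alerts.append(t.metric)
    return alerts


# Made-up limits for latency, error rate, and throughput.
thresholds = [
    Threshold("request_latency_p95_ms", 500.0, "above"),
    Threshold("error_rate", 0.02, "above"),
    Threshold("throughput_rps", 50.0, "below"),
]

# Made-up observations for one evaluation window.
observed = {
    "request_latency_p95_ms": 620.0,
    "error_rate": 0.01,
    "throughput_rps": 40.0,
}

print(breached(thresholds, observed))
# ['request_latency_p95_ms', 'throughput_rps']
```

Throughput alerts on a drop rather than a spike, which is why each threshold carries its own direction.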

We also use automated rules to block bad actors based on request volume or patterns.

How has it helped my organization?

Datadog has made setting up monitors easier, more reliable, and more transparent. This has helped standardize our on-call process and set all of our on-call engineers up for success.  

It has also standardized the way we evaluate issues with our applications by encouraging all teams to use the service catalog.  

It makes it easier for our platforms and QA teams to get other engineering teams up to speed with managing their own applications' performance. 

Overall, Datadog has been very helpful for us.

What is most valuable?

The service catalog view is very helpful for periodic reviews of our application. It has also standardized the way we evaluate issues with our applications.  Having one page with an easy-to-scan view of app metrics, error patterns, package vulnerabilities, etc., is very helpful and reduces friction for our full-stack engineers.

Monitors have also been very valuable when setting up our on-call processes. It makes it easy to set up and adjust alerting to keep our teams aware of anything going wrong.

What needs improvement?

Datadog is great overall. One thing to improve would be making it easier to see common patterns across traces. I sometimes end up in a trace but have a hard time finding other common features about the error/requests that are similar to that trace. This could be easier to get to; however, in that case, it's actually an education issue.  

Another thing that could be improved is that the service list page sometimes refreshes slowly, and I accidentally click the wrong environment since the sort changes late.

For how long have I used the solution?

I've used the solution for about a year.

What do I think about the stability of the solution?

It is very stable. I have not seen any issues with Datadog.

What do I think about the scalability of the solution?

It seems very scalable.

How are customer service and support?

I've had no specific experience with technical support.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We used Honeycomb before. We switched since Datadog offered more tooling.

How was the initial setup?

Each application has been easy to instrument.

What about the implementation team?

We implemented the solution in-house.

What was our ROI?

Engineers save an unquantifiable amount of time by having one standard view for all applications and monitors.

What's my experience with pricing, setup cost, and licensing?

I am not exposed to this aspect of Datadog.

Which other solutions did I evaluate?

We did not evaluate other options. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Dmitri Panfilov - PeerSpot reviewer
Software Engineer at Redfin Corp
User
Top 20
Easy dashboard creation and alarm monitoring with a good ROI
Pros and Cons
  • "The ease of dashboard creation and alarm monitoring has helped us not only stay competitive but be industry leaders in performance."
  • "The product can be improved by allowing the grouping of APIs to add variables. That way, any API with a unique ID could be grouped together."

What is our primary use case?

We use the solution to monitor production service uptime/downtime, latency, and log storage. 

Our entire monitoring infrastructure runs off Datadog, so all our alarms are configured with it. We also use it for tracing API performance to find the biggest regression points. 

Finally, we use it to compare our performance on SEO metrics against competitors. This is a primary use case, as SEO dictates our position in Google traffic, which drives a large portion of our customer views, so it is a vital part of the business that we rely on Datadog for.

How has it helped my organization?

The product improved the organization primarily by providing consistent data with virtually zero downtime. This was a problem we had with an old provider. It also made it easy to transition an otherwise massive migration involving hundreds of alarms. 

The training provided was crucial, along with having a dedicated team that can forward our requests to and from Datadog efficiently. Without that, we may have never transitioned to Datadog in the first place since it is always hard to lead a migration for an entire company.

What is most valuable?

The API tracing has been massive for debugging latency regressions and improving the performance of our least performant APIs. Through tracing, we managed to find the slowest step of an API, improve its latency, and iterate on the process until we had our desired timings. This is important for improving our SEO, as LCP and INP draw directly from the numbers we see on Datadog for our API timings. 

The ease of dashboard creation and alarm monitoring has helped us not only stay competitive but be industry leaders in performance.

What needs improvement?

The product can be improved by allowing the grouping of APIs to add variables. That way, any API with a unique ID could be grouped together. 
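To illustrate the kind of grouping being asked for, here is a minimal Python sketch that replaces ID-like path segments with a placeholder so that calls to the same API collapse into one group. This is not a Datadog feature or API; the heuristic (digits or a UUID count as an ID), the `{id}` placeholder, and the sample paths are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Assumed heuristic: a path segment that is all digits or looks like a UUID
# is treated as a unique ID.
ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def normalize_path(path):
    """Replace each ID-like segment of a URL path with the '{id}' variable."""
    segments = path.split("/")
    return "/".join("{id}" if ID_SEGMENT.match(s) else s for s in segments)


def group_by_route(paths):
    """Group concrete request paths under their normalized route."""
    groups = defaultdict(list)
    for path in paths:
        groups[normalize_path(path)].append(path)
    return dict(groups)


# Made-up request paths.
paths = [
    "/api/homes/123/photos",
    "/api/homes/456/photos",
    "/api/agents/9",
]

for route, members in group_by_route(paths).items():
    print(route, len(members))
# /api/homes/{id}/photos 2
# /api/agents/{id} 1
```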

Furthermore, SEO monitoring has been crucial for us but also difficult to set up, as comparing alarms between us and competitors is a tough feat. The data is not always consistent, so we have been experimenting with removing the noise in Datadog, but it's been taking a while. 

Finally, Datadog should have a feature that reports stale alarms based on activity.

For how long have I used the solution?

I've used the solution for six months.

What do I think about the stability of the solution?

It's very stable, and we have not experienced any downtime on Datadog.

What do I think about the scalability of the solution?

Datadog scales well; increased volume has not seemed to slow it down.

How are customer service and support?

We haven't talked to the support team. 

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We switched to Datadog because our previous provider had very inconsistent logging. Our alarms would often not fire when our services were not working because the provider had a logging problem.

How was the initial setup?

The initial setup was somewhat complex due to the built-in monitoring with services. This is not always super comprehensive and has to be studied, as opposed to other metrics platforms that just service all your endpoints, which you can then trace with Grafana.

What about the implementation team?

We implemented the solution through an in-house team.

What was our ROI?

The ROI is good.

What's my experience with pricing, setup cost, and licensing?

Users must try to understand the way Datadog alarms work off the bat so that they can minimize the requirements for expensive features like custom metrics. 

It can sometimes be tempting to use them; however, it is not always necessary as you migrate to Datadog, as they are a provider that treats alarms somewhat differently than you may be used to.

Which other solutions did I evaluate?

We have evaluated New Relic, Grafana, Splunk, and many more in our quest to find the best monitoring provider.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Victor Chen1 - PeerSpot reviewer
Software Engineer at Zip
Vendor
Good for log ingestion and analyzing logs with easy searchability of data
Pros and Cons
  • "The feature I've found most valuable is the log search feature."
  • "More helpful log search keywords/tips would be helpful in improving Datadog's log dashboard."

What is our primary use case?

We use Datadog as our main log ingestion source, and Datadog is one of the first places we go to for analyzing logs. 

This is especially true for cases of debugging, monitoring, and alerting on errors and incidents, as we send traffic logs from K8s, Amazon Web Services, and many other services at our company to Datadog. In addition, many products and teams at our company have dashboards for monitoring statistics (sometimes based on these logs directly, other times we set queries for these metrics) to alert us if there are any errors or health issues.

How has it helped my organization?

Overall, at my company, Datadog has made it easy to search for and look up logs at an impressively quick search rate over a large amount of logs. 

It seamlessly allows you to set up monitoring and alerting directly from log queries which is convenient and helps for a good user experience, and while there is a bit of a learning curve, given enough time a majority of my company now uses Datadog as the first place to check when there are errors or bugs. 

However, the cost aspect of Datadog is tricky to gauge because it's related to usage, and thus, it is hard to tell the relative value of Datadog year to year.

What is most valuable?

The feature I've found most valuable is the log search feature. It's set up with our ingestion to be a quick one-stop shop, is reliable and quick, and seamlessly integrates into building custom monitors and alerts based on log volume and timeframes. 

As a result, it's easy to leverage this to triage bugs and errors, since we can pinpoint the logs around the time that they occur and get metadata/context around the issue. This is the main feature that I use the most in my workflow with Datadog to help debug and triage issues.

What needs improvement?

More helpful log search keywords and tips would improve Datadog's log dashboard. I recently struggled a lot to parse text from raw log lines that didn't seem to match directly with facets. There are presumably smart searching capabilities; however, it's not intuitive to learn how to leverage them, and I instead had to resort to a Python script to do some simple regex parsing. (I was trying to parse "file:folder/*/*" from the logs and didn't seem to be able to do this in Datadog. Maybe I'm just not familiar enough with the logs, but I couldn't easily find resources on how to do this either.) 
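As a rough illustration of the Python workaround mentioned above, a short script can pull paths of the form "file:folder/*/*" out of raw log lines. The review does not show the real log format or the actual script, so the pattern, the `extract_paths` helper, and the sample lines below are assumptions.

```python
import re

# Hypothetical pattern: "file:" followed by "folder/" and exactly two more
# path segments, standing in for the two "*" wildcards in the example.
FILE_PATTERN = re.compile(r"file:(folder/[^/\s]+/[^/\s]+)")


def extract_paths(lines):
    """Return every matching path found in the given raw log lines."""
    paths = []
    for line in lines:
        paths.extend(FILE_PATTERN.findall(line))
    return paths


# Made-up sample lines, for illustration only.
sample_logs = [
    "2024-11-02 10:15:01 INFO loaded file:folder/reports/q3.csv in 120ms",
    "2024-11-02 10:15:02 WARN nothing to parse here",
    "2024-11-02 10:15:03 ERROR failed on file:folder/tmp/a.json retrying",
]

print(extract_paths(sample_logs))
# ['folder/reports/q3.csv', 'folder/tmp/a.json']
```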

For how long have I used the solution?

I've used the solution for 10 months.

What's my experience with pricing, setup cost, and licensing?

Beware that the cost will fluctuate (and it often only gets more expensive very quickly).

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Head of Software at Emporia
User
Top 10
Great for web application log aggregation, performance tracing, and alerting
Pros and Cons
  • "Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most."
  • "I'd like to see an expansion of the Android and iOS apps to have a simplified CI/CD pipeline history view."

What is our primary use case?

Our primary use case is for custom and vendor-supplied web application log aggregation, performance tracing, and alerting. 

We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications. Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. 

Datadog agents are on each web host, and we have native integrations with GitHub, AWS, and Azure to get all of our instrumentation and error data in one place for easy analysis and monitoring.

How has it helped my organization?

Through use of Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps. Datadog ties them all together in cohesive dashboards. Whether the app is vendor-supplied or we built it ourselves, the depth of tracing, profiling, and hooking into logs is all obtainable and tunable. Both legacy .NET Framework and Windows Event Viewer and cutting-edge .NET Core with streaming logs all work. The breadth of coverage for any app type or situation is really incredible. It feels like there's nothing we can't monitor.

What is most valuable?

When it comes to Datadog, several features have proven particularly valuable. 

The centralized pipeline tracking and error logging provide a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly. 

Synthetic testing has been a game-changer, allowing us to catch potential problems before they impact real users. 

Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders. 

Together, these features form a powerful toolkit that helps us maintain high performance and reliability across our applications and infrastructure, ultimately leading to better user satisfaction and more efficient operations.

What needs improvement?

I'd like to see an expansion of the Android and iOS apps to have a simplified CI/CD pipeline history view. 

I like the idea of monitoring on the go, yet it seems the options are still a bit limited out of the box. 

While the documentation is very good considering all the frameworks and technology Datadog covers, there are areas - specifically .NET Profiling and Tracing of IIS hosted apps - that need a lot of focus to pick up on the key details needed. 

In some cases the screenshots don't match the text as updates are made.

For how long have I used the solution?

I've been using the solution for about three years.

What do I think about the stability of the solution?

We have been impressed with the uptime. It offers clean and light resource usage of the agents.

What do I think about the scalability of the solution?

The solution scales well and is customizable. 

How are customer service and support?

Customer support is always helpful in tuning our committed costs and alerting us when we start spending out of the on-demand budget.

Which solution did I use previously and why did I switch?

We used a mix of a custom error email system, SolarWinds, UptimeRobot, and GitHub actions. We switched to find one platform that could give deep app visibility regardless of whether it is Linux or Windows or Container, cloud or on-prem hosted.

How was the initial setup?

The implementation is generally simple. .NET Profiling of IIS and aligning logs to traces and profiles was a challenge.

What about the implementation team?

We implemented the setup in-house. 

What was our ROI?

We've witnessed significant time saved by the development team assessing bugs and performance issues.

What's my experience with pricing, setup cost, and licensing?

Set up live trials to assess cost scaling. Small decisions around how monitors are used can have big impacts on cost scaling. 

Which other solutions did I evaluate?

New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

What other advice do I have?

We're excited to dig further into the new offerings around LLM and continue to grow our footprint in Datadog. 

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Head of Software at Emporia
User
Top 10
Good centralized pipeline tracking and error logging with very good performance
Pros and Cons
  • "Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most."
  • "In some cases the screenshots don't match the text as updates are made."

What is our primary use case?

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. 

We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications. 

Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. 

Datadog agents on each web host and native integrations with GitHub, AWS, and Azure get all of our instrumentation and error data in one place for easy analysis and monitoring.

How has it helped my organization?

Using Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps, and Datadog ties them all together in cohesive dashboards. 

Whether the app is vendor-supplied or we built it ourselves, the depth of tracing, profiling, and hooking into logs is all obtainable and tunable. Both legacy .NET Framework and Windows Event Viewer and cutting-edge .NET Core with streaming logs all work. 

The breadth of coverage for any app type or situation is really incredible. It feels like there's nothing we can't monitor.

What is most valuable?

When it comes to Datadog, several features have proven particularly valuable. For example, the centralized pipeline tracking and error logging provide a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly. 

Synthetic testing has been a game-changer, allowing us to catch potential problems before they impact real users. 

Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders. 

Together, these features form a powerful toolkit that helps us maintain high performance and reliability across our applications and infrastructure, ultimately leading to better user satisfaction and more efficient operations.

What needs improvement?

They need an expansion of the Android and iOS apps to provide a simplified CI/CD pipeline history view. 

I like the idea of monitoring on the go. That said, it seems the options are still a bit limited out of the box. 

While the documentation is very good considering all the frameworks and technology Datadog covers, there are areas - specifically .NET Profiling and Tracing of IIS hosted apps - that need a lot of focus to pick up on the key details needed. 

In some cases the screenshots don't match the text as updates are made. I spent longer than I should figuring out how to correlate logs to traces, mostly related to environmental variables.
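The review does not say which environment variables caused the trouble. As a minimal sketch, assuming the setup relies on Datadog's unified service tagging variables (`DD_ENV`, `DD_SERVICE`, `DD_VERSION`) plus `DD_LOGS_INJECTION`, a small check like the following can flag which of them are unset before debugging further. The `missing_correlation_vars` helper is hypothetical, and the right set of variables for a given tracer and host should be confirmed against the documentation.

```python
import os

# Assumed list of variables involved in correlating logs with traces.
REQUIRED_VARS = ("DD_ENV", "DD_SERVICE", "DD_VERSION", "DD_LOGS_INJECTION")


def missing_correlation_vars(environ=None):
    """Return the names of the assumed variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_correlation_vars()
    if missing:
        print("Unset or empty:", ", ".join(missing))
    else:
        print("All assumed correlation variables are set.")
```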

For how long have I used the solution?

I've used the solution for about three years.

What do I think about the stability of the solution?

We have been impressed with the uptime and clean and light resource usage of the agents.

What do I think about the scalability of the solution?

The solution has been very scalable and very customizable.

How are customer service and support?

Support is always helpful in tuning our committed costs and alerting us when we start spending out of the on-demand budget.

Which solution did I use previously and why did I switch?

We used a mix of a custom error email system, SolarWinds, UptimeRobot, and GitHub actions. We switched to find one platform that could give deep app visibility regardless of Linux or Windows or Container, cloud or on-prem hosted.

How was the initial setup?

The implementation is generally simple. That said, .NET Profiling of IIS and aligning logs to traces and profiles was a challenge.

What about the implementation team?

The solution was implemented in-house. 

What was our ROI?

Our ROI has been significant time saved by the development team assessing bugs and performance issues.

What's my experience with pricing, setup cost, and licensing?

Set up live trials to assess cost scaling. Small decisions around how monitors are used can impact cost scaling. 

Which other solutions did I evaluate?

New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

What other advice do I have?

We are excited to explore the new offerings around LLM further and continue to expand our presence in Datadog. 

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Neil Elver - PeerSpot reviewer
Application Development Team Lead at TCS EDUCATION SYSTEM
User
Top 10
Good synthetic testing, centralized pipeline tracking and error logging
Pros and Cons
  • "Synthetic testing has been a game-changer, allowing us to catch potential problems before they impact real users."
  • "I'd like to see an expansion of the Android and iOS apps to have a simplified CI/CD pipeline history view."

What is our primary use case?

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. 

We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications. 

Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. Datadog agents on each web host and native integrations with GitHub, AWS, and Azure get all of our instrumentation and error data in one place for easy analysis and monitoring.

How has it helped my organization?

Through the use of Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps, and Datadog ties them all together in cohesive dashboards. Whether the app is vendor-supplied or we built it ourselves, the depth of tracing, profiling, and hooking into logs is all obtainable and tunable. Both legacy .NET Framework and Windows Event Viewer and cutting-edge .NET Core with streaming logs all work. The breadth of coverage for any app type or situation is really incredible. It feels like there's nothing we can't monitor.

What is most valuable?

When it comes to Datadog, several features have proven particularly valuable. 

The centralized pipeline tracking and error logging provide a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly. 

Synthetic testing has been a game-changer, allowing us to catch potential problems before they impact real users. Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders. 

Together, these features form a powerful toolkit that helps us maintain high performance and reliability across our applications and infrastructure, ultimately leading to better user satisfaction and more efficient operations.

What needs improvement?

I'd like to see an expansion of the Android and iOS apps to have a simplified CI/CD pipeline history view. I like the idea of monitoring on the go; however, it seems the options are still a bit limited out of the box. 

While the documentation is very good considering all the frameworks and technology Datadog covers, there are areas - specifically .NET Profiling and Tracing of IIS-hosted apps - that need a lot of focus to pick up on the key details needed. In some cases the screenshots don't match the text as updates are made. I feel I spent longer than I should figuring out how to correlate logs to traces, mostly related to environmental variables.

For how long have I used the solution?

I've used the solution for about three years.

What do I think about the stability of the solution?

We have been impressed with the uptime and clean and light resource usage of the agents.

What do I think about the scalability of the solution?

The solution was very scalable and very customizable.

How are customer service and support?

Sales service is always helpful in tuning our committed costs and alerting us when we start spending outside the on-demand budget.

Which solution did I use previously and why did I switch?

We used a mix of a custom error email system, SolarWinds, UptimeRobot, and GitHub actions. We switched to find one platform that could give deep app visibility regardless of Linux, Windows, Container, cloud or on-prem hosted.

How was the initial setup?

The setup is generally simple. That said, .NET Profiling of IIS and aligning logs to traces and profiles was a challenge.

What about the implementation team?

The solution was implemented in-house. 

What was our ROI?

I'd count our ROI as significant time saved by the development team assessing bugs and performance issues.

What's my experience with pricing, setup cost, and licensing?

It's a good idea to set up live trials to assess cost scaling. Small decisions around how monitors are used can have big impacts on cost scaling. 

Which other solutions did I evaluate?

New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

What other advice do I have?

We are excited to dig further into the new offerings around LLM and continue to grow our footprint in Datadog. 

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.