What is your primary use case for Datadog?

Datadog is a comprehensive cloud monitoring platform designed to track performance, availability, and log aggregation for cloud resources like AWS, ECS, and Kubernetes. It offers robust tools for creating dashboards, observing user behavior, alerting, telemetry, security monitoring, and synthetic testing. Datadog supports full observability across cloud providers and environments, enabling troubleshooting, error detection, and performance analysis to maintain system reliability. It offers...

Download Datadog Report Read more

Related Q&As

Apr 9, 2024

What are the main differences between vRealize Log Insight and Datadog?

Feb 27, 2024

What are your thoughts about SAT DogQ (SAT - Software Automated Testing)?

score 0 · Answer 1 · 2025-01-07T18:03:00Z

We use Datadog for monitoring and observing all of our systems, which range in complexity from lightweight, user-facing serverless lambda functions with millions of daily calls to huge, monolithic internal applications that are essential to our core operations. The value we derive from Datadog stems from its ability to handle and parse a massive volume of incoming data from many different sources and tie it together into a single, informative view of reliability and performance across our architecture.

reviewer092526 Director of Engineering at Ordoro · Answer 2 · 2024-10-01T16:54:00Z

We use Datadog for monitoring the performance of our infrastructure across multiple types of hosts in multiple environments. We also use APM to monitor our applications in production. We have some Kubernetes clusters and multi-cloud hosts with Datadog agents installed. We have recently added RUM to monitoring our application from the user side, including replay sessions, and are hoping to use those to replace existing monitoring for errors and session replay for debugging issues in the application.

reviewer1974104 Software Engineering Manager at Finalsite · Answer 3 · 2024-09-19T13:36:00Z

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications. Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. Datadog agents on each web host, and native integrations with GitHub, AWS, and Azure gets all of our instrumentation and error data in one place for easy analysis and monitoring.

reviewer9816413 Engineering Manager at Video Blocks · Answer 4 · 2024-09-19T02:08:00Z

We use the solution to monitor and investigate issues with production services at work. We're periodically reviewing the service catalog view for the various applications and I use it to identify any anomalies with service metrics, any changes in user behavior evident via API calls, and/or spikes in errors. We use monitors to trigger alerts for on-call engineers to act upon. The monitors have set thresholds for request latency, error rates, and throughput. We also use automated rules to block bad actors based on request volume or patterns.

reviewer902462 Senior Software Engineer at Clearstory.build · Answer 5 · 2024-09-18T23:03:00Z

Our primary use case for Datadog is to monitor, analyze, and optimize the performance and health of our applications and infrastructure. We leverage its logging, metrics, and tracing capabilities to pinpoint issues, track system performance, and improve overall reliability. Datadog’s ability to provide real-time insights and alerting on key metrics helps us quickly address issues, ensuring smooth operations. It’s integral for visibility across our microservices architecture and cloud environments.

score 0 · Answer 6 · 2024-09-18T20:43:00Z

The product monitors multiple systems, from customer interactions on our web applications down to the database and all layers in between. RUM, APM, logging, and infrastructure monitoring are all surfaced into single dashboards. We initially started with application logs and generated long-term business metrics out of critical logs. We have turned those metrics and logs into a collection of alerts integrated into our pager system. As we have evolved, we have also used APM and RUM data to trigger additional alerts.

ZJ Software Engineer at a computer software company with 201-500 employees · Answer 7 · 2024-09-18T19:24:00Z

Our primary use case for Datadog involves utilizing its dashboards, monitors, and alerts to monitor several key components of our infrastructure. We track the performance of AWS-managed Airflow pipelines, focusing on metrics like data freshness, data volume, pipeline success rates, and overall performance. In addition, we monitor Looker dashboard performance to ensure data is processed efficiently. Database performance is also closely tracked, allowing us to address any potential issues proactively. This setup provides comprehensive observability and ensures that our systems operate smoothly.

Franz Kettwig Head of Software at Emporia · Answer 8 · 2024-09-18T19:01:00Z

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications. Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. Datadog agents on each web host and native integrations with GitHub, AWS, and Azure get all of our instrumentation and error data in one place for easy analysis and monitoring.

Ishmeet Kaur Software Engineer at Apple · Answer 9 · 2024-09-18T18:36:00Z

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications. We're managing a hybrid multi-cloud solution across hundreds of applications, which is always a challenge. There are Datadog agents on each web host, and native integrations with GitHub, AWS, and Azure and that gets all of our instrumentation and error data in one place for easy analysis and monitoring.

Neil Elver Application Development Team Lead at TCS EDUCATION SYSTEM · Answer 10 · 2024-09-18T18:11:00Z

Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications. Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. Datadog agents on each web host and native integrations with GitHub, AWS, and Azure get all of our instrumentation and error data in one place for easy analysis and monitoring.

Dmitri Panfilov Software Engineer at Redfin Corp · Answer 11 · 2024-09-18T16:55:00Z

We use the solution to monitor production service uptime/downtime, latency, and log storage. Our entire monitoring infrastructure runs off Datadog, so all our alarms are configured with it. We also use it for tracing API performance; what are the biggest regression points. Finally we use it to compare performance on SEO metrics vs competitors. This is a primary use case as SEO dictates our position from google traffic which is a large portion of our customer view generation so it is a vital part of the business we rely on datadog for.

Mason Parry Data Engineer at Nursa · Answer 12 · 2024-09-18T16:26:00Z

We have several teams and several different projects, all working in tandem, so there are a lot of logs and monitoring that need to be done. We use Datadog mostly for alerting when things go down. We also have several dashboards to keep track of critical operations and to make sure things are running without issues. The Slack messaging is essential in our workflow in letting us know when an alert is triggered. I also appreciate all the graphs you can make, as it gives our team a good overview of how our services are doing.

Michael Johnston1 Senior Software Engineer at angel Studios · Answer 13 · 2024-09-18T15:57:00Z

We currently have an error monitor to monitor errors on our prod environment. Once we hit a certain threshold, we get an alert on Slack. This helps address issues the moment they happen before our users notice. We also utilize synthetic tests on many pages on our site. They're easy to set up and are great for pinpointing when a bug is shipped, but they may take down a less visited page that we aren't immediately aware of. It's a great extra check to make sure the code we ship is free of bugs.

Hoon Kang Full Stack Engineer at K HEALTH, INC · Answer 14 · 2024-09-18T15:31:00Z

Our company has a microservice architecture, with different teams in charge of different services. Also, it is a start, which means that we have to build fast and move very fast as well. So before we were properly using DD, we often had issues of things breaking, but without much information on where in our system the breaking happened. This was quite a big-time sync as teams were unfamiliar with other teams' codes, so they needed the help of other teams to debug. This slowed our building down a lot. So implementing dd traces fixed this

Sid Nigam Works at RAPDEV LLC · Answer 15 · 2024-09-18T15:15:00Z

Our primary use case for this solution is comprehensive cloud monitoring across our entire infrastructure and application stack. We operate in a multi-cloud environment, utilizing services from AWS, Azure, and Google Cloud Platform. Our applications are predominantly containerized and run on Kubernetes clusters. We have a microservices architecture with dozens of services communicating via REST APIs and message queues. The solution helps us monitor the performance, availability, and resource utilization of our cloud resources, databases, application servers, and front-end applications. It's essential for maintaining high availability, optimizing costs, and ensuring a smooth user experience for our global customer base. We particularly rely on it for real-time monitoring, alerting, and troubleshooting of production issues.

Tejaswini A Application Engineer at Discover Financial Services · Answer 16 · 2024-06-25T16:25:00Z

We have a tech stack including all backend services written in TS/Node (mostly) and as a full stack engineer, it is crucial to keep track of new and existing errors. Our logs have been consolidated in Datadog and are accessible for search and review, so the service has become a daily tool for my work. More recently, session replay has been adopted at my company, but I do not like it so much because the UI elements are not in their place, so it is very hard to see what the users on the web app are actually clicking on.

Akshay Manchalwar Technical Support Engineer at Cybage Software · Answer 17 · 2024-04-16T08:37:57Z

Akshay Manchalwar

Technical Support Engineer at Cybage Software

Real User

Top 5Leaderboard

Apr 16, 2024

Datadog is mainly used to set up alerts and thresholds to monitor real-time metrics and checks.

score 0 · Answer 18 · 2023-01-25T15:49:08Z

We use Datadog for monitoring to get the traces and logs of all our applications. Datadog provides dashboard and alert capabilities to identify if something is wrong with various teams. More than 200 users, mostly software engineers, work with Datadog.

reviewer2045067 Works · Answer 19 · 2022-12-06T21:07:00Z

Our primary use case would be using the dashboards and getting proper insights based on the dashboards. The monitoring, SLO, and SLA have been better and easier since we started using the Terraform infrastructure. APM has been easier as we had to enable it through the CronJob directly. Profiling has been made easier. We are able to get many insights into the code. Profiling provides really good insights right now. Logs are the most valuable and the best solution so far. Datadog can help solve any slow queries or database-related errors. The primary use case would be using the dashboards and getting proper insights based on the dashboards.

score 0 · Answer 20 · 2022-12-06T21:07:00Z

We mainly use the product to monitor our infrastructure and apps. It is the go-to tool when we want to check that things are running properly. We use Datadog synthetic monitors to ensure our app works across different locations in the United States. We also have set up Datadog monitors to send alerts if things stop working as expected. We use Continuous Integration Pipeline visibility to make sure our developers are not being blocked by infrastructure and other things that might be out of their control.

reviewer2045055 Sr. Software Engineer at Gladly · Answer 21 · 2022-12-06T20:57:00Z

reviewer2045055

Sr. Software Engineer at Gladly

Real User

Dec 6, 2022

Observability is a key use case, as is security.

reviewer2000271 Software Developer at Labviva · Answer 22 · 2022-12-06T20:54:00Z

We’re currently using logging, monitoring, metrics, APM, etc. We've started to use e-SLOs, however, it takes a bit of time to work through those. RUM has been very useful. I have used this in the past to debug problems in production, which has been g great. We also want to start using synthetics and tracing more. Our application currently runs in many different environments based on our customers' requirements. This allows us to see everything in one place and filter by environment as required, which is extremely useful.

score 0 · Answer 23 · 2022-12-06T20:48:00Z

The main use case is observability and reliability as part of a platform/delivery engineering solution. We use the product to assist tenants and clients within the company to get more ramped up on SRE/DevOps.

score 0 · Answer 24 · 2022-12-06T20:44:00Z

reviewer2045043

Software Engineer at a comms service provider with 5,001-10,000 employees

Real User

Dec 6, 2022

We primarily use the product for tracing, metrics, and alarms in various deployment environments.

score 0 · Answer 25 · 2022-12-06T20:40:00Z

reviewer2045034

Sr. Manager - DevOps at a aerospace/defense firm with 10,001+ employees

Real User

Dec 6, 2022

We primarily use the solution for logging and APM, and for real user metrics.

reviewer2045022 Software Engineer at Marqeta · Answer 26 · 2022-12-06T20:26:00Z

We use Datadog to monitor our Kubernetes clusters. We have 3 different clusters for different parts of the SDLC. We run the Datadog agent DaemonSet as well as the Datadog cluster agent. Our services have the APM installed by default. To create monitors, we use Terraform. This is provided out-of-the-box for our service owner. We run EKS on top of K8s, therefore, we also make use of some of the AWS monitoring capabilities that can be integrated into Datadog. We are hugely reliant on Datadog for all aspects of our system.

reviewer2045010 Lead Application Developer at a retailer with 10,001+ employees · Answer 27 · 2022-12-06T20:20:00Z

reviewer2045010

Lead Application Developer at a retailer with 10,001+ employees

Real User

Dec 6, 2022

We primarily use the solution for monitoring and log analysis.

score 0 · Answer 28 · 2022-12-06T20:16:00Z

We primarily use the solution for application monitoring (APM, logs, metrics, alerts). It's useful for active monitoring (static monitors, threshold monitors). We get a lot of value out of anomaly detection as well. SLOs and monitoring of SLOs have been another value add. In terms of metrics, the out-of-the-box infrastructure metrics that come with the Datadog agent installation are great. We have made use of both the custom metrics implementation as well as the log-based metrics which are extremely convenient. We also leverage Datadog for use of RUM and want to explore session replay.

score 0 · Answer 29 · 2022-12-06T19:56:00Z

We primarily use Datadog for alerts. If we're running out of database connections or CPU credits we want to find out in Slack. Datadog provides nice features for that. Secondarily, we use Datadog for analyzing historical trends and forecasting potential issues. I'm trying to learn how to add in Continuous Profiler in our primary backend servers and set up Synthetic Tests for monitoring our front end. Everything is mostly on AWS, and the Datadog integrations help a ton.

score 0 · Answer 30 · 2022-12-06T19:50:00Z

We are providing managed services to our customers across multiple industries. Datadog is key to delivering these services by bringing the observability, monitoring, and alerting capabilities we need to operate at scale. We operate custom cloud native workloads as well as ISV products such as Atlassian Jira or Confluence. Integrating Synthetics, infrastructure, and application performance monitoring, as well as piping all logs through Datadog allows us to operate more with less with good alerting right in time.

score 0 · Answer 31 · 2022-12-06T19:42:00Z

Datadog provides us with a solution for data ingesting for all of our application metrics, resource metrics, APM/tracing data etc. We use it for use in dashboards, monitoring/alerting, SLO targets, incident response etc. We have a lot of applications across multiple languages/frameworks etc., and have deployed in Kubernetes across multiple regions in AWS, along with underlying managed resources such as SQS, Aurora, etc. Datadog makes understanding the state of these seamless. We are a company with millions of daily active users, and this level of detail is excellent.

score 0 · Answer 32 · 2022-12-06T19:26:00Z

reviewer2044953

Senior Engineering Manager,Mobile Wireless Engineering at a comms service provider with 10,001+ employees

Real User

Dec 6, 2022

The product is primarily used for the DevOps team.

reviewer2004336 Software Engineer at a tech vendor with 1,001-5,000 employees · Answer 33 · 2022-10-26T05:30:00Z

We use the solution for application hosting and a little bit of everything when it comes to supporting a worldwide logistics tracking service. It's used as a central service for collecting telemetrics and logs. We find it does the same work as all of our old tools combined, including Prometheus, Kibana, Google Logs, and more; putting all of this information in a single platform makes it easy to corroborate information and associate a request with the data, which might be lost when it is saved as logs.

score 0 · Answer 34 · 2022-10-26T00:18:00Z

We use the solution for monitoring time spent on views and events triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events and multiple entry points in the same part of the application. Knowing the entry point helps us choose which part of the program should be improved next. It also helps us with collecting important data about the overall usage of each module within our application.

score 0 · Answer 35 · 2022-10-26T00:14:00Z

We collect all data logs from all operating systems, such as Windows, Linux, VMware, and bare metal data centers. We also automatize the installation of the agent on servers. Now we are starting a POC to analyze the APM module. In the feature, the next step is to do a POC of security modules. The final idea is to have a unique portal for observability. This will make it easy to troubleshoot and for layer levels 1 and 2.

reviewer2004204 manager at a financial services firm with 501-1,000 employees · Answer 36 · 2022-10-26T00:10:00Z

We use the solution for logs from all our applications. In Datadog, for monitoring logs, our team creates an automation for implementing massive logging in all our systems. Now, we are deploying it in our core systems.

score 0 · Answer 37 · 2022-10-26T00:06:00Z

We use different tools for log collection and monitoring. Using Datadog will combine different use cases into one product that will be easier to manage. The tools we use are open-source, so there is no commercial support. Having customer support would be ideal since we're a small team. Profiling would be another great feature to have. Currently, it's manual. Having Datadog would give us a standard, and we don't have to do much manual work.

score 0 · Answer 38 · 2022-10-26T00:02:00Z

We use the solution for monitoring our logs across distributed clusters. Right now, we have an Elasticsearch solution that is tied to each platform (our product is a PaaS solution). We are looking at moving to a single pane of glass solution, which Datadog would be good for (plus, we could wrap up other tools like Prometheus, Grafana, Pagerduty, Pingdom, and more). We want to be able to have Datadog running on one single cluster and ingesting and processing logs from all our distributed clusters.

reviewer2004192 Lead Support Engineer at a tech vendor with 11-50 employees · Answer 39 · 2022-10-25T23:57:00Z

Our use case is mainly deploying into our applications for monitoring/logging observability. We currently have our microservices feed into an actuator that exists in each instance of our application that extends to a local and central Grafana for client and internal visibility. The application we use is Grafana. Logging captures application and system logs that are ported to each application instance for querying. Whenever anything occurs that is considered unhealthy from a range of health checks, we have notification rules configured internally and externally for a prompt response time.

reviewer2004189 Senior Software Engineer at Marqeta · Answer 40 · 2022-10-25T23:53:00Z

We use this solution to monitor our Kubernetes clusters, nodes, deployments, daemon sets, replica sets, and pods.

score 0 · Answer 41 · 2022-10-25T23:50:00Z

The main use cases are to provide visibility to costs for each product in the company as well as to consolidate all the observability in one tool. We are moving the team from being an operational team that needs to keep the tool up and running (applying patches and resolving problems) to a team that is focused on providing meaningful visibility of the systems, applications, and services of the company. We want to add value where the developers and the systems administrators are not able to focus.

score 0 · Answer 42 · 2022-10-25T23:46:00Z

We use the solution for testing all of our application's endpoints. It is making sure that they work on a consistent basis.

reviewer2004177 Cloud Engineer at a retailer with 51-200 employees · Answer 43 · 2022-10-25T23:34:00Z

I am using the solution for monitoring metrics, logs, traces, etc. It's mainly for making dashboards as well as monitoring our services. We also use Datadog to help centralize our incident management to show the logs, where issues spiked, and some metrics. We use Datadog to do troubleshooting in Kubernetes, specifically in our Azure Kubernetes service. Beyond that, we are looking to use open telemetry in tandem with Datadog to further our log-tracing efforts. In the future, this may be expanded.

score 0 · Answer 44 · 2022-10-25T23:30:00Z

I have been using Datadog products and capabilities increasingly over the last 4 years, from POC to widespread adoption. The capabilities we use are unique for each use case and can be combined in various ways to provide the full observability coverage needed to maintain stable operations and shift from becoming more reactive to proactive. Our organization uses both site/service reliability for the range of backend and frontend services, custom monitoring, and dashboards that can be dynamic and reused for multiple teams.

reviewer2004171 Software Engineer at Lattice · Answer 45 · 2022-10-25T23:26:00Z

We use actual user monitoring and have set up thresholds for alerts to PagerDuty, Sentry, Slack, and so on. We also have dashboards set up for tracking latency and error rates. As an individual contributor, I also try to set up dashboards for the individual feature projects I work on. I'd like to learn more ways to use this, though, especially when it comes to more proactive approaches to issues. A starter pack of common-use types would be nice.

score 0 · Answer 46 · 2022-10-25T23:23:00Z

Our use case is to provide cloud organization application monitoring. I use it for insight into what host in what region has activity or what market is using Datadog to its fullest potential and utilizing that for cost. This may also help determine who is using monitoring and setting alerts or just setting up monitoring and not doing anything about it. The use case can also be to check when the host or applications are down, or if the usage of CPU, memory, etc, is too high.

Jon Schwartz Senior Software Engineer at LeafLink · Answer 47 · 2022-10-25T22:50:00Z

We use Datadog to view and aggregate logs and monitor all of our services. We have a lot of running infrastructure and it is very convenient to have logs and metrics all aggregated somewhere we can view and chart them. I use Datadog to create dashboards and runbooks, and sharable graphs, which really help out my whole team. We mostly use logs and APM, yet have been starting to use other products. I would like to use more synthetic monitors.

reviewer2004069 support Eng · Answer 48 · 2022-10-25T22:47:00Z

We use the application for our application monitoring, data security monitoring, and log management. What we like about the application is that it helps us to track issues more proactively instead of reactively. There are other improvements we would like to see. 1. Being able to restrict users from seeing or viewing specific dashboards once they log in 2. They can cut down the prices for Cloud SIEM. It seems very useful, however, the prices are high. Some organizations are finding it difficult to make decisions in terms of getting the tool.

score 0 · Answer 49 · 2022-10-25T22:42:00Z

We currently use it for log aggregation and SEIM. We send logs from our AWS account (particularly our Cloudtrail and S3 logs) and use them to give us security signals. This has helped with our SOC2 certification process and has given us a window into our processes and the security holes in our system. We are also considering using the APM features to help with our development effort. We want to be able to profile all of our code and see what is going on with it.

reviewer2004024 SRE at S&P Global · Answer 50 · 2022-10-25T22:38:00Z

We deploy various services for our main platform on AWS across multiple regions. We have a development environment, a staging environment, a QA environment, and a production environment. We deploy our many services across hundreds of instances. We have many server farms, all responsible for various services on our market intelligence platform. The deployment of each server farm or even individual instances varies depending on what stood up. We have instances built in three different ways, with two different pipelines and some even on user data scripts.

reviewer2004021 Associate at a financial services firm with 10,001+ employees · Answer 51 · 2022-10-25T22:34:00Z

We use the product for recording loggers on our various services across different teams. For example, we use logs to keep track of info logs for events and error logs to catch exceptions. When users ask us to investigate a situation, we use logs to keep track of events and where the user's code traveled to. We also use synthetic testing and monitoring features to keep track of our many alerts in the production and QA environments.

Ian Schell Senior Site Reliability Architect at Cvent · Answer 52 · 2022-10-25T22:28:00Z

We use Datadog for general observability into our infrastructure, as well as running analytics queries for our SLI/SLO platform. This helps all of our teams be informed of how well their products are actually performing in production, and aim their efforts at the thing that will provide the highest ROI. We also use it for general monitoring and alerting during load tests and service releases to detect any issues related to the deployments. This helps us maintain our high contractual uptime promises to our clients.

score 0 · Answer 53 · 2022-10-25T22:23:00Z

We use it to monitor and alert our ECS instances as well as other AWS services, including DynamoDB, API Gateway, etc. We have it connected to Pagerduty for alerting all our cloud applications. We also use custom RUM monitoring and synthetic tests for both our internal and public-facing websites. For our cloud applications, we can use Datadog to define our SLOs, and SLIs and generate dashboards that are used to monitor SLOs and report them to our senior leadership.

score 0 · Answer 54 · 2022-10-25T22:19:00Z

We are using the solution for scaling up the website for market data applications. EC2 and Datadog have enabled high-level monitoring of underlying infra and services. The Datadog profiler comes in handy to pinpoint issues with resource utilization during peak hours, and traces/log management helps narrow down the root cause. The network map is crucial in identifying bottlenecks and determining what needs more attention. Host map helps identify problematic hardware and devise ways to counter issues that arise during scaling, and deploying solutions on the cloud.

reviewer2003934 SRE at Bloomfire, Inc. · Answer 55 · 2022-10-25T22:14:00Z

We are using Datadog for server metrics, log aggregation and searching, system monitoring, alerting the team about errors, and dashboards for our developers. It's used by the Site Reliability Engineering team and Management of all levels. It's assisting us in proving SOC II compliance. We're looking to improve our usage of Datadog's RUM and APM components to get better and more performance insights on our production environments. We're also looking to leverage more synthetic monitors and runbooks for anyone responding to incidents.

score 0 · Answer 56 · 2022-10-25T22:11:00Z

We use it mostly for logging log messages from our Kubernetes and EC2 instances, for example, system messages and errors. Also, we want log messages from our firewalls and other network infrastructure in case of network issues. We intend to use it for application logging, et cetera, to get insight into internal problems in the applications in Kubernetes pods. We want to use it for monitoring in case of system problems and hardware failures so that it can notify us.

reviewer2003823 Cloud Operations Engineer at Thales · Answer 57 · 2022-10-25T22:02:00Z

We primarily use the solution for monitoring applications and informing customers via Pagerduty and Statuspage. The monitoring and alerts can be personalized internally, and we are able to find problems and issues. The response time monitor has been great, and it has been validating upgrades. We can check in to see which step fails,

score 0 · Answer 58 · 2022-10-25T21:56:00Z

We primarily use the solution for log management and application performance monitoring. We have been getting into using more solutions on Datadog, such as runbooks, monitoring, and dashboards. Another area that we've been investing some time in is the database monitoring. We've been able to get some relatively new employees onboarded into the tool, and they've been able to create some meaningful dashboards and reports without too much hand-holding at all. We plan on exploring the synthetics solution as well.

score 0 · Answer 59 · 2022-10-25T21:52:00Z

We use Datadog for application logs, error tracking, performance tracking, alerting, and overall production state surveillance. It helps us improve observability and ease of maintenance through better information for our support teams and their issue qualification. We also use dashboards to keep all the information at ready and easy to access. SLOs notably for our uptimes but also our feature usage. It also feeds our alerting for our on-call SREs into PagerDuty by launching alerts when specific parameters are exceeded.

Ramon Snir CTO at a tech vendor with 1-10 employees · Answer 60 · 2022-10-25T21:41:00Z

We use Datadog for three main use cases, including: * Infrastructure and application monitoring. It is ensuring that our services are available and performant at all times. This allows us to proactively address incidents and outages without customers contacting us. This includes monitoring of cloud resources (databases, load balancers, CPU usage, etc.), high-level application monitoring (response times, failure rates, etc.), and low-level application monitoring (business-oriented metrics and functional exceptions to customer experience. * Analyzing application behavior, especially around performance. We often use Datadog's application performance monitoring on non-production environments to evaluate the impact of newly introduced features and gain confidence in changes. * End-to-end regression testing for APIs and browser-based experiences. Using Datadog's synthetic testing checks periodically that the system behaves in the exact correct way. This is often used as a canary to detect issues even before users reach them organically.

reviewer2003652 Senior Director at Nasdaq Stock Market · Answer 61 · 2022-10-25T21:35:00Z

We primarily use the solution for the RUM, security monitoring, and streams. We need to monitor users and what they access. We also need to identify security loopholes and attack patterns and identify and quickly respond to issues. We can identify pushbacks, and get insight into application components that stack up with each other. We can understand which components, libraries, and code to alert teams. Using Datadog, we can raise incidents, track incidents to completion, and be able to gather data for reporting and post-mortem. The solution allows us to track fixes and tracks their test coverage. With it, we get confidence in the fix/improvement phase and be able to provide a response.

reviewer2003616 Production engineer at a consultancy with 51-200 employees · Answer 62 · 2022-10-25T21:28:00Z

We have deep integration with Datadog for observability and monitoring. We use everything from APM, logs, and RUM to monitor and dashboards for tracking system health. We are trying to move from many different solutions for error tracking/observability to a single platform (Datadog). We are currently in the process of setting up logging in Datadog in order to maintain our logs better. We are looking to create more insights into the real user flows by using real user monitoring (RUM) too.

Brian Hanuska Architect at SEI Investments · Answer 63 · 2022-10-25T21:10:00Z

We primarily use Datadog for: * Native memory * Logging * APM * Context switching * RUM * Synthetic * Databases * Java * JVM settings * File i/o * Socket i/o * Linux * Kubernetes * Kafka * Pods * Sizing We are testing Datadog as a way to reduce our operational time to fix things (mean time to repair). This is step one. We hope to use Datadog as a way to be proactive instead of reactive (mean time to failure). So far, Datadog has shown very good options to work on all of our operational and development issues. We are also trying to use Datadog to shift left, and fix things before they break (MTTF increase).

score 0 · Answer 64 · 2022-10-25T21:04:00Z

We use Datadog incident management for our incident tooling. Whenever we run into an incident, we try to use it. It allows us to create a separate Slack channel for it.

score 0 · Answer 65 · 2022-10-25T21:00:00Z

We use the solution primarily for platform monitoring for the services that are deployed in AWS. It gives a better way to monitor the services, including pods, cost, high availability, etc. This way, observability is ensured and also customer services are uninterrupted. Also, we host the data pipelines between the cloud and the on-prem for which Datadog is used to ensure better services. We report issues based on the metrics reported over it.

score 0 · Answer 66 · 2022-10-25T20:54:00Z

Ingesting data from various sources to monitor the log metrics of the system and enabling an alert mechanism to notify the teammates if something goes wrong. More specifically, having Datadog agents as integration to different services provides easy access and management.

reviewer2003469 SRE at Magnolia International · Answer 67 · 2022-10-25T20:48:00Z

We use an enterprise version of a CMS platform which is enabling businesses to transmit content to their customers. The tool is fully customizable to the end user, including out-of-the-box integrations as well as APIs for custom plugin support. Our systems fully manage content using AWS as the back-end cloud provider. Assets are kept in secure buckets and utilize the Kubernetes infrastructure to deliver our product to end users and internal authors. Using the CMS allows for business people to manage content without needing development efforts.

reviewer2003355 DevOps Engineer at a printing company with 51-200 employees · Answer 68 · 2022-10-25T20:36:00Z

Log aggregation for us was a key component since we have a fairly old-school app running on VMs on bare metal. We previously didn't have much insight into our logs unless we manually tunneled them into each server. The solution is reducing manual labor in troubleshooting problems in our environments server by server. We also needed to monitor our Java app and MySQL database to understand their problems so that we could take action and resolve them. Our use cases have since expanded to encompass all aspects of monitoring.

score 0 · Answer 69 · 2022-10-25T20:17:00Z

We use metrics to track the metrics of our application. We use logging to log any errors or erroneous application behavior as well as successful behavior. We use events to log successful steps in our pipeline or failed steps in our deployment. We use a combination of all these features to diagnose bugs. It makes it much more efficient to look at all the data in one place. This speeds up our development speed so that we can be agile.

score 0 · Answer 70 · 2022-10-25T20:09:00Z

After a security incident, we needed to find and migrate to a different cloud provider, and after evaluating different competitors and the skill set of the team, we decided to move to AWS. AWS also enables the team to have finer control over how our apps are deployed and how security and access are managed. By leveraging AWS's functionality, we have increased our application's security and sped up the deployment process. We've even been able to handle higher workloads due to AWS's auto-scaling functionality.

reviewer2003214 Sr. Director of Software Engineering at Globalization Partners · Answer 71 · 2022-10-25T19:58:00Z

The RUM is implemented for customer support session replays to quickly route, triage, and troubleshoot support issues which can be sent to our engineering teams directly. Customer Support will log in directly after receiving a customer request and work on the issue. Engineers will utilize the replay along with RUM to pinpoint the issue combined with APM and Infra trace to be able to look for signals to find the direct cause of the customer impact. Incident management will be utilized to open a Jira ticket for engineering, and it integrates with ITSM systems and on-call as needed.

reviewer2003202 Architect at a comms service provider with 10,001+ employees · Answer 72 · 2022-10-25T19:45:00Z

We use the solution primarily for distributed tracing, service insight and observability, metrics, and monitoring. We create custom metrics from outbound service calls to trace the availability of back-office systems. We use the flame graph to get insights into our GraphQL implementation. It helps highlight how resolvers work. However, it's lacking in tracing which GraphQL queries are run, and we use custom spans for that.

reviewer2002896 VP at JPMorgan Chase & Co. · Answer 73 · 2022-10-25T17:37:00Z

The product is used for APM solutions for the metrics and traces for the REST API requests and service maps to understand the upstream and downstream services. We are creating dashboards and widgets to monitor the status. We are creating alerts and monitors as well. We integrated the alerts and ticketing system in our organization with SNOW and Netcool. We are using Kubernetes, AWS, and infrastructure metrics. We are using Kafka and Aurora Postgres logs as well, and we are using HTTP status codes to identify the error types.

reviewer2002893 Lead Software Engineer at a retailer with 51-200 employees · Answer 74 · 2022-10-25T17:33:00Z

We are trying to get a handle on observability. Currently, the overall health of the stack is very anecdotal. Users are reporting issues, and Kubernetes pods are going down. We need to be more scientific and be able to catch problems early and fix them faster. Given the fact that we are a new company, our user base is relatively small, yet growing very fast. We need to predict usage growth better and identify problem implementations that could cause a bottleneck. Our relatively small size has allowed us to be somewhat complacent with performance monitoring. However, we need to have that visibility.

score 0 · Answer 75 · 2022-10-25T09:29:00Z

We use the solution for monitoring, logging, and alerts. Thanks to Datadog, we report errors using the logger integrated into our services, which is crucial since we only do unit tests. The infrastructure team handles the monitoring part, so I can't give more insights about that. I am an API developer, so I use Datadog mainly for logging. The alerts are connected to Microsoft Teams in a specific channel, and we pay a lot of attention to it, and we usually create tickets based on these alerts.

LuWang DevOps Engineer at Screencastify · Answer 76 · 2022-10-24T03:36:00Z

We use Datadog for observability and system/application health, mainly for product support, triaging, debugging, and incident responses. We use a lot of the logging and the Datadog agent to collect logs, metrics, and traces from our GKE workloads. We use APM and continuous profiling for latency and performance measurement. We use RUM to observe frontend user events, such as tracing on request and what actions they take before errors occur. We also use error tracking and source maps to debug production failures. We are still relatively new to the product, and we are planning to use more of the notebook functionality and power packs to record run books and break knowledge silos. We also need to utilize dashboards and continuous profiling more for performance measurement and integrate Datadog alerts for incident response.

Plinio Moreira Sales Engineer at Delfia · Answer 77 · 2022-10-24T03:30:00Z

I'm a Datadog partner in Brazil, and I monitor all my applications with Datadog too. I would like to enable all features in my DPN portal and get access to custom demos. We resell Datadog and a full stack of pre-sales, sales, and post-sales services. We have customers for all sectors, including governmental, financial services, services in general, telecom, et cetera. Today, we are the biggest Datadog partner in Brazil, and we are searching for an expansion in our MSP environment.

score 0 · Answer 78 · 2022-10-24T03:26:00Z

The solution is primarily used for better understanding the health of applications, modern environments, and many other solutions, which are the main focus of Datadog and many other monitoring tools. With Datadog specifically, I can look at the health of the technology stack and services, and also integrate multiple metric sources, security, business data, and much more. This makes it a real software solution for centralizing data and unifying monitoring silos in one place. Datadog is like a hub - not just a monitoring software.

score 0 · Answer 79 · 2022-10-24T03:16:00Z

We're moving towards the cloud yet still have several active data center contracts. As we move to the cloud, we are interested in knowing more about our services, and DataDog APM/logs should give us this perspective. We currently use the infrastructure monitoring part of DataDog. Still, I've really seen the advantage of moving more data into the cloud for comparison and being able to have one place where we can view all related pieces of information regarding a possible incident or potential issue.

score 0 · Answer 80 · 2022-10-24T03:12:00Z

We are using a mixture of on-prem and cloud solutions to bridge the gap with healthcare entities in the service of providing patients with the medication they need to live healthy lives. Since we're a heavily regulated company, a lot of our solutions grew from on-premises monoliths. However, as we scaled out, it became harder and harder to move forward with that architecture. Today, we're investing heavily in transforming our systems from monoliths into distributed systems. With this change in mind, the ability for us to connect the dots using Datadog has been invaluable.

score 0 · Answer 81 · 2022-10-24T03:08:00Z

reviewer2000475

Security Engineer at a computer software company with 11-50 employees

Real User

Oct 24, 2022

We primarily use the solution for security monitoring and anomaly detection.

score 0 · Answer 82 · 2022-10-24T03:04:00Z

I use the solution to manage security-related logs and metrics, as well as create detection rules for security events. I am a security engineer, so one area of interest is the CSPM product, giving us the ability to look at findings across the cloud environment. The great part about the Datadog security products is that they incorporate the context of the resources/hosts where the security event is found. This allows us to see exactly what is running on a host that we see as a security alert.

score 0 · Answer 83 · 2022-10-24T02:53:00Z

We are using the solution for migrating out of the data center. Old apps need to be re-architected. We are planning on moving to multi-cloud for disaster recovery and to avoid vendor lockouts. The migration is a mix between an MSP (Infosys) and in-house developers. The hard part is ensuring these apps run the same in the cloud as they do on-premises. Then we also need to ensure that we improve performance when possible. With deadlines approaching quickly it's important not to cut corners - which is why we needed observability

score 0 · Answer 84 · 2022-10-24T02:49:00Z

We use Datadog for observability and monitoring primarily. Various cross-functional teams have built various dashboards, including Developers, QA, DevOps, and SRE. There are also some dashboards created for senior leadership to keep tabs on days to day activities like cost, scale, issues, etc. Also, we've set up monitors and alarms that kick off when any metrics go beyond the threshold. With Slack and PagerDuty integration, correct team members get alerted and react to solve the issue based on various runbooks.

score 0 · Answer 85 · 2022-10-24T02:36:00Z

We are using the solution for migrating out of the data center. Old apps need to be re-architected. We plan to move to multi-cloud for disaster recovery and avoid vendor lockouts. The migration is a mix between an MSP (Infosys) and in-house devs. The hard part is ensuring these apps run the same in the cloud as they do on-prem. Then we also need to ensure that we improve performance when possible. With deadlines approaching quickly, it is important not to cut corners which is why we needed observability.

reviewer2000451 SRE at a financial services firm with 10,001+ employees · Answer 86 · 2022-10-24T02:28:00Z

We primarily use the solution for observability, metrics, logs, tracing, and end-to-end user flow monitoring. We are looking to implement this as a company-wide standard for cloud solutions. At this time, we're currently in a POC, and we're interested in using either a Datadog agent or the OTel agent with a Datadog exporter. We have dashboards with panels that correlate metrics and allow you to link through to traces. Flame graphs to show latency across services and the various spans. While we are not security minded, we still require it and are interested in more. It's used for monitoring critical systems.

score 0 · Answer 87 · 2022-10-24T02:20:00Z

This solution is for physical device monitoring across breweries, including PLCs, HMI Cameras, RFID panels, scales, etc. We want to gain visibility into these devices to influence predictive maintenance and unscheduled downtime. We want to monitor physical devices across the zone from a control tower perspective for end users and support teams alike. Understanding more about the performance of the devices and mechanical components will allow us to schedule downtime to fix imminent catastrophic failures and prevent unplanned downtime and lost revenue.

reviewer2000271 Software Developer at Labviva · Answer 88 · 2022-10-20T16:33:00Z

We’re currently using logging, monitoring, metrics, APM, etc. We've started to use קSLOs. However, it takes a bit of time to work through those. RUM (Real User Monitoring) has been very useful. I have used this in the past to debug problems in production, which has been great. We also want to start using synthetics and tracing more. Our application currently runs in many different environments based on our customers' requirements. This allows us to see everything in one place and still filter by the environment as required, which is extremely useful.

reviewer1996905 VP, Application support at JPMorgan Chase · Answer 89 · 2022-10-20T13:15:00Z

We primarily use the solution for the service catalog. We use this type of offering for our Microservices applications, and it gives a good view of flow. It is a must when we have different developers working on different services. Having the trace and log features are useful for locating the microservice for the on-call person. We would like to see some more useful applications for health monitoring where we can customize the cases based on data from the database. It needs to have the facility to monitor data inside tables and the status of the UI.

reviewer1996521 Engineering Manager at Indeed.com · Answer 90 · 2022-10-19T16:22:00Z

I primarily use the solution to learn, watch and monitor business and engineering metrics in the production and QA environments of my team. We create monitors on key business metrics and observe regressions and anomalies. Less often, I leverage the events ability in Datadog to get notified about significant activities happening in my teams' deployments. We learn about Datadog monitor alerts through Slack and often attempt to create SLOs using Terraform. We use APM for observability. Most recently, I learned about WatchDog Alerts that I will be heavily looking into.

reviewer1996518 ITOPS and SRE Manager at Ticket · Answer 91 · 2022-10-19T15:44:00Z

reviewer1996518

ITOPS and SRE Manager at Ticket

User

Oct 19, 2022

We primarily use the solution for observability.

reviewer1996494 Director of Software Engineering at Code Climate · Answer 92 · 2022-10-19T14:00:00Z

We primarily use the solution for charting application metrics. We use it for all our application metrics, host metrics, and monitors with a PagerDuty integration. We integrate our application logs. It is great to be able to tie our metrics and our traces together. We use the APM module with traces. It is great to be able to link APM, logs, and metrics in one go, as it shortens our troubleshooting and RCA dramatically. We are loving the tool; it is great to have all those insights in one place. We hope that they keep making my life and our engineers' life easier.

reviewer1996524 Director Of Software Development at Major League Baseball · Answer 93 · 2022-10-19T13:22:00Z

We primarily use the solution for monitoring and telemetry. We use lots of log collections, log-based metrics, and dashboard visualization. The logging, metrics, and APM are vital.

reviewer1996488 Software Engineer at Spring Health · Answer 94 · 2022-10-19T13:17:00Z

We share dashboards, set up alerts, and monitor everything that happens in our system. We use it in staging, features, production, and our load test environment. It is exceptionally helpful for making our engineering more data-driven. I came from a company that believes we should focus on being telemetry driven. Instilling this in a smaller, less mature engineering organization has been challenging. However, it is much easier while using Datadog.

reviewer1994838 Software Engineer at Enable Medicine · Answer 95 · 2022-10-18T21:08:00Z

We mostly use it to handle log aggregation, monitor our web application, and alert us on data pipeline failures. Our system is fully on AWS, and so we pipe in all of our Cloudwatch logs into Datadog to have a central place to index and search logs. Our web app is built on an Elastic Beanstalk backend, and we use the Datadog agent to keep track of all of the requests that hit our backend and all of their components. We also use the prebuilt AWS pipeline dashboards to monitor our batch jobs and lambdas.

reviewer1994829 Software Engineer at Enable Medicine · Answer 96 · 2022-10-18T21:05:00Z

We primarily use the solution for log monitoring across our entire cloud infra (EB, EC2, Batch, and Lambda). This is in addition to Rstudio Workbench, which has its own logs that would not be picked up via Cloudwatch(docs.rstudio.com). We own several dozen of these servers, and we used to manage instance logs by tailing logs when incidents occurred. Datadog allows for much better visibility across our entire fleet and has saved us countless hours.

reviewer1994826 Senior Software Engineer at Grata · Answer 97 · 2022-10-18T16:57:00Z

reviewer1994826

Senior Software Engineer at Grata

User

Oct 18, 2022

We primarily use the dashboard/metric with many tags.

score 0 · Answer 98 · 2022-10-05T09:22:08Z

Our company is transitioning to using the solution for monitoring and analytic services we provide to customers. Once fully rolled out, there will be 80-100 users companywide.

Nagendra K Lead Blockchain and Back-End Developer at Torum Technology · Answer 99 · 2022-09-28T06:40:22Z

We are using the solution from a monitoring and management perspective. We use it for alerts.

score 0 · Answer 100 · 2022-09-19T20:07:35Z

Our company deploys the solution for our customers as an observability tool to define SLOs and SLIs along with logs and metrics. The solution includes incident, post-mortem, and root cause analysis that provides a level of truth for incidents and issues with applications. We have SREs and teams in operations, management, and applications who all access to the solution and ensure proper integrations.

Rawat Singhsatit Solutions Consultant Manager at MFEC · Answer 101 · 2022-08-30T15:08:28Z

We use this solution for our customer's IP and to support their cloud infrastructure.

score 0 · Answer 102 · 2022-08-15T10:42:13Z

Datadog is a SaaS solution we tried for URL and synthetic monitoring. You record a transaction going into a website and replay that transaction from various locations. Datadog is mainly used by the admin, but three or four other guys had access to the reports and notifications, so it's five altogether. We probably tried no more than 8 percent of what Datadog can do. There are so many other bits and modules. I've only gone into about half of what APM can do in the Datadog stack.

reviewer1915611 Principal Solutions Architect at DoControl · Answer 103 · 2022-07-18T07:36:47Z

One of the things we use it for is the same thing that we use FullStory for, which is to replay customer interactions with our platform. However, it also does the monitoring. It's like monitoring cloud tools. We're really mostly monitoring our own software to make sure that everything is functioning properly. We can check a bunch of things, and we can even play back customer sessions. It’s basically monitoring our application.

Nuno Rosa Principal Consultant at Infosys · Answer 104 · 2022-06-07T11:17:23Z

Nuno Rosa

Principal Consultant at Infosys

MSP

Top 10

Jun 7, 2022

The solution is basically used for servers and applications.

score 0 · Answer 105 · 2022-03-29T15:58:56Z

Our primary use case is log management and we also use the solution for monitoring the application and underlying infrastructure. I'm an IT test manager.

Jaswinder Kumar Senior Manager - Cloud & DevOps at Publicis Sapient · Answer 106 · 2022-02-20T17:26:13Z

My customers were using Datadog for monitoring purposes. They were using it only because the solution is running on AWS and it's a microservices-based solution. They were using an application called Dynatrace for their log.

score 0 · Answer 107 · 2022-02-09T19:11:59Z

We are evaluating Datadog for observability and monitoring requirements that we have in our company. In our use case, our intention is to provide some kind of framework for multiple app teams to use the tool for our cyber ability and engineering practices.

score 0 · Answer 108 · 2022-02-08T21:47:54Z

We have used this solution primarily for application performance monitoring. To do this, we needed to make sure we had the right data in the system so that people could be able to monitor their applications end-to-end.

score 0 · Answer 109 · 2022-02-04T12:22:43Z

reviewer1775157

Chief Strategy Officer (CSO) at a computer software company with 11-50 employees

Real User

Feb 4, 2022

We use Datadog to monitor our product on the cloud.

score 0 · Answer 110 · 2021-12-27T19:28:28Z

I'm not sure which version we're using, although I believe it to be the latest. We essentially use the solution in advance of performance testing, performance monitoring, and troubleshooting.

reviewer1530834 Director at a media company with 11-50 employees · Answer 111 · 2021-10-18T08:28:04Z

We have a web infrastructure that uses Amazon Web Services containers with everything included, and we use Datadog to monitor them all.

score 0 · Answer 112 · 2021-09-09T19:57:09Z

AM

Attila Mate Kovacs

Senior Cyber Security Expert at a security firm with 11-50 employees

Real User

Sep 9, 2021

We implement these solutions for our clients. We have implemented Datadog as an SIEM solution.

score 0 · Answer 113 · 2021-08-13T08:08:14Z

RG

RomainGiacalone

Software Engineering Manager at a tech vendor with 51-200 employees

Real User

Aug 13, 2021

I am using Datadog for error reporting.

score 0 · Answer 114 · 2021-08-13T05:36:43Z

I used Datadog typically for monitoring website statistics and some of the cloud networking equipment.

GAD COBBINA Security Analyst at a tech services company with 11-50 employees · Answer 115 · 2021-06-01T14:22:04Z

We are currently testing it. If the testing goes well, we'll purchase the full version, and it will probably be our main monitoring tool. We plan to use it for monitoring our activities and the attacks on our systems or network.

score 0 · Answer 116 · 2021-05-18T17:10:08Z

reviewer1533330

Project Director at a tech services company with 501-1,000 employees

Real User

May 18, 2021

We use it for our infrastructure network and servers.

Fernando Cunha JR Tech Lead & Solutions Architect at DXC · Answer 117 · 2021-04-01T09:42:54Z

FC

Fernando Cunha JR

Tech Lead & Solutions Architect at DXC

Real User

Apr 1, 2021

I implement this solution for clients.

reviewer1539903 Director of Cloud Operations at CloudHesive · Answer 118 · 2021-03-29T17:00:07Z

Our clients use it for monitoring applications. Its deployment depends on our customer's use case. It is 100% cloud. We have got a multi-tenant environment, so we segment it out.

score 0 · Answer 119 · 2021-03-23T20:06:32Z

We use it for monitoring and instrumentation of security. We secure our databases and servers. It is typically for the security of apps, services, and systems. We are using its latest version.

Pablo Albornoz Co-Founder at Exeo IT · Answer 120 · 2021-03-12T00:27:24Z

PA

Pablo Albornoz

Co-Founder at Exeo IT

Real User

Mar 12, 2021

We're in the process of doing a Proof of Concept with the solution right now.

score 0 · Answer 121 · 2021-02-19T21:40:43Z

We deploy agents on-premise to collect data on on-premise VM instances. We don't use Datadog in our cloud network. We do have some Cloud apps that we have it on and we also have Containers. We have it on their headquarters, the main software for them is on their own Cloud. Eventually, we're building out the process now and using it better. We plan to use Datadog for root cause analysis relating to any kinds of issues we have with software, with applications going down, latency issues, connection issues, etc. Eventually, we're going to use Datadog for application performance, monitoring, and management. To be proactive around thresholds, alerts, bottlenecks, etc. Our developers and QA teams use this solution. They use it to analyze network traffic, load, CPU load, CPU usage, and then Tracey NPM, API calls for their application. There are roughly 100 users right now. Maybe there's 200 total, but on a given day, maybe 13 people using this solution.

score 0 · Answer 122 · 2021-01-25T19:36:00Z

We primarily use Datadog for logs, APM, infrastructure monitoring, and lambda visibility. We have built a number of critical dashboards that we display within our office for engineers to have a good understanding of the application performance, as well as business partners to understand at a high level the traffic flowing through the app. We started with logging, as our primary monitor, and have shifted to APM to get a deeper understanding of what our system is doing, and how the changes we are making impact the apps.

reviewer1493811 Sr. Architect - SaaS Ops at CommVault · Answer 123 · 2021-01-21T08:05:00Z

We primarily use DataDog for performance and log monitoring of cloud environments, which include VMs and Azure Services like Azure compute, storage, network, firewall, and app services via event hubs. Alerting based on monitors via teams and PagerDuty. Logs collection for Azure services like Azure database, Azure Application Gateway, Azure AKS, and other Azure services. Custom metrics using a Python script to collect metrics for components not natively supported by Datadog. Synthetic testing to ensure uptime and browser tests via CI/CD pipeline.

reviewer1486134 Infrastructure Engineer at DATACAMP, INC · Answer 124 · 2021-01-12T13:50:00Z

We use Datadog as a monitoring platform to achieve visibility into our container environments. Almost all of our workloads are containerized and with DataDog, we are able to get metrics, logs, alerts, and events about all the containers that we are running. Our developers also extensively use APM to find and diagnose performance issues that might appear. We use Terraform to automatically create all of the necessary monitors and dashboards that our developers need to make sure that our level of service is sufficient.

reviewer1480866 Director of DevOps at Digital Media Solutions Group · Answer 125 · 2020-12-31T20:21:00Z

We primarily use this product for availability and performance monitoring, log aggregation.

reviewer1479957 Senior Director of DevOps at Housecall Pro · Answer 126 · 2020-12-29T18:59:00Z

We primarily use Datadog for the monitoring of EC2 and ECS containers running mostly Rails applications that host a SaaS product. We also monitor ElasticSearch and RDS, and we are working on adding their Application Performance Monitoring solution to monitor our applications directly. We use DataDog to create dashboards, graphs, and alerts based on interesting metrics. DataDog is our first place to look to find the performance of our system. We also use their logging platform and it works well. Especially useful is that the logs and metrics are tightly integrated so you can jump between them easily.

reviewer1477686 Senior DevOps Engineer at DigitalOnUs · Answer 127 · 2020-12-22T19:31:00Z

Our primary use of Datadog includes: * Keeping a close look into our AWS resources. Monitoring our multiple RDS and ElastiCache instances play a big role in our indicators. * Kubernetes. We aren't using all of the available Kubernetes integrations but the few of them that work out of the box adds great value to our metrics. * Monitoring and alerting. We wired our most relevant monitoring and alerts to services like PagerDuty, and for the rest of them, we keep our engineers up to date with constant Slack updates.

score 0 · Answer 128 · 2020-12-20T15:30:00Z

We were in need of a cloud monitoring tool that was operationally focused on the AWS Platform. We wanted to be able to responsibly and effectively monitor, troubleshoot, and operate the AWS platform, including Server, Network, and key AWS Services. Tooling that highlighted and detected problems, anomalies, and provided best practice recommendations. Tooling that expedites root-cause analysis and performance troubleshooting. Datadog provided us the ability to monitor our cloud infrastructure (network, servers, storage), platform/middleware (database, web/applications servers, business process automation), and business applications across our cloud providers.

Abdulla Pathan Technology Competency and Solution Head at LearningMate · Answer 129 · 2020-11-25T16:41:00Z

We use Datadog for application monitor, to help identify errors. It is also used to monitor application performance.

score 0 · Answer 130 · 2020-10-21T04:33:58Z

reviewer1438557

Senior Cloud Security Engineer at a financial services firm with 201-500 employees

Real User

Oct 21, 2020

I'm a senior cloud security engineer and we are customers of Datadog.

Richard Chennault Senior Director with 10,001+ employees · Answer 131 · 2020-09-06T08:04:00Z

RC

Richard Chennault

Senior Director with 10,001+ employees

Real User

Sep 6, 2020

We used Datadog to capture the salvatory of our AWS fleet of around 1,200 servers.

Santhosh Batchu Cloud Architect at a tech services company · Answer 132 · 2020-05-18T07:50:13Z

We are a solution provider and Datadog is one of the products that I was working on with one of my clients. They are currently evaluating it for use in cloud monitoring. Specifically, Datadog is used for monitoring cloud applications in terms of performance. The logs come into this solution from AWS and it provides dashboards for various environments.

SeniorSofcae Senior Solutions Architect at CloudOps · Answer 133 · 2018-12-11T08:31:00Z

We are using the infrastructure and app monitoring side, such as process monitoring. We are using it in a very traditional way. We are not using the APM capabilities. When it comes to something like containers, we will generally use it on the host but not inside the container itself. We are using it with our customers and in-house day-to-day.

Brendan Buono Software Engineer at Lovepop · Answer 134 · 2018-12-11T08:31:00Z

The primary use case is application monitoring. We also use it set custom metrics and watch our AWS metrics, as well as data. At my current job, I have only use it a couple months. However, I used it for a few years at a previous company.

Aaron Singh DevOps Engineer at Spark New Zealand · Answer 135 · 2018-12-11T08:31:00Z

We use it for notifications, alerting, and capturing most of the information from Amazon, such as EC2 instances.

Enrique Yanez Software Engineer at Sony Corporation of America · Answer 136 · 2018-12-11T08:31:00Z

If our app is up and running, we use it to monitor how many credits the app is using up on each node. We also monitor services by how long each call is taking with the help of EC2s off of application.

Director9da2 Director of Engineering at Lucidchart · Answer 137 · 2018-12-11T08:31:00Z

Director9da2

Director of Engineering at Lucidchart

Real User

Dec 11, 2018

* Monitoring * Analytics * Tracing * APM

SystemNia893 System Ninja at a philanthropy with 51-200 employees · Answer 138 · 2018-12-11T08:30:00Z

We use it to monitor our infrastructure, particularly our different EC2 instances, and our containers. We also use it to capture our logs.

score 0 · Answer 139 · 2018-12-11T08:30:00Z

We mainly use it to send metrics about CV and memory usage, in addition to the number of files descriptors on a socket.

Yun Sung Eun Software Developer at AhnLab, Inc. · Answer 140 · 2018-12-09T08:34:00Z

We use it to store editorial content. We started out on the on-premise version, then moved to the AWS version.

score 0 · Answer 141 · 2018-12-04T07:57:00Z

MI

Mark

Site Reliability Engineer at a computer software company with 201-500 employees

Real User

Dec 4, 2018

We use it for custom metrics of our applications and monitoring of our systems.