reviewer1996494 - PeerSpot reviewer
Director of Software Engineering at Code Climate
Vendor
Helpful dashboards, useful data-driven decision-making and good integration with PagerDuty
Pros and Cons
  • "We integrate our application logs. It is great to be able to tie our metrics and our traces together."
  • "The pricing should be less of a surprise."

What is our primary use case?

We primarily use the solution for charting application metrics.

We use it for all our application metrics, host metrics, and monitors with a PagerDuty integration. 

We integrate our application logs. It is great to be able to tie our metrics and our traces together.

We use the APM module with traces. It is great to be able to link APM, logs, and metrics in one go, as it shortens our troubleshooting and RCA dramatically.

We are loving the tool; it is great to have all those insights in one place. 

We hope that they keep making my life and our engineers' lives easier.

How has it helped my organization?

The solution improved our organization with:

  • Data-driven decision making
  • Dashboards we can share with our customer success team
  • Dashboards we can share with our sales engineers
  • Help during incidents
  • Help with preventing incidents
  • Integration with PagerDuty.

What is most valuable?

The most valuable aspects of the solution include:

  • The charting application metrics
  • Help with the business, prioritization, software design, and infrastructure design

What needs improvement?

The pricing model hurts and forces us to work around the tool sometimes.

On top of application performance metrics, it would be great to have host performance metrics, suggesting changes to better use a cluster like: "You are over-provisioning this host" or "based on historical data, you will need to scale up in X days."
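The "you will need to scale up in X days" suggestion above could be approximated with a simple trend line. The following is only a minimal sketch of that idea, not a Datadog feature: it assumes one utilization sample per day and a linear trend, and the function name and inputs are hypothetical.

```python
def days_until_threshold(daily_usage, threshold):
    """Estimate how many days remain until usage crosses `threshold`.

    daily_usage: one utilization sample per day, oldest first (hypothetical input).
    Returns 0.0 if the latest sample is already at or above the threshold,
    None if the fitted trend is flat or falling, otherwise the estimated
    number of days after the last sample.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two daily samples")
    if daily_usage[-1] >= threshold:
        return 0.0

    # Ordinary least-squares fit of usage against the day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no growth, so no projected crossing

    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - (n - 1))
```

For example, with samples 50, 52, 54, 56, 58 and a threshold of 80, the fitted slope is 2 per day, the line reaches 80 at day 15, and the function returns 11 days after the last sample. Real capacity planning would likely need seasonality and confidence intervals, which this sketch ignores.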

Adding a module to extract data from Datadog so we can use the data in our own system would be helpful.

Buyer's Guide
Datadog
January 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: January 2025.
825,609 professionals have used our research since 2012.

For how long have I used the solution?

I've used the solution for six or more years.

Which solution did I use previously and why did I switch?

We previously used New Relic, which was a great tool. That said, Datadog is a more complete solution.

What's my experience with pricing, setup cost, and licensing?

The pricing should be less of a surprise. They should allow us to cap costs which would lead to less frustration.

We need better documentation on the pricing.

It might be helpful if they added a pricing simulator.

Which deployment model are you using for this solution?

Private Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer1994829 - PeerSpot reviewer
Software Engineer at Enable Medicine
User
Good technical documentation and overall education with improved visibility
Pros and Cons
  • "We've found it most useful for managing Rstudio Workbench, which has its own logs that would not be picked up via Cloudwatch."
  • "We primarily use the log management functionality, and the only feedback I have there is better fuzzy text searching in logs (the kind that Kibana has)."

What is our primary use case?

We primarily use the solution for log monitoring across our entire cloud infra (EB, EC2, Batch, and Lambda).

This is in addition to Rstudio Workbench, which has its own logs that would not be picked up via Cloudwatch (https://docs.rstudio.com/ide/server-pro/server_management/logging.html#default-log-file-locations).

We own several dozen of these servers, and we used to manage instance logs by tailing logs when incidents occurred. Datadog allows for much better visibility across our entire fleet and has saved us countless hours.

How has it helped my organization?

It is now way easier to search in one place rather than across all of Cloudwatch (and needing to know log groups, etc.). 

Primarily, we run several separate deployments of Rstudio Workbench, which has its own logs that would not be picked up via Cloudwatch. 

We own several dozen of these servers. We used to manage instance logs manually. 

Datadog allows for much better visibility.

What is most valuable?

We've found it most useful for managing Rstudio Workbench, which has its own logs that would not be picked up via Cloudwatch. 

Datadog allows for much better visibility across our entire fleet and has saved us countless eng hours as a result. 

We plan on trying out offerings such as APM moving forward too.

Some things that Datadog does very well:

  • Technical documentation (the docs are clear, concise, and include realistic code samples)
  • Overall education efforts (e.g. the codelabs/workshops)

What needs improvement?

We primarily use the log management functionality, and the only feedback I have there is better fuzzy text searching in logs (the kind that Kibana has). 
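To illustrate what fuzzy text searching in logs means here, the sketch below matches a query against each word in a log line and keeps lines whose closest word is similar enough, so a misspelled "conection" still matches "connection". It uses Python's standard `difflib`; it is not how Kibana or Datadog implement search, and the function name and the 0.8 cutoff are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_search(lines, query, min_ratio=0.8):
    """Return (line, score) pairs whose best-matching word is close to `query`.

    The score is the highest similarity ratio (0.0 to 1.0) between the query
    and any whitespace-separated word in the line. Results are sorted with
    the best match first.
    """
    query = query.lower()
    hits = []
    for line in lines:
        words = line.lower().split()
        best = max(
            (SequenceMatcher(None, query, word).ratio() for word in words),
            default=0.0,
        )
        if best >= min_ratio:
            hits.append((line, best))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A call such as `fuzzy_search(log_lines, "connection")` would return both exact and near-miss lines; lowering `min_ratio` widens the match at the cost of more noise.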

I've learned about a ton of other offerings, like APM, NPM, etc., over the course of workshops. Once I try those out, I'm sure I will have additional feedback.

For how long have I used the solution?

I've used the solution for one year. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Rawat Singhsatit - PeerSpot reviewer
Solutions Consultant Manager at MFEC
Consultant
Stable cloud monitoring solution that is easy to use and deploy and is budget friendly
Pros and Cons
  • "Datadog is easy to use and easy to deploy. It's a better solution compared to others on the market in terms of being budget friendly for our customers."
  • "Datadog could be improved if it could detect other software in a container or server."

What is our primary use case?

We use this solution for our customer's IP and to support their cloud infrastructure.

What is most valuable?

Datadog is easy to use and easy to deploy. It's a better solution compared to others on the market in terms of being budget friendly for our customers.

What needs improvement?

Datadog could be improved if it could detect other software in a container or server. Datadog is better than other APM or observability tools, but it focuses mostly on telling the customer what they need to know about the software, database or applications that land on the server. We also need to know the version before setting up an agent with the APM modeling tool.

In some instances, ownership of a server passes to another person without the original owner handing over the knowledge or data needed to manage it. The new owner needs to monitor this server and must know which software, and which versions, were installed on it before using the APM agent for monitoring. If Datadog could provide this insight, it would improve how we use the solution.

In a future release, we would like to be able to complete a network traffic or network flow analysis to detect the errors or problems on the network.

For how long have I used the solution?

I have been using this solution for two years. 

What do I think about the stability of the solution?

This is a stable solution. 

How was the initial setup?

The initial setup was straightforward. We needed two engineers for the deployment.

What's my experience with pricing, setup cost, and licensing?

This solution is budget friendly.

What other advice do I have?

Overall, Datadog is a good product to use and is easy to deploy.

I would rate this solution a nine out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Reviewer 76 - PeerSpot reviewer
Vice President of SaaS Infrastructure at a tech services company with 51-200 employees
User
Enhances efficiency with robust alerting and visualization tools
Pros and Cons
  • "The real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance."
  • "The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs."

What is our primary use case?

Our primary use case for Datadog is to monitor and manage our fully cloud-native infrastructure. We utilize Datadog to gain real-time visibility into our cloud environments, ensuring that all our services are running smoothly and efficiently.

The platform’s extensive integration capabilities allow us to seamlessly track performance metrics across various cloud services, containers, and microservices. 

With Datadog’s robust alerting and visualization tools, we can proactively identify and resolve issues, minimizing downtime and optimizing our system’s performance. This has been crucial in maintaining the reliability and scalability of our cloud-native applications.

How has it helped my organization?

Datadog has significantly enhanced our organization’s operational efficiency and reliability. By providing real-time visibility into our cloud-native infrastructure, Datadog enables us to monitor performance metrics, detect anomalies, and resolve issues swiftly. 

The platform’s robust alerting system ensures that potential problems are addressed before they impact our services, reducing downtime and improving overall system stability. Additionally, Datadog’s comprehensive dashboards and reporting tools have streamlined our troubleshooting processes and facilitated better decision-making.

What is most valuable?

The most valuable feature of Datadog for our organization has been its real-time monitoring capabilities. This feature provides us with instant visibility into our cloud-native infrastructure, allowing us to track performance metrics and detect anomalies as they occur. The ability to monitor our systems in real-time means we can quickly identify and address issues before they escalate, minimizing downtime and ensuring the reliability of our services. 

Additionally, the real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance.

What needs improvement?

While Datadog has been instrumental in enhancing our operational efficiency, there are areas where it could be improved. 

One area is the user interface, which could be more intuitive and user-friendly, especially for new users. 

Additionally, the pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs. 

For future releases, it would be beneficial to include more advanced machine learning capabilities for predictive analytics, helping us anticipate issues before they occur. 

More third-party tools would also be valuable additions.

For how long have I used the solution?

I've used the solution for six years.

What do I think about the stability of the solution?

Datadog has proven to be a highly stable solution for our monitoring needs. Throughout our usage, we have experienced minimal downtime and consistent performance, even during peak traffic periods. The platform’s reliability ensures that we can continuously monitor our cloud-native infrastructure without interruptions, which is crucial for maintaining the health and performance of our services.

What do I think about the scalability of the solution?

Datadog’s scalability has been impressive and instrumental in supporting our growing cloud-native infrastructure. The platform effortlessly handles increased workloads and scales alongside our expanding services without compromising performance. Its ability to integrate with a wide range of cloud services and technologies ensures that as we grow, Datadog continues to provide comprehensive monitoring and insights.

How are customer service and support?

Our experience with Datadog’s customer service and support has been exceptional. The support team is highly responsive and knowledgeable, providing timely assistance whenever we’ve encountered issues or had questions. 

Their proactive approach to offering solutions and guidance has been invaluable in helping us maximize the platform’s capabilities.

How would you rate customer service and support?

Positive

How was the initial setup?

The setup is straightforward.

What about the implementation team?

We handled the setup in-house.

What's my experience with pricing, setup cost, and licensing?

The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs.

What other advice do I have?

One area is the user interface, which could be more intuitive and user-friendly, especially for new users.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2002326 - PeerSpot reviewer
API Developer at a tech services company with 501-1,000 employees
Real User
Good monitoring, logging, and alert features
Pros and Cons
  • "Thanks to the logs, we manage to make better reports through Jira and also to trace the request with more facility than we would be able to do otherwise."
  • "When the logs are too big, and Datadog splits them, the JSON format breaks and it is not so useful for us."

What is our primary use case?

We use the solution for monitoring, logging, and alerts. 

Thanks to Datadog, we report errors using the logger integrated into our services, which is crucial since we only do unit tests. The infrastructure team handles the monitoring part, so I can't give more insights about that. I am an API developer, so I use Datadog mainly for logging.

The alerts are connected to Microsoft Teams in a specific channel, and we pay a lot of attention to it, and we usually create tickets based on these alerts.

How has it helped my organization?

Thanks to the logs, we manage to make better reports through Jira and also to trace the request with more facility than we would be able to do otherwise. 

Since there are many teams in my company, being able to share the trace of an error together with all the related log information saves us a lot of time when communicating with each other.

What is most valuable?

The most valuable feature for me so far is logging. We do not do integration tests, so we rely a lot on tracing all the requests and we report errors to different teams in the company together with logs that we take from Datadog.

Since I am an API developer, I do not use the other features much. Also, I have been in the company for only four months. I have only worked with monitors and alerts.

I value tracing the request and being able to tell other teams which component, service, or line of code has an issue.

What needs improvement?

Since I have only been in the organization for four months, I only worked with the log, alerts, and monitoring. I do not have so many insights to share about what can be improved.

I am not an expert user, and not even an intermediate user yet. Rather, I am a beginner.

That said, when the logs are too big and Datadog splits them, the JSON format breaks, and the logs are not so useful for us.
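One common workaround for oversized JSON logs is to shorten the largest field before the line is emitted, so each entry stays under the ingestion limit and remains valid JSON. The sketch below shows that idea under stated assumptions: the size limit is passed in by the caller (no specific Datadog limit is assumed here), and the function name, field name, and truncation marker are hypothetical.

```python
import json


def fit_json_log(record, max_bytes, field="message", marker="...[truncated]"):
    """Serialize `record` as one JSON line of at most `max_bytes` UTF-8 bytes.

    If the record is too large, the string in `field` is cut to the longest
    prefix that fits and `marker` is appended, so the result stays parseable.
    """
    line = json.dumps(record, ensure_ascii=False)
    if len(line.encode("utf-8")) <= max_bytes:
        return line

    value = str(record.get(field, ""))
    trimmed = dict(record)  # shallow copy; the caller's record is untouched
    best = None
    lo, hi = 0, len(value)
    # Binary search for the longest prefix of the field that still fits.
    while lo <= hi:
        mid = (lo + hi) // 2
        trimmed[field] = value[:mid] + marker
        candidate = json.dumps(trimmed, ensure_ascii=False)
        if len(candidate.encode("utf-8")) <= max_bytes:
            best = candidate
            lo = mid + 1
        else:
            hi = mid - 1
    if best is None:
        raise ValueError("record does not fit even with the field emptied")
    return best
```

Truncating one field loses part of the message but keeps the other attributes (request IDs, levels, and so on) searchable, which is usually preferable to a log entry split across several unparseable fragments.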

For how long have I used the solution?

I've used the solution for four months.

Which solution did I use previously and why did I switch?

I did not previously use a different solution.

What's my experience with pricing, setup cost, and licensing?

As an API developer, I have no idea about the costs. I am curious, though, and will find out more about this.

Which other solutions did I evaluate?

I did not evaluate other options previously.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer1996905 - PeerSpot reviewer
VP, Application support at a financial services firm with 10,001+ employees
Real User
Good service catalog and dashboard but the application performance monitoring module needs more functionality
Pros and Cons
  • "The service catalog helped improve our organization by giving a good view of the flow for our microservices applications."
  • "The dashboard could be improved. It would be helpful to get a view of specific things that we need to monitor for our application."

What is our primary use case?

We primarily use the solution for the service catalog.

We use this type of offering for our Microservices applications, and it gives a good view of flow. It is a must when we have different developers working on different services.

Having the trace and log features is useful for locating the microservice for the on-call person.

We would like to see some more useful applications for health monitoring where we can customize the cases based on data from the database.

It needs to have the facility to monitor data inside tables and the status of the UI.

How has it helped my organization?

The service catalog helped improve our organization by giving a good view of the flow for our microservices applications. It's important when we have different developers working on different services, and the trace and log features help the on-call person locate the microservice.

The application performance monitoring has also been useful. This module had a few functionalities that we needed for the application health check. This needs to have some more features to consolidate the view in one tree. We may need more of a one-stop shop on top of the dashboard, and that is missing in Datadog. We'd like to be able to scrap our existing monitoring tool.

What is most valuable?

The service catalog is very useful. We use this type of offering for our Microservices applications, and it gives a good view of flow. It is a must when we have different developers working on different services. The trace and log features have been useful for locating the microservice for the on-call person.

The dashboard is great. It is helpful to get a view of specific things that we need to monitor for our application. It has been a good way to watch specific things and add them together.

The application performance monitoring is an excellent aspect. This module had a few functionalities that we needed for the application health check. This needs to have some more features to consolidate the view into one tree, however.

What needs improvement?

The dashboard could be improved. It would be helpful to get a view of specific things that we need to monitor for our application. However, it was a good way to watch specific things and add them together.

The application performance monitoring module had very few functionalities that we needed for the application health check. This needs to have some more features to consolidate the view into one tree.

For how long have I used the solution?

I've used the solution for one month. 

Which solution did I use previously and why did I switch?

We previously used ITRS Geneos.

What other advice do I have?

We are using the latest version of the solution. 

I'd rate the solution seven out of ten. 

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Provider
PeerSpot user
reviewer2004186 - PeerSpot reviewer
Senior IT Manager at a financial services firm with 1,001-5,000 employees
Real User
Good tags, easy integration, and increases visibility
Pros and Cons
  • "The full stack of integrations made it easier to monitor the different technologies and platform providers, including Software as a Service providers, that otherwise would need a lot of work and customization to be able to see what is happening."
  • "The product could be improved by providing remote control to agents, enabling them to execute automation and collections without requiring another automation tool or integration."

What is our primary use case?

The main use cases are to provide visibility to costs for each product in the company as well as to consolidate all the observability in one tool. We are moving the team from being an operational team that needs to keep the tool up and running (applying patches and resolving problems) to a team that is focused on providing meaningful visibility of the systems, applications, and services of the company. We want to add value where the developers and the systems administrators are not able to focus.

How has it helped my organization?

The organization changed from having a team to operate different tools and providers to being a team worried about enabling and creating different dashboards, alerts, and automations in order to reduce downtime and increase the visibility of all the products, systems, and applications used. 

We moved from a full operation team to a team that adds value to IT, finance, product, back office, and any other team that requires correct information about the services provided while providing the possibility for them to create their own views and dashboards.

What is most valuable?

The tags are quite useful. They give us the capability to attach meaning to on-premises hardware (which was not possible outside of cloud solutions and containers) as well as to tag traces and logs.

The full stack of integrations made it easier to monitor the different technologies and platform providers, including Software as a Service providers, that otherwise would need a lot of work and customization to be able to see what is happening. We'd also need to use several other separate tools that would require an increase in the required staff to operate them. Datadog gave us the opportunity to have a single platform for observability.

What needs improvement?

The product could be improved by providing remote control to agents, enabling them to execute automation and collections without requiring another automation tool or integration. 

Also, there is a lot of room to grow in the FinOps discipline. For example, it could potentially provide better and richer information for the teams to check the costs and optimize the product.

For how long have I used the solution?

I've used the solution for one year.

What do I think about the stability of the solution?

The stability is very good even though we have had some minor problems recently.

What do I think about the scalability of the solution?

The scalability is very good. We've had no problems until now.

How are customer service and support?

Technical support is good. That said, we had some cases that needed to be escalated to get to a faster resolution.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used AppDynamics. The tool was not providing good system visibility as it was limited and had a very high cost.

How was the initial setup?

The initial setup is somewhat complex. We needed to create new automation to install and deploy agents while meeting the security requirements of a financial company.

What about the implementation team?

We handled the implementation in-house.

What was our ROI?

The ROI is still being calculated.

What's my experience with pricing, setup cost, and licensing?

Users need to be aware of licensing control. With autodiscovery, the product can begin to come at a high cost.

Which other solutions did I evaluate?

We also looked into Splunk, ELK, and Dynatrace.

Which deployment model are you using for this solution?

On-premises

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2002896 - PeerSpot reviewer
VP at a financial services firm with 10,001+ employees
Real User
Good monitoring, dashboards, and flame graphs
Pros and Cons
  • "The most valuable aspect is the APM which can monitor the metrics and latencies."
  • "The correlation between the logs and the metrics needs improvement as, in most cases, we might use another logging tool (that is cheaper in cost) which we then have to link together."

What is our primary use case?

The product is used for APM solutions for the metrics and traces for the REST API requests and service maps to understand the upstream and downstream services.

We are creating dashboards and widgets to monitor the status. We are creating alerts and monitors as well. We integrated the alerts and ticketing system in our organization with SNOW and Netcool.

We are using Kubernetes, AWS, and infrastructure metrics. We are using Kafka and Aurora Postgres logs as well, and we are using HTTP status codes to identify the error types.

How has it helped my organization?

So far, the solution works very well and solves most of the problems we have. Currently, we are trying to integrate the trace ID into Datadog and correlate the logs and metrics. However, Datadog does not support the Spring-generated trace IDs, and they are not shown in the Datadog UI. It works in reverse. This means Datadog injects the DD-specific trace ID into the application logs, and those logs can be in other tools, for example, CloudWatch and Splunk.
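The general pattern described above, stamping a trace ID onto every application log line so that logs stored in another tool can be joined back to traces, can be sketched as follows. This is an illustration in plain Python `logging`, not the Datadog tracer or Spring integration: the `trace_id` field name, the `current_trace_id` context variable, and the helper names are all assumptions.

```python
import contextvars
import json
import logging

# Hypothetical holder for the active trace ID; a real tracer would set this.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


class TraceIdFilter(logging.Filter):
    """Attach the active trace ID to every log record passing through."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only annotate it


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON line that includes the trace ID."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        })


def build_logger(name, stream):
    """Create a logger that writes trace-annotated JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger
```

With the ID present in every line, a log search in any backend can be filtered by the same value shown on a trace. The limitation the reviewer describes is the other direction: the ID must be one the tracing tool itself recognizes.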

What is most valuable?

The most valuable aspect is the APM which can monitor the metrics and latencies. There's a low error rate, and any alerts can be tagged to the service requests and sent via email to the required DLs. 

We can create incidents as well in our internal tools, like SNOW and Netcool.

The monitoring enables different dimensions of metrics to monitor the services and infrastructure. 

We have cloud infrastructure monitoring for Kubernetes nodes, pods, containers, and ingress metrics.

Alerts are sent to an email in case of any issues. The metrics are used to create alerts.

The solution offers good dashboards, service maps, traces and flame graphs, HTTP status codes, power packs, service catalogs, and profiling.

While the logs module is not activated, we are using all other modules.

What needs improvement?

The correlation between the logs and the metrics needs improvement as, in most cases, we might use another logging tool (that is cheaper in cost) which we then have to link together.

They can improve the SSO login as well. Currently, we have to log in again every two to three days by explicitly sending the login link.

For how long have I used the solution?

I've been using the solution for two years. 

What do I think about the stability of the solution?

The stability is awesome. 

What do I think about the scalability of the solution?

We are expanding beyond observability right now.

How are customer service and support?

They offer pretty awesome customer support.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We did not previously use a different solution.

How was the initial setup?

The initial setup was easy.

What about the implementation team?

We implemented the solution with the help of a vendor team.

What was our ROI?

I'd rate the ROI ten out of ten.

What's my experience with pricing, setup cost, and licensing?

I would recommend Datadog to others.

Which other solutions did I evaluate?

We also evaluated ECE and Splunk.

What other advice do I have?

The solution has a great support model.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user