I'm not sure which version we're using, although I believe it to be the latest.
We essentially use the solution in advance of performance testing, performance monitoring, and troubleshooting.
The solution is useful in many ways for troubleshooting applications. If we encounter issues while troubleshooting under load, we can use Datadog usage metrics to see which method or area is having difficulty and resolve the issue.
The solution has valuable troubleshooting and instrumentation features.
The setup was a bit complex.
As Datadog is a bit on the expensive side, I would recommend it for simple, uncomplicated, solutions.
I have been using Datadog for four or five years.
The solution is sufficiently stable.
As our company does not have many users who are making use of the solution at present, I have not encountered issues with its scalability.
I do not have plans to increase the usage at the moment.
I cannot comment on tech support, as I have not made use of them, although my team members have. It seems okay.
We mainly work with Datadog, although we also work with Dynatrace.
The initial setup was a bit complex.
We hired a separate team to handle the implementation.
At present, not much staff is needed for the deployment and maintenance.
I am not in much of a position to address the question of return on investment, although I believe it is for the better that my organization employs the solution.
I do not have knowledge of the licensing costs.
We are just the users of the solution. There are not many of us in our company.
The solution is good for complex integrations which have lots of downstream dependencies and involve multiple combined systems. This is where it is useful.
I rate Datadog as a nine out of ten.
I am using Datadog for error reporting.
I have found error reporting and log centralization the most valuable features. Overall, Datadog provides a full package solution.
I have been using this solution for approximately three years.
The solution is stable.
I used to process a million requests easily with Datadog, and I never had any scalability issues.
I have not needed to contact support.
The initial setup is easy.
Datadog does not provide any free plans to use the solution. When I start with a proof of concept it would be sensible to have a free plan to test the tool and check whether it fits the requirements of the project. Before the production stage, it is always good to have a free plan with some limited features, number of requests, or logs.
I rate Datadog a nine out of ten.
We primarily use Datadog for performance and log monitoring of cloud environments, which include VMs and Azure services like Azure compute, storage, network, firewall, and app services via event hubs.
Alerting based on monitors via Teams and PagerDuty.
Logs collection for Azure services like Azure database, Azure Application Gateway, Azure AKS, and other Azure services.
Custom metrics using a Python script to collect metrics for components not natively supported by Datadog.
Synthetic testing to ensure uptime and browser tests via CI/CD pipeline.
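As a rough illustration of the custom-metrics use case above, here is a minimal Python sketch that formats a gauge in the DogStatsD datagram format and sends it over UDP to a locally running Agent. The metric name, tags, and helper function names are hypothetical, and the sketch assumes the Agent is listening on the default DogStatsD port (8125); it is not the reviewer's actual script.

```python
import socket


def format_dogstatsd(name, value, metric_type="g", tags=None):
    """Build one DogStatsD datagram: name:value|type[|#tag1,tag2]."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send the datagram over UDP (assumes a local Agent on the default port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# Hypothetical metric for a component with no native integration.
payload = format_dogstatsd(
    "custom.queue.depth", 42, "g", ["env:dev", "component:legacy"]
)
```

Calling `send_metric(payload)` on a host where the Agent is running would submit the gauge; a script like this could be run on a schedule to cover components without a native integration.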
Datadog has improved our visibility into infrastructure topology and performance. It provided a simplified view and ability to drill down to system performance, process usage, and logs.
We were able to set up monitors for infrastructure and applications, as the metrics were readily available in the platform. Fine-tuning monitors is very easy and the ability to configure monitor alerts with details on how to resolve the alert is a key value add.
Integration with PagerDuty and Teams ensures timely alerting. The PagerDuty integration brings tags from Datadog to PagerDuty, which is very useful in routing incidents to the right service.
The Host Map and Live Process views provide performance metrics of our application. The support team likes using Datadog for identifying the resources affected and obtaining the logs.
Monitors are easy and quick to set up. Metrics are easily accessible and quick to use. The ability to send notifications based on metadata from the monitor is helpful. The setup for monitors is a one-time task and it works for all workloads, whether on Azure or any other cloud.
Logs rehydration helps us archive and rehydrate logs as we need. We don't need logs to be indexed at all times. Logs are required only for escalations and rehydrating does the job and provides cost savings.
We need the ability to create a service dependency map like Splunk ITSI. We have to build this in PagerDuty and it's not the best user experience. The ability to create custom inventory objects based on logs ingested would be a value add. It would be better if Datadog makes this a simple click and enable.
It would be helpful to have the ability to upgrade agents via the Datadog portal. Once agents are connected to the Datadog portal, we should be able to upgrade them quickly.
Security monitoring for Azure and Operating System (Windows and Linux) are features that need to be addressed.
Dashboards for Azure Active Directory metrics and events should be improved.
We have been using Datadog for more than six months.
Stability-wise, it has been good.
The scalability is good so far.
The support team has been very responsive. The only complaint is that, on issues they don't understand, they should have a quick call and unblock the customer.
We didn't have a solution in place. The only thing we had were logs.
Setup is hassle-free and pretty straightforward.
I deployed it myself.
No returns yet. We are in growth mode. If this becomes expensive we may have to look at alternative options.
The cost is high and this can be justified if the scale of the environment is big.
Datadog needs to provide better pricing for large customers.
Prior to implementing Datadog, we evaluated Splunk.
Overall, the Datadog product is really good.
It doesn't need a sales team, and yet the sales team has screwed up on some occasions. It's a great product, and the customer success team needs to put in extra effort to help customers with best practices rather than passing them off to support.
Customer success doesn't evangelize product features, and the customer doesn't know what new features are coming unless they ask about them.
The primary use case is application monitoring. We also use it to set custom metrics and watch our AWS metrics, as well as data.
At my current job, I have only used it for a couple of months. However, I used it for a few years at a previous company.
It lets us react more quickly to things going wrong. Whereas before, it might have been 30 minutes to an hour before we noticed something going on, we will know within a minute or two if something is off, which will let us essentially get something back up and running faster for our customers, which is revenue.
Its most valuable feature is the monitoring, such as all the custom metrics that Datadog imports from AWS. Also valuable is the specific monitoring where you can set up an alert to a bunch of different services.
Some of their newer solutions are interesting, like their logging, but they are not fleshed out. They could use more metrics or synthetics, which would be really helpful.
I would love to see support for front-end and mobile applications. Right now, it is mostly all back-end stuff. Being able to do some integration with our front-end products would be awesome.
It is very stable. Both times that I have worked with Datadog, we haven't had any issues with them going down. Or, if they did, we didn't know, which is good.
At the previous company that I worked at, we threw a lot at them all at once.
Because this is a newer integration, we are putting less stress on the tool. We are still working on integrating it into our platform.
It has scaled great. I haven't run into any problems anywhere that I've used it. They have handled everything that we have needed them to.
We are a 100 person company with 20 engineers.
The technical support is great. They respond quickly. They know what they are talking about and dig right in. If they don't know the answer, they can get it to us very quickly.
The integration and configuration through AWS was pretty smooth. It was easy to set up and start using. The documentation was clear. So, it worked really well.
We did the integration and configuration through AWS ourselves.
We haven't seen ROI at my current company. The solution is too new.
At my last company, we did see ROI, specifically around response time. We could immediately get to mission-critical things that were down and losing revenue. So, the product paid for itself.
The pricing and licensing through AWS Marketplace has been good. It would be nice if it was cheaper, but their pricing is reasonable for what it is. Sometimes, for their newer features, they charge as if the feature is fully fleshed out, even though it is newer and may have less to it than their other items. So, if they would scale the pricing appropriately as they add more to it, that would make sense. The pricing should reflect the abilities of the features.
We looked into self-hosting something, like Prometheus. We also evaluated New Relic.
We chose Datadog for its ease of use in getting set up and what they offered us.
Take the time to explore it and see all the metrics which are available. The metrics make the reporting better. Spend the time and learn the metrics. The things that they can send and give you are good. Learn how to aggregate them and how to write more complex queries, which they do a good job of showing how to do, but I found that newer people don't do this. They just try to use the baseline set of features. Doing the more complex stuff adds significant value.
We have PagerDuty integrated with it, as well as all of AWS. Those are the big ones we have running through it. It integrates well. It essentially replaces CloudWatch, so we can just use Datadog, which is nice. The biggest thing that they provide is putting everything in one spot.
I have just used the AWS version.
We can build dashboards as fast as we roll out new systems, which can be fast.
We use standard and custom metrics for every new system we roll out for 360 degree visibility into our systems.
The most valuable features have been: Sharable dashboards, TimeBoards, dogstatsd API, Slack Integration, Event logging API. CloudTrail Events, Tags, alerts, and anomaly detection. EBS Volume Snapshot Age, which they added upon request. We used PagerDuty integration for a while as well.
More granular control over dashboard sharing. Timeboard sharing.
There are infrequent hiccups, which have been decreasing over the time we have used it.
No.
Customer Service:
Never seen better. Questions are usually answered almost immediately, even on weekends, and in-stream with your event stream.
Technical Support:
High.
Overall they have always had an amazing team, and quality has been maintained as the company has grown.
Complementary to other tools we used.
Setup is generally easy. They provide a large number of integrations; some are more complex than others, which is to be expected.
In house implementation.
We didn’t calculate explicitly, but as we used the product to track down underutilized instances, it more than paid for itself in the first month.
Pricing overall in this segment has standardized in the last several years.
A few, including Zabbix and Icinga.
One of the fastest and most flexible tools we have used in this area.
We primarily use the product for tracing, metrics, and alarms in various deployment environments.
The product has provided our company with improved observability, which has helped make the incident response more targeted and quicker.
APM is great and has provided low-effort out-of-the-box observability for various services.
Monitors are helpful, and definitions are simple.
Terraform support is nice as it allows us to create homogenous monitoring environments in various deployment environments with little additional effort. It also facilitates version control of monitor definitions, etc.
The Golang profiler is generally good with the exception of delta profiles; it has provided helpful observability into Heap Allocations which has helped us reduce GC overhead.
Delta traces on the Golang profiler are extremely expensive concerning memory utilization. In a Kubernetes environment where we would like to set per-pod memory allocations as low as possible, the overhead of that profiler feature is prohibitive. In one case, our pods (which were provisioned to target 250 MB and max at 500 MB memory) got stuck in a crash loop due to out-of-memory, which was caused entirely by the delta profiles feature of the profiler.
Multistep Datadog synthetics lack the feature of basic arithmetic. For our use case, performing basic arithmetic on the output of previous steps to produce input for subsequent steps would be extremely useful.
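To make the request concrete, the following plain-Python sketch shows the kind of inter-step arithmetic described above: a value extracted in one step is transformed numerically before being used as input to the next. It does not use any Datadog API; the step functions, variable names, and values are all hypothetical.

```python
def run_steps(steps):
    """Run steps in order; each step sees the variables produced so far."""
    variables = {}
    for step in steps:
        variables.update(step(variables))
    return variables


def list_orders(variables):
    # Hypothetical step 1: values that would be extracted from a response.
    return {"total_items": 95, "page_size": 20}


def compute_last_page(variables):
    # The arithmetic step: ceiling division on outputs of the previous step.
    last_page = -(-variables["total_items"] // variables["page_size"])
    return {"last_page": last_page}


def fetch_last_page(variables):
    # Hypothetical step 3: the computed value feeds the next request.
    return {"request_path": f"/orders?page={variables['last_page']}"}


result = run_steps([list_orders, compute_last_page, fetch_last_page])
```

Here the middle step is the part the reviewer reports as missing: without it, the third step has no way to derive its page number from the first step's output.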
I've used the solution for nine months.
We use Datadog to monitor our Kubernetes clusters.
We have 3 different clusters for different parts of the SDLC. We run the Datadog agent DaemonSet as well as the Datadog cluster agent. Our services have the APM installed by default.
To create monitors, we use Terraform. This is provided out-of-the-box for our service owner.
We run K8s on EKS; therefore, we also make use of some of the AWS monitoring capabilities that can be integrated into Datadog.
We are hugely reliant on Datadog for all aspects of our system.
With Datadog, we were able to gain observability in our system.
The installation step is pretty straightforward.
It's easy to use by non-DevOps users. For instance, our engineers do not interact with K8s often; therefore, it is hard for them to debug. However, with Datadog, they are able to view their containers and deployments with a single click.
We also heavily use the tags to help us identify who the service owners are. This is super useful when we need to track owners for patching or pick up new features we implemented.
The APM and K8s monitoring are the most valuable aspects of the solution. The K8s monitoring allows all customers to view their infra, even if they do not use K8s daily. They can just click on a few tabs to get all of the information they need.
It is also very easy to install on our system. APM has helped debug applications on our system as well. We were able to view why a service has suddenly shut down.
We also use Datadog for SLOs/SLAs as well. We check the live endpoint of services to ensure they are still up and running.
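The arithmetic behind an uptime SLO of this kind is simple, and a short Python sketch may help. It assumes each endpoint check yields a pass/fail result and that the target is expressed as a fraction (for example, 0.999); the function names and the default target are illustrative assumptions, not Datadog's implementation.

```python
def availability(check_results):
    """Fraction of endpoint checks that passed (True means the check passed)."""
    if not check_results:
        raise ValueError("no check results to evaluate")
    passed = sum(1 for ok in check_results if ok)
    return passed / len(check_results)


def meets_slo(check_results, target=0.999):
    """Compare measured availability against an assumed SLO target."""
    return availability(check_results) >= target
```

For example, three passing checks and one failure give an availability of 0.75, which would miss a 0.999 target.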
There is not much that needs to be improved.
The UI is super user-friendly. The deployment process is easy. We enjoy using the integrations with Slack and PagerDuty.
Customer support is awesome from our experience. There is a lot of documentation for us to be able to use if we need to.
I'm not sure if Datadog can monitor K8s deployments in real-time. For instance, being able to see a deployment step by step visually. This would be helpful if there were any incidents during the deployment.
In general, Datadog is a great solution.
I've used Datadog since I joined my company about a year ago.
We haven't had issues with the stability.
The scalability is really great.
We've had no issues with the product or support.
The initial setup is super simple, and the documentation was helpful.
We managed the initial setup process in-house.
We've witnessed ROI in our DevOps.
We use Datadog to monitor our product on the cloud.
The performance of Datadog is good.
Datadog has a lot of features crammed into one dashboard. It's quite hard to work out which feature does what. There was a steep learning curve in trying to navigate through the menus.
The menu navigation could improve. If there was a more straightforward way of adding new functions or features to where each menu is placed that would be an improvement.
I have been using Datadog for approximately six months.
The solution is stable.
The scalability of Datadog is pretty good. However, we haven't arrived at the level that we need to test out the scalability.
We have five people using the solution.
I have not used the support from Datadog.
We have not used other solutions previously.
The initial setup is straightforward, and the process took approximately three days.
We have approximately six people that implement and support the solution.
Sometimes it's very hard to project how much it will cost for the monthly subscription for the next month when you add certain features. Having better visibility of the cost would give a better experience.
There are not any additional costs for the use of this solution.
I would recommend this solution to others.
My advice to others is to use a few features of the solution before going full scale.
I rate Datadog an eight out of ten.