What is our primary use case?
We are using the infrastructure and app monitoring side, such as process monitoring. We are using it in a very traditional way. We are not using the APM capabilities. When it comes to something like containers, we will generally use it on the host but not inside the container itself.
We are using it with our customers and in-house day-to-day.
How has it helped my organization?
It provides more cloud data. Datadog tends to just get the way a service would be designed on the cloud. It can handle a server disappearing and account for it, but they will kick somebody out.
The ease with which we can filter, use metrics, and give accounts to customers, then let the customers filter and set up metrics and alerts themselves, has been a big win for us. This can't be done with a lot of the other platforms, and it has made things considerably easier. Where we used to get "What's my performance?", we can now say, "Here, have access. Go nuts. Tell us if you need anything." Our customers no longer ask us for all that, as they want to go do it themselves. This has made our lives infinitely easier.
What needs improvement?
The only thing that has thrown us from the beginning (and they are still missing it) is consistency in the APIs. There are a couple of guys on the automation side who rightfully complain about how hard it is, because every new feature that comes out has a new way of interfacing with the API. This was our big red flag in the beginning, but given the price and other features, it wasn't enough for us to discount the product. We said that we would live with this one red flag, but it is still a red flag.
Stability of the product has been a concern for us outside of the primary monitoring agents.
It does not have the best interface.
For how long have I used the solution?
Three to five years.
What do I think about the stability of the solution?
We haven't noticed any issues in the primary use case for which we are using it.
The reason we're not using or looking at the APM space right now is due to platform availability. Datadog doesn't support enough platforms, which they know. Every customer that we have is running PHP, and we cannot use APM with any of our customers because of that. Even if they are 95 percent running Java, if Datadog doesn't have PHP, we can't use it because it won't integrate.
What do I think about the scalability of the solution?
Scalability has not been a concern at all. We have had customers with steady-state loads, both low and high. Our smallest customer is a friends-and-family startup which has about three instances. We have steady-state loads of more than 500 instances. Then, we have customers who run two instances all summer, but do seasonal work in the winter and can scale to more than 1000 instances.
We have never noticed a hiccup on Datadog with any of our scaling. It has always grown to meet our needs.
How are customer service and technical support?
We have used technical support for certain integrations. We use a lot of Ansible and Chef, and we have had a lot of problems with both of these automation components. Technical support was helpful within their limitations.
Which solution did I use previously and why did I switch?
We switched when we started getting heavy into the cloud. We used to use ScienceLogic, New Relic, AppDynamics, Zabbix, etc. It was a hodgepodge.
We were very strong in the APM space. We had all of our APMs going through AppDynamics, which suited a lot of our customer use cases in the cloud. However, when our customers started to get more specific, they wanted traditional core monitoring, and the other traditional on-premise vendors, like ScienceLogic, weren't cutting it. That is when we started to look at Datadog. We went back and forth for a while between Zabbix and Datadog. In the end, Datadog won out based on features, price, and everything together.
How was the initial setup?
The integration with the AWS environment has been pretty seamless. There have been a few services that we don't use that they don't have out-of-the-box support for. However, usually that happens when it is a new service which is really unpopular. Most of the time, our customers shouldn't have been using that service to begin with, since it's a legacy thing that we inherited. I can't think of a single case where we haven't told the customer, "You have to get off of that."
What was our ROI?
It has saved us a lot of trouble in implementation.
What's my experience with pricing, setup cost, and licensing?
The pricing has gone up a bit relative to their competitors. It is not that the price has risen, but that the competitors' prices have gone down. They also keep adding features as extras that I would have expected to be baked in at a more nominal price. I have been increasingly dissatisfied with the pricing, but not enough to jump ship. It is still pretty good.
What other advice do I have?
Check the APIs very carefully. Without fail, this is the single biggest complaint for automation and operations. It is not that it can't be done. Just make sure that you have the technical expertise to work around it.
We use a mixture of both AWS and on-premise. There are actually three scenarios:
- Some of our customers purchase it for AWS.
- Some of them were accounts that we set up directly on Datadog for our customers.
- In some cases, customers already have a relationship with Datadog.
Some customers have a mixture of these scenarios due to regulatory reasons.
Disclosure: My company has a business relationship with this vendor other than being a customer: Reseller.