I've found the most valuable features to be:
- Being able to drill down to see data, and
- Being able to capture all the timing information across different functions.
It is part of our regular process for every application roll-out: we have a standard visibility process for any application that rolls out. It gives us the ability to train our people and provide a more responsive application. We used to have many tools with many different functions, and now APM allows us to consolidate a lot of them.
The mapping between applications and servers is not very intuitive.
Another thing we come across is that our technology just doesn’t report to New Relic. That can be addressed with a plugin/SDK, but we can’t really make the case for that investment yet.
Another thing is that we’re micro-service based, and the New Relic interface only gives us views into the top 100 services out of 50,000. Typically when we monitor our system, we use a heat map, and New Relic only provides us the second-level view of that. Ideally, it would also provide us the first-level view. Eventually, we’d like New Relic to step up to do that.
Finally, it should ideally do two things:
- Isolate the problem right away without the user having to do a lot of analysis. Right now, New Relic provides a lot of data points that I have to dig into to understand.
- It has its own dashboard, and I’d like to be able to integrate that into our own system, using an API to pull out the data.
Sometimes when we pull data from New Relic, we time out or drop data, and we can see when that happens, but we're not sure if it’s us or them.
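One way to narrow down the "us or them" question is to label every pull attempt by how it failed and count the labels over time. The sketch below is only an illustration and does not use New Relic's API: `classify_pull` is a hypothetical helper, and `fetch` stands in for whatever code already pulls the data and returns a list of data points.

```python
import socket
import urllib.error
from collections import Counter


def classify_pull(fetch, expected_points):
    """Run one pull via the caller-supplied fetch() and label the outcome.

    fetch is a hypothetical callable that returns a list of data points
    or raises the usual urllib/socket exceptions.
    """
    try:
        points = fetch()
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError (it is a subclass).
        # 5xx points at the remote side, 4xx at our own request.
        return "server_error" if err.code >= 500 else "request_error"
    except urllib.error.URLError as err:
        # Connection-phase timeouts arrive wrapped in URLError.reason.
        if isinstance(err.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "network_error"
    except (socket.timeout, TimeoutError):
        # Read-phase timeouts are raised directly.
        return "timeout"
    if len(points) < expected_points:
        # The call succeeded but returned fewer points than we asked for.
        return "short_response"
    return "ok"


def tally(fetchers, expected_points):
    """Count outcomes across several pulls."""
    return Counter(classify_pull(f, expected_points) for f in fetchers)
```

A tally dominated by `server_error` suggests the remote side, while `request_error` or `network_error` points closer to home. Timeouts and short responses stay ambiguous and would need to be compared against the vendor's own status information.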
Also, the alerting system has trouble with large alerts that come up slowly, so the operator has to know the system well and know what the alert levels (yellow, red, orange) mean.
It hasn’t scaled quite right so far. We use another tool for out-of-gate view. Currently, we manage about 60,000 servers in total, and we don’t have a good roll-up view of the entire system. The application on the server side is OK. We use other tools to monitor the environment.
So far, the interactions have been good, and they keep us in the loop as to what’s been done. In terms of the solution, it’s just OK.
I wasn't involved in the setup.
Engage the development community within the company early, and request an integration tool to make implementation easy.