Folks,
What are your experiences in using Splunk as an enterprise-class monitoring solution in either the infrastructure or application performance monitoring spaces? How might it compare to a mature (or even immature) instance of CA's suite, inclusive of APM/Wily, CEM, ADA, and UIM?
Looking for insight into the level of granularity of data that can be collected, timeliness of the data, as well as the footprint needed to collect it.
Thanks!
Hi,
Well, I will summarize my answer in the simplest possible way.
It all depends on your business pain points versus your expectations of the solution.
First and foremost, Splunk is by definition a log analytics tool, not an APM solution: it doesn't give you end-to-end user experience. In brief, it has no real user monitoring, no code-level monitoring, no machine monitoring as far as I know, and no transaction/business analytics.
So again, it depends on what you are looking for, but if you're looking for E2E user visibility from a service availability and performance perspective, then Splunk is not the answer.
Hope it helps. Thanks.
Totally agree. Splunk is mainly an IT Ops Analytics solution (log management, event collector, metrics warehouse), but it is not an APM or, generally speaking, a "probe" solution. I'd suggest using Splunk instead as a collector of data coming from several monitoring tools / probes. Hope it helps. Cheers. L
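To make the "Splunk as a collector" idea concrete, here is a minimal sketch of how a probe could hand one result to Splunk through its HTTP Event Collector (HEC). The host name, token, sourcetype, and event fields below are placeholders I made up for illustration, and HEC has to be enabled on the Splunk side first, so treat this as a starting point rather than a reference setup.

```python
import json
import time
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="monitoring:probe", timestamp=None):
    """Build (but do not send) a POST request for Splunk's HTTP Event Collector.

    base_url, token and sourcetype are placeholders supplied by the caller;
    nothing here is specific to any one monitoring tool.
    """
    payload = {
        "time": time.time() if timestamp is None else timestamp,
        "sourcetype": sourcetype,
        "event": event,  # the probe's measurement, as a JSON-serializable dict
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it would be something like urllib.request.urlopen(build_hec_request("https://splunk.example.com:8088", "YOUR-TOKEN", {"check": "login", "ok": False}), timeout=10). Building and sending are kept separate so the payload can be inspected without a live Splunk instance.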
While Splunk is sometimes used for application, network, or server monitoring, primarily via insights garnered from logs, customers looking for insights into applications, servers, or networks may well be better served by solutions that focus on collecting and making sense of data from those sources.
Take, for example, CA APM. The APM solution collects deep performance data from Java, .NET, Node.js, and more with easily deployed agents that automatically determine the correct metrics to collect. In addition, these solutions can track transactions from the user endpoint, through the application and middleware layers, and right on into backend environments such as the mainframe. By automatically collecting this data, the CA APM solution removes the need for development organizations to retrofit applications to log the appropriate content.
Add to this the ability to manage mobile applications, collect crash data, analyze user session data, and determine application flow data, and the integrated APM and Mobile APM solutions provide a robust picture of your IT Applications.
CA UIM extends this automatic expertise into the server, storage, and infrastructure layers, as does CA ADA within the network. This data can be collected automatically with no or negligible footprint. Data collected via approaches specialized for each type can then be fed into an open, powerful analytics engine built on ELK to further understand this data.
Even better...
It took us about 10 minutes to install New Relic on a site and another 10 minutes to start collecting information
Their system handles it all, and you do little more than put a small tag or piece of code in your app
Splunk is more about data analytics: analyzing problem areas in general by correlating events from multiple sources, for the same or multiple applications, to recognize a problem, and then using that data over a long time for trending. A true APM, by contrast, can provide specific information for a particular application and its related/integrated servers and apps. APM can show end-user experience, web UI issues, and other problems specific to the application and its relationship with the back-end DB server. It follows the calls made from the user to the app server and how they shape up to complete the transaction from start to finish, e.g. web calls, app-to-DB calls, slow DB queries, stalled calls, slow calls, hung or failed transactions, etc.
Hope this helps.
www.splunk.com
As developers reach down the stack and network engineers stretch upward, they must meet in the middle with visible, integrated data from both ends. You need visibility to everything, and Splunk is that platform where you have access to all that data throughout all.
Given everything New Relic does, it's much better, so why bother with Splunk in this case?
I have created external dashboards for upper management
It can track APM, browsers, and Ajax, as well as Java or Microsoft server software
You can write your reports against what they collect.
And I have shown upper management that their subscription to ping tests is a waste of money
With New Relic I have written small scripts that go to the server, call up a page, try to log in, then log out on a dummy account
This has reported when our consultants have cheated and taken down the server at 3am to change prod code!!!
Meanwhile the pings said nothing, because the server box/instance itself was running fine; they had just cycled the service for the application
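For anyone who wants to see the shape of such a scripted check, below is a rough stand-alone sketch in plain Python. It is not New Relic's scripting API, just the same idea: load a page, log in with a dummy account, log out, and fail as soon as any step misbehaves. The URLs, form field names, and expected page text are all hypothetical.

```python
import http.cookiejar
import time
import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Form = Optional[Dict[str, str]]
Fetch = Callable[[str, Form], Tuple[int, str]]
# (step name, URL, form fields or None, text expected in the returned page)
Step = Tuple[str, str, Form, str]


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float
    detail: str = ""


def make_http_fetch(timeout: float = 10.0) -> Fetch:
    """Return a fetch function that keeps cookies so the login session survives between steps."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(http.cookiejar.CookieJar())
    )

    def fetch(url: str, form: Form = None) -> Tuple[int, str]:
        # A form turns the request into a POST; otherwise it is a GET.
        data = urllib.parse.urlencode(form).encode("utf-8") if form else None
        with opener.open(url, data=data, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")

    return fetch


def run_synthetic_check(steps: List[Step], fetch: Fetch) -> List[StepResult]:
    """Run the steps in order and stop at the first failure."""
    results: List[StepResult] = []
    for name, url, form, expected in steps:
        started = time.monotonic()
        try:
            status, body = fetch(url, form)
            ok = status == 200 and expected in body
            detail = "" if ok else f"unexpected response (status {status})"
        except OSError as exc:  # refused connections, timeouts, HTTP 4xx/5xx from urllib
            ok, detail = False, str(exc)
        results.append(StepResult(name, ok, time.monotonic() - started, detail))
        if not ok:
            break  # later steps depend on the earlier ones
    return results
```

Run it on a schedule with run_synthetic_check(steps, make_http_fetch()) and alert on any result where ok is false. A ping only proves the box answers; this fails when the application service is down even though the host is up, which is exactly the 3am case above.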
That's my two cents…
I prefer New Relic
Once I saw what Splunk was at its core…
Splunk alone can’t be used or defined as a “primary enterprise monitoring system.”
Splunk is for centralizing and analyzing your logs. It is capable of generating alerts, so I can see how this functionality can be confused with Nagios. But Nagios is an infrastructure and services monitoring and alerting solution. It can monitor things that don't necessarily have logs, like CPU usage or the number of processes, and can even check for SSL certificates that are about to expire. Logs may not tell you that Apache has stopped responding to HTTP requests, whereas Nagios can.
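As an illustration of a check that has no log behind it, here is a small sketch of a certificate-expiry probe written in the Nagios plugin style (exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN, plus one line of status text). The 30-day and 7-day thresholds and the function names are my own choices, not anything Nagios ships with.

```python
import socket
import ssl
import time

# Exit codes used by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def days_until_expiry(not_after: str, now: float) -> float:
    """Convert a certificate 'notAfter' string into days remaining from 'now' (epoch seconds)."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def classify(days_left: float, warn_days: float = 30, crit_days: float = 7) -> int:
    """Map days remaining onto a state; the thresholds are illustrative assumptions."""
    if days_left <= crit_days:
        return CRITICAL
    if days_left <= warn_days:
        return WARNING
    return OK


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the server certificate's 'notAfter' field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_certificate(host: str) -> int:
    """Print a one-line status and return the matching plugin exit code."""
    try:
        days = days_until_expiry(fetch_not_after(host), time.time())
    except (OSError, ValueError, KeyError) as exc:
        # Note: an already expired certificate fails verification and lands here.
        print(f"UNKNOWN - {exc}")
        return UNKNOWN
    state = classify(days)
    print(f"{LABELS[state]} - certificate for {host} expires in {days:.0f} days")
    return state
```

A wrapper script would end with sys.exit(check_certificate("www.example.com")) so the scheduler can read the exit code. The same pattern (measure something directly, compare against thresholds, return a state) covers CPU usage, process counts, or an HTTP request against Apache.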