What is our primary use case?
There are multiple use cases, which include heat maps, glass tables, and predictive analysis.
The first one is mainly related to heat maps. For example, if you want to monitor the health of a server, you can prepare heat maps for that. Email alerts can get missed because people are too busy to check their inboxes. With heat maps, the color changes automatically. A cron job runs behind the scenes, so you don't need to run the checks manually.
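To make the idea of a color that changes automatically concrete, here is a minimal sketch of how a metric value could be mapped to a heat map color. It is plain Python, not Splunk or ITSI code. The band limits, the color names, and the function names `tile_color` and `heat_map` are all assumptions chosen for illustration.

```python
from typing import Dict

# Illustrative severity bands: (lower bound in percent, color).
# These numbers and colors are assumptions, not ITSI defaults.
SEVERITY_BANDS = [
    (90.0, "red"),     # critical
    (75.0, "orange"),  # high
    (50.0, "yellow"),  # medium
    (0.0, "green"),    # normal
]


def tile_color(cpu_percent: float) -> str:
    """Return the color of the first band whose lower bound the value meets."""
    for lower_bound, color in SEVERITY_BANDS:
        if cpu_percent >= lower_bound:
            return color
    return "gray"  # below every band (for example a negative reading): unknown


def heat_map(readings: Dict[str, float]) -> Dict[str, str]:
    """Map each host's latest reading to a tile color."""
    return {host: tile_color(value) for host, value in readings.items()}


if __name__ == "__main__":
    # Hypothetical hosts and readings.
    print(heat_map({"web-01": 42.0, "db-01": 97.5}))
```

A scheduled job would call something like `heat_map` with fresh readings on every run, which is why nobody has to refresh the view by hand.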
You can also set up a glass table in ITSI for the architecture. For example, a setup like Amazon's would have web services, databases, queues, and other components. For purchases and similar operations, it has to connect to the external world, so you lay out the complete architecture on the glass table and assign threshold values. If there is an issue at any point, for example with database connectivity, the affected element changes color, which helps you easily identify where the issue is.
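The way a problem in one component shows up across the architecture can be sketched as a roll-up over a dependency map. This is only an illustration in plain Python, and it is not how ITSI actually computes service health. The service names, the three-level severity scale, and the function `rolled_up_status` are hypothetical, and the sketch assumes the dependency map has no cycles.

```python
# Severity levels from best to worst (an assumed, simplified scale).
SEVERITY_ORDER = ["green", "yellow", "red"]

# Hypothetical architecture: each service lists what it depends on.
DEPENDENCIES = {
    "storefront": ["web", "payments"],
    "web": ["database", "queue"],
    "payments": ["database"],
    "database": [],
    "queue": [],
}


def rolled_up_status(service, own_status, deps=DEPENDENCIES):
    """Return the worst status among a service and everything it depends on.

    own_status maps a service name to its own measured status; services
    missing from the map are treated as "green".
    """
    worst = own_status.get(service, "green")
    for dep in deps.get(service, []):
        dep_status = rolled_up_status(dep, own_status, deps)
        if SEVERITY_ORDER.index(dep_status) > SEVERITY_ORDER.index(worst):
            worst = dep_status
    return worst


if __name__ == "__main__":
    # A database connectivity problem turns the storefront red as well.
    print(rolled_up_status("storefront", {"database": "red"}))
```

With the status laid out this way, a red storefront can be traced down to the one component that is red on its own, which is the benefit described above.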
It also has a concept called predictive analysis. For example, your WhatsApp chat backup happens every 24 hours or 7 hours, but you cannot predict how much bandwidth it is going to use during the backup. It might even use 100% of the bandwidth, so you cannot set a proper static threshold. In such cases, you can use predictive analysis. It analyzes the data patterns and, based on them, predicts whether everything is fine or whether something is going to fail.
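The backup example can be illustrated with a very simple baseline built from historical data. The sketch below is plain Python and uses a mean plus or minus k standard deviations per hour of day, which is a stand-in technique chosen for illustration and not necessarily the algorithm ITSI uses. The function names, the value of `k`, and the sample numbers are assumptions.

```python
from statistics import mean, stdev


def dynamic_bounds(samples, k=3.0):
    """Return (lower, upper) as mean -/+ k standard deviations.

    Needs at least two samples, because stdev() is undefined for one.
    """
    mu = mean(samples)
    sigma = stdev(samples)
    return mu - k * sigma, mu + k * sigma


def is_anomalous(hour, value, history_by_hour, k=3.0):
    """Compare a reading with what is normal for that hour of the day."""
    lower, upper = dynamic_bounds(history_by_hour[hour], k)
    return value < lower or value > upper


if __name__ == "__main__":
    # Hypothetical bandwidth usage in percent, keyed by hour of day.
    history = {
        2: [95, 97, 96, 98, 94],   # backup window: near 100% is normal
        14: [20, 22, 19, 21, 18],  # mid-afternoon: usage is normally low
    }
    print(is_anomalous(2, 96, history))   # same value, inside the backup window
    print(is_anomalous(14, 96, history))  # same value, outside the backup window
```

The same 96% reading is treated differently depending on the hour, which is the point of learning the pattern instead of fixing one threshold for the whole day.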
You can integrate it with ticketing tools. For example, if you've integrated Splunk with ServiceNow and something happens on a server or PC, a ticket is automatically created in ServiceNow.
There's also a concept of episode review, which groups alerts so that there's no ticket spam in ServiceNow. For example, if a server you are monitoring goes down, there might be 10 to 20 alerts, which would create 10 to 20 separate tickets and spam your ticketing system. In such cases, you can use the episode review feature. It merges all those alerts into one ticket and includes all the details in it.
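The grouping idea can be sketched in a few lines of plain Python. This is not ITSI's aggregation logic. It assumes, purely for illustration, that alerts are grouped by host, and the field names `host` and `message` as well as the function `group_into_episodes` are hypothetical.

```python
from collections import defaultdict


def group_into_episodes(alerts):
    """Group alerts by host so that each host produces a single ticket."""
    by_host = defaultdict(list)
    for alert in alerts:
        by_host[alert["host"]].append(alert["message"])
    return [
        {"host": host, "alert_count": len(messages), "details": messages}
        for host, messages in by_host.items()
    ]


if __name__ == "__main__":
    # Hypothetical alerts raised when one server goes down.
    alerts = [
        {"host": "app-01", "message": "ping failed"},
        {"host": "app-01", "message": "HTTP check timed out"},
        {"host": "app-01", "message": "agent heartbeat missing"},
        {"host": "db-01", "message": "disk usage high"},
    ]
    for ticket in group_into_episodes(alerts):
        print(ticket)
```

Four alerts become two tickets here, and the ticket for `app-01` still carries every individual message, so no detail is lost by merging.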
How has it helped my organization?
Splunk ITSI allowed us to monitor the health of servers. We can also monitor an application end to end and identify data patterns. Ticket creation can be automated with it, and we can also do log monitoring.
It's also helpful for developers. When they create an application and there is an issue in their code, a request is automatically triggered to the engineering team, based on the output data, stating that there is an issue with the code.
The visibility into an application is very good if you configure everything properly. You first have to analyze the application inside and out by using a monitoring tool such as Elastic or Splunk, and afterward, you have to place the monitors in the right places for end-to-end visibility. It's like a home security system: unless you place the devices properly, you cannot say that the home is completely secured. If you don't have cameras at the main entrance and the windows, the home isn't properly secured.
Splunk ITSI's predictive analytics is very good for preventing incidents before they occur. Everything has patterns, and you let the machine analyze the data with an algorithm and predict whether the incoming data follows the expected pattern. Splunk analyzes the data patterns based on the historical information that we give it, and after analyzing that information, it creates triggers. If the data that we feed into the machine is incorrect, the predictions won't be reliable.
The alerts are accurate. In Splunk, the data is almost real-time, so we get tickets in real time. If there's a failure, we can roll over to the backup applications immediately. It saved about a million euros for one of our clients. They were having an issue with the Symantec antivirus, which blocked the complete Citrix environment, so the workers were not able to sign in and access the application, and that led to an outage. Within a matter of minutes, Splunk triggered a ticket, they identified that the issue was with this particular antivirus, and they blocked it.
Splunk ITSI has helped streamline our incident management. It is efficient at grouping tickets and at sending tickets with meaningful information. With the alerting system, you can include as much information as you want using the Splunk monitoring tools. You can add links to the ticket, or you can attach a separate set of guidelines telling the engineers what has to be done. Grouping the tickets has also helped a lot to avoid spamming.
Splunk ITSI has reduced our mean time to detect. Based on my experience and the feedback from others who are using it, it has saved a lot of time. The time reduction is significant when compared to other tools in the market.
It has reduced our mean time to resolve. Glass tables have been very helpful. With Splunk ITSI, you can lay out the heat maps and services according to the application architecture to easily identify where an issue is coming from.
What is most valuable?
I find the episode review, glass tables, and correlation search features very useful.
What needs improvement?
Microservices is the only area where Splunk ITSI can be improved. When traffic moves from one EC2 instance to another, there's a lack of visibility into the microservices, so we can't know what's happening. Apart from that, it's doing pretty well.
For how long have I used the solution?
I've been using Splunk ITSI for five or six years.
What do I think about the stability of the solution?
I'd rate it a nine out of ten in terms of stability.
What do I think about the scalability of the solution?
I'd rate it a nine out of ten in terms of scalability.
How are customer service and support?
It isn't 100% satisfactory in all cases. About 80% of the time, they are good, and about 20% of the time, they aren't as good. They can be very slow. We also had an incident where we asked them to upgrade us to a newer version, but in that latest update, Splunk had removed some concepts because of price issues. Because a particular module had been removed, our complete environment failed. It took us a day to roll back the version and get back to normal. Overall, I'd rate them a seven out of ten.
Which solution did I use previously and why did I switch?
I used VMware vSphere and a CA Technologies tool. We switched to ITSI because they offered very little optimization. There is also a significant difference in data parsing, and with ITSI we have real-time data.
How was the initial setup?
At the beginning of my career, I found it to be complex because you need to know a lot of areas, such as network and firewall rules, routing methodologies, and clustering. I kept learning along with my teammates, and I'm comfortable with it now.
What about the implementation team?
In the beginning, my teammates helped me, but now I don't need any help. Depending on the load and the environment, I can build things.
What was our ROI?
One of our clients was paying two hundred thousand to three hundred thousand dollars for a report based on the complete data, even though they could get the same data by running a couple of queries against the database. After the implementation of Splunk, we used something called DB Connect. It was a small tweak, and after that, the cost dropped to eighty or a hundred dollars per annum. All they are doing now is running SQL queries, getting the data back into Splunk, and based on that, triggering and sending a report. That's it. It was all about preparing proper monitoring. The data was already available. We prepared the alerts, and along with them, we also prepared dashboards so that the users can visually review the historical information for the past one or two years. They can even see the report month by month. Going from two hundred thousand dollars to less than a hundred dollars is an enormous saving.
What's my experience with pricing, setup cost, and licensing?
Its pricing has changed in line with the market. You get a good support service with it as well. They have 24/7 customer support. There is a portal, and if you are having issues, they are available to resolve them. So, its pricing isn't too high.
What other advice do I have?
I'd advise learning the tool properly, understanding its capabilities, and utilizing it efficiently. One of our clients was paying hundreds of dollars towards the license, but they were utilizing it only for server monitoring.
To someone who already has an APM solution but is considering switching to Splunk ITSI, I'd say that switching to ITSI is going to help them a little bit more. The grouping of tickets for users can be easily planned. It's not rocket science. It's easier than in other tools, where you need to create a lot of configuration for that. The configuration is segregated, which makes it easy for the application teams to set up their own monitoring and group alerts to reduce the number of tickets generated. You also have predictive analysis along with heat maps and glass tables, which aren't available in other APM tools in the market right now.
Overall, I'd rate Splunk ITSI an eight out of ten.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.