In my recent projects, we have used ITSI to monitor the entire infrastructure using multiple features, such as service KPIs, aggregation policies, base searches, correlation searches, notable events, dashboards, glass tables, service analyzers, and drill-downs.
Works at a comms service provider with 1-10 employees
No other tool provides you with the same level of observability and enterprise security or the search and reporting applications
Pros and Cons
- "The most valuable feature of ITSI is the service KPIs. No other tool provides you with the same level of observability and enterprise security or the search and reporting applications."
- "ITSI is an almost perfect tool, but there is room for improvement in a few features like the deep dive and multi-KPI alerts. We're using most of the features like service KPIs, correlation searches, and aggregation, but our team members hardly use multi-KPI and deep dive. We don't use the multi-KPI or deep dive because everything is available in the service KPI. I don't think this feature is necessary."
What is our primary use case?
How has it helped my organization?
It helps in every respect, including performance, monitoring, or visualization of the important indicators. It improves the quality of service to the clients. It is crucial that the clients have no website failures because that means the loss of business. ITSI helps us track those issues. We've seen fewer environmental failures since we started using ITSI.
We saw immediate benefits from Splunk ITSI. For example, let's say you have a project for monitoring hybrid Linux servers running JBoss, SAP, and any server containing a client's critical data. It isn't easy to monitor each of these through the back end.
Splunk ITSI shows you all the data on the screen and lets you visualize the data from various applications. We can see all the applications running on the server and issues with CPU or memory utilization. We have that data in Splunk and can immediately see the alerts triggered. If there are any failures in the environment, we can fix them in seconds.
The solution has helped us streamline our incident management. We can monitor server KPIs, which trigger an alert if the server is impacted. We can track all the notable events and integrate ServiceNow with Splunk. ITSI is integrated with the ticketing tool, so when an alert triggers, it automatically creates a ticket on ServiceNow.
ITSI has also reduced the alert volume. Before ITSI, we were unsure why an issue happened. We would see the alerts triggered in bulk and log them one by one for every server. ITSI gives you a feature that lets you drill down to find the precise issues on the server.
It has a service KPI feature that allows you to monitor exceptions that may lead to server failure. For example, we might be in trouble if the exception count exceeds 10. We set a threshold of five or eight with high criticality, so an alert triggers whenever the count breaches it.
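As a rough illustration of the threshold idea described above, here is a minimal Python sketch. It is not ITSI configuration or ITSI's API: the band values (5 and 8), the severity labels, and the function names are assumptions chosen to echo the numbers in the review.

```python
# Hypothetical severity bands for an exception-count KPI.
# Each entry is (lower bound, label); bands are checked from highest to lowest.
THRESHOLDS = [(8, "critical"), (5, "high")]


def severity(count: int) -> str:
    """Return the label of the highest band whose lower bound the count reaches."""
    for lower_bound, label in THRESHOLDS:
        if count >= lower_bound:
            return label
    return "normal"


def should_alert(count: int) -> bool:
    """Alert on any band above 'normal'."""
    return severity(count) != "normal"


if __name__ == "__main__":
    for count in (3, 5, 9):
        print(count, severity(count), should_alert(count))
```

Setting the bands below the real danger point (10 in the reviewer's example) is what gives the team time to react before the server fails.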
ITSI reduced our alert noise because it was very hard to monitor every aspect when we used search and reporting. After running the query, we needed more insights, and ITSI gave us a clearer picture of the incident. That helps you reduce issues.
Many use cases can be automated through ITSI. We previously built our reports manually. After introducing ITSI, we sent all the data via the forwarders to Splunk. Once we have the data, we create and schedule all those queries and reports so that the management can see them without any IT involvement. It previously took us two or three hours daily to create all those reports, so automating reports saves almost 60 hours each month. We're automating 10 to 15 reports daily.
What is most valuable?
The most valuable feature of ITSI is the service KPIs. No other tool provides you with the same level of observability and enterprise security or the search and reporting applications.
ITSI has everything. We can create searches, email alerts, and dashboards. It's the only application that offers the KPI concept where we can monitor different KPI parameters. We can configure the KPIs to trigger alerts when they breach a set threshold.
You can use the core concepts to optimize performance. You can also create a lot of correlations and onboard the data from every project application. You can play with the data to create those KPI services and crash modes. It's possible to establish service health using the KPIs through the service analyzer. On a single screen, you have a lot of tiles showing you the service KPIs and high-level insights.
When I started working on ITSI, there was some lag in releasing predictive analysis. Since then, there have been several updates, and we see that it works. We can predict any fluctuation in the data that might lead to failure. Using the historical data, we can set up the adaptive threshold. ITSI analyzes the historical data and sets an analysis for the future.
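To make the adaptive-threshold idea concrete, here is a simplified Python sketch. It is not ITSI's actual algorithm: it assumes one upper bound per hour of day, computed as the historical mean plus `k` standard deviations, and all names and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def adaptive_thresholds(history, k=2.0):
    """Build {hour_of_day: upper_bound} from (hour_of_day, value) samples."""
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    # Upper bound = mean + k * population standard deviation for that hour.
    return {hour: mean(vals) + k * pstdev(vals) for hour, vals in by_hour.items()}


def is_anomalous(hour, value, thresholds):
    """A value is anomalous only if we have history for that hour and it exceeds the bound."""
    bound = thresholds.get(hour)
    return bound is not None and value > bound


# Hypothetical CPU history: busy at 09:00, quiet at 02:00.
history = [(9, 40.0), (9, 44.0), (9, 42.0), (2, 10.0), (2, 12.0), (2, 11.0)]
thresholds = adaptive_thresholds(history)

if __name__ == "__main__":
    print(is_anomalous(9, 43.0, thresholds))  # typical for the morning peak
    print(is_anomalous(2, 43.0, thresholds))  # same value, unusual at night
```

The point of the sketch is that the same reading can be normal at one time of day and anomalous at another, which a single static threshold cannot express.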
What needs improvement?
ITSI is an almost perfect tool, but there is room for improvement in a few features like the deep dive and multi-KPI alerts. We're using most of the features like service KPIs, correlation searches, and aggregation, but our team members hardly use multi-KPI and deep dive. We don't use the multi-KPI or deep dive because everything is available in the service KPI. I don't think this feature is necessary.
People mostly use ITSI to monitor alerts. The most important features are within the service KPI. When we configure the alerts in service KPI, we don't need to do any deep dives because the client is more interested in the raw data, so we run the queries on the raw data instead of going into the deep dive.
Buyer's Guide
Splunk ITSI (IT Service Intelligence)
October 2024
Learn what your peers think about Splunk ITSI (IT Service Intelligence). Get advice and tips from experienced pros sharing their opinions. Updated: October 2024.
814,763 professionals have used our research since 2012.
For how long have I used the solution?
I have used Splunk ITSI for seven years.
How are customer service and support?
I rate Splunk support nine out of 10. It is very helpful. Whether you are connected to priority one, two, or three depends on the issue and its impact. You can also get help from the Splunk community. If you create a P2 ticket, they will reach out to you within an hour and resolve the problem in eight hours. They have different SLAs.
They might take one or two days to resolve issues. We need to upload the tags over the server to the portal. After that, they will start working on it. They have solved all the issues in the last four or five months within two to three days maximum.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We had Dynatrace. It was integrated to onboard the data and create correlation searches to monitor those parameters.
How was the initial setup?
Setting up Splunk ITSI wasn't difficult. A few files needed to be placed over the indexers, and a few more needed to be placed over the license master. I didn't have any issues installing ITSI from scratch. It takes 15 to 20 minutes, depending on the project. It can be set up with one to three people. When service KPIs are installed, we need to validate them after the installation and upgrade ITSI.
Which other solutions did I evaluate?
My friend works with OpenSearch. They are moving from Splunk to Cribl and OpenSearch. Splunk is pretty expensive, but it gives you a decent insight into the data. It is easy to learn, and ITSI has a great interface. You can run those queries and pass the data. I don't find any product attractive, and we need to put more thought into it.
What other advice do I have?
I rate Splunk ITSI nine out of 10. I have worked on multiple projects in the last seven years, and I've never found any product like ITSI. We can monitor everything through that. It's an excellent product.
Setting up and mapping the searches with the aggregation policies can be a little complex. Once you've mastered that, you can do anything with ITSI. You can monitor the whole project infrastructure. You don't need any other tool to monitor and visualize the data. ITSI is enough.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Aug 7, 2024
Observability Platform Lead at a financial services firm with 5,001-10,000 employees
A reliable solution that enables users to build glass tables and set up thresholds
Pros and Cons
- "The glass tables are very helpful."
- "If the product had some prebuilt machine learning features, it would add value to our use cases."
What is our primary use case?
I have used Splunk ITSI to build a lot of glass tables and set up thresholds. We have also used MLTK, which is an external application, for machine learning, predictive analytics, and anomaly detection. We get notified of issues well ahead of time, so we can take proactive action.
How has it helped my organization?
We use core Splunk and Splunk IT Service Intelligence. It is a multisite cluster environment. The product has helped our organization whenever a customer wants glass tables, notable events, or alert notifications. We can set up our own thresholds. We can also add ad-hoc searches in the solution. We can get index data and track alerts by writing a search query.
What is most valuable?
The glass tables are very helpful. The solution also provides topologies showing exceptions or criticalities whenever something goes down. It is very helpful for customers. The notable events, glass tables, and setting up thresholds are the most valuable features of the solution.
Every customer has a different need and their own customized threshold settings. Some customers need 99% as critical, and some need 80%. We can set the customized thresholds in the product and get the alerts.
What needs improvement?
If the product had some prebuilt machine learning features, it would add value to our use cases. It would be very good if the product had some in-built predictive analytics and future forecasting features.
For how long have I used the solution?
I have been using Splunk for almost four years.
How are customer service and support?
The support depends on the licensing we use. There are different licenses available based on the volume and vCPUs. We use the license based on vCPU. It depends on how many virtual CPUs we use. It would be good if Splunk could give on-demand support.
Whenever we raise a support case, the support team follows the SLA and gives us a response. Sometimes, companies will also have on-demand support based on the support credits. Companies generally expect support persons and engineers to join the Zoom sessions when P1 and P2 issues arise. The support team takes a long time to join the meetings at such times. If we can have an engineer join the Zoom sessions right away, it would be helpful for the customers. The support team needs to respond quickly to P2 issues.
We had a P3-level case with a severity level of S2. It was a corrupt bucket issue. The case was in open status for six months. Generally, we don't need six months to fix a corrupt bucket issue. If the support case had been escalated to a higher-level engineer with advanced knowledge in debugging the issues, it would have been easier and would have taken less time.
How would you rate customer service and support?
Neutral
Which solution did I use previously and why did I switch?
We have been using Enterprise Security. It is for intrusion detection and threat intelligence. It helps our enterprise security team to find vulnerabilities and take proactive actions. We started using Splunk IT Service Intelligence because it gives us some good topology if we build glass tables based on our data. The product provides us with service intelligence.
How was the initial setup?
The deployment process is straightforward. It is the same as core Splunk. The solution uses summary indexing, itsi_tracked_alerts, and itsi_summary_metrics indexes. We must ensure these indexes are available and have a good retention policy.
What was our ROI?
Our customers have seen improvements in resilience and cost.
What's my experience with pricing, setup cost, and licensing?
It would have been good if the product cost was much lower.
Which other solutions did I evaluate?
We chose Splunk over other vendors because it is much more reliable. We have done a POC to test how well the tool can help the customers and provide good value to their business. We have used other products like Elasticsearch and Cribl. However, we feel that Splunk is better. Log monitoring is very important to customers. Other log monitoring tools are not user-friendly and flexible. It is also not easy to write search queries on them. However, it is easy to write search queries on Splunk. It also has bucket lifecycles. It is easier to have a centralized repository to maintain and use the data.
What other advice do I have?
Our clients monitor multiple cloud environments. We get data from different third-party clouds like Google Cloud, Microsoft Azure, or AWS. Sometimes, we also use Snowflake. Customers mostly try to build out their own dashboards and knowledge objects. They use Splunk IT Service Intelligence to be notified about any exceptions or critical issues.
We cannot integrate the product directly with the cloud applications. First, we have to integrate our core Splunk with different clouds. We must first integrate add-ons using Splunkbase, a REST API mechanism, or an HTTP Event Collector (HEC) mechanism into core Splunk. Then, we can use the same ad-hoc search in Splunk IT Service Intelligence to get proper glass tables and results. It's easy to monitor multiple cloud environments using the solution, but we could directly integrate with it if it had the right integration features.
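For the HEC route mentioned above, a minimal Python sketch of what a single event submission could look like follows. It only builds the HTTP request and does not send it. The host, token, index name, and event fields are placeholders, not values from the review; the endpoint path and the `Authorization: Splunk <token>` header follow Splunk's documented HEC conventions.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="_json", index=None):
    """Build (but do not send) a POST request for Splunk's HTTP Event Collector."""
    payload = {"event": event, "sourcetype": sourcetype}
    if index is not None:
        payload["index"] = index
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host, token, and index: replace with your own HEC settings.
req = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"cloud": "aws", "metric": "cpu_pct", "value": 71.5},
    index="cloud_metrics",
)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Once events like this land in a core Splunk index, the same searches can back ITSI KPIs and glass tables, which matches the two-step integration the reviewer describes.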
It is important for our organization that the solution has end-to-end visibility into our cloud-native environment. In today's world, most data goes into the cloud. Every organization wants to move its data to the cloud so that it is more reliable and easier to access. It's more cost-effective as well. So, most organizations are going to the cloud. It's really beneficial and important to the customers because they can easily get the data from the cloud and perform cost optimizations. Managing cloud-native environments with the solution is cost-effective.
The product has definitely helped reduce our mean time to resolve by 70%. If it has built-in machine learning or artificial intelligence techniques, it will be helpful to reduce the remaining 30%.
The tool has helped improve our customer's business resilience. Different SIEM applications and tools are available for enterprise security in today's world. Splunk's next version will have enhanced SOAR features. It will be useful if the product has additional features to help customers and organizations.
We used the MLTK app from Splunkbase and deployed it in Splunk IT Service Intelligence. It helped us to do predictive analysis, forecasting, and anomaly detection. It helped us gain some insights. I rate the tool's ability to provide business resilience a seven out of ten.
If we have the Splunk add-ons for Unix and Windows, we can use them in our core Splunk to get base monitoring, like OS metrics. For these things, Splunk has PowerShell scripts. They run every five minutes, so the data is not real-time. Every organization needs real-time monitoring, and the product should provide these features in real time. For OS metrics, we use custom thresholds.
Our customers see time to value within seven days. We implement Splunk with a minimal architecture, like two deployment servers, two heavy forwarders, four indexers, and three search heads. We initially set both the search factor and the replication factor to two. We had very little data initially. We tested in our lower environment with the POC and found the data the customers wanted to see in Splunk. It was helpful for the customers. They can find the exceptions, write their own search queries, and build their own knowledge objects.
We get different types of security management tools in the market, like Enterprise Security, SOAR, and Phantom. The product brings a lot of value to the customers. It gives a lot of insights into notable events and predictive analysis. It also has a good dashboard. I expect the solution to provide enhanced features in the upcoming release.
Attending Splunk conferences provides us with an opportunity to interact and get more details on the products from different vendors. More than 1,000 vendors attend the conferences. The more we interact with the vendors, the more insights we get from them. It is also helpful to build relationships with the vendor.
Overall, I rate the tool an eight out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Senior Application Consultant at IBM
Helps reduce alert volume, streamline our incident management, and adds reliability
Pros and Cons
- "I particularly like the preview feature because it provides a prompt experience for impact analysis."
- "Currently, Glass tables in ITSI only display metrics related to KPIs."
What is our primary use case?
I worked on multiple projects using Splunk ITSI for log monitoring, including monitoring mobile data usage for a telecom company, working with an insurance company and a retail application, and monitoring payment applications for a bank.
How has it helped my organization?
The integration with Splunk ITSI allowed us to monitor and track issues through alerts. This integration also reduces the Mean Time to Identify as the team is quickly made aware of problems through the ITSM tool, and respective incidents are raised to the application team. Depending on the issue's type, we can prioritize the incident, even giving it a P1 priority. With this, the team is made aware, and since we track our issues in ServiceNow, related incidents can be deployed, which also helps reduce the Mean Time to Resolve. The application team then knows what actions to take.
Event management utilizes event correlation and event aggregation instead of generating numerous alerts that cause panic within the team because multiple areas might be affected by a single issue. This can be achieved through Splunk's native capabilities, like notable event aggregation policies and episode reviews for ITSM, or by utilizing third-party tools such as Netcool. By employing event management tools like Netcool and then sending aggregated incidents to ServiceNow or using ServiceNow's item model for implementation, the number of alerts is reduced, and the troubleshooting team receives relevant information instead of being overloaded. This approach helps mitigate panic and provides the team with the resources to effectively address issues.
End-to-end visibility for application monitoring in our use case required us to consider all involved components. We addressed this by creating hierarchical dashboards. This approach provided everyone, from business stakeholders to operations, with visibility into application health through relevant metrics. Business stakeholders, for instance, focus on high-level metrics like application health, user experience, revenue, and performance rather than technical details like CPU usage. Therefore, we tailored the dashboard hierarchy for different roles: business executives, operation leads, project managers, and operations staff. The operations dashboard provided end-to-end visibility by configuring all components of the application's functioning. Leveraging the familiar network architecture, we utilized the same topology to present metrics, creating a comfortable and easily understandable dashboard layout. By plotting all entities with their availability and performance metrics, we achieved comprehensive end-to-end visibility.
We have set up the environment correctly for the predictive analytics, and our metrics are flowing continuously. We have the required data, so we can configure at least 30 minutes of lead time to predict the metrics and their thresholds for potential impact. I can set this up, but I only had the opportunity to work on the project until anomaly detection. Predictive analytics was not a requirement, so I did not implement it. However, I understand it entirely and have explored and learned about it in their documentation.
For our telecom project, we focused on promotions as a use case. We aimed to identify the most popular promotions among users, especially during festivals and special occasions. Analyzing business metrics revealed that Promo Code 350 was the most frequently used, generating significant revenue. We presented these findings to the business team, showcasing how different promotions performed during various events. This information empowered them to design more effective offers and strategies, ultimately improving the customer experience. The business team appreciated our contribution, recognizing the value of data-driven insights in shaping their marketing efforts.
Splunk ITSI is a tool that helps our clients streamline their incident management. By integrating Splunk ITSI with ServiceNow and Netcool, we can reduce the burden of keeping up with the number of incidents and ensure they're updated.
Splunk ITSI helps reduce alert noise. We receive multiple alerts for each event when using any APM tool, Splunk, or log monitoring tool. Aggregating these alerts has always been helpful, and we've utilized Splunk's notable event aggregation policy to reduce alerting for each KPI to a single episode review.
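The aggregation idea can be sketched in a few lines of Python. This is not how ITSI's notable event aggregation policies are implemented: it assumes a simple rule where alerts for the same service join an existing episode if they arrive within a fixed quiet window, and the field names and sample alerts are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    service: str
    first_seen: int
    last_seen: int
    alerts: list = field(default_factory=list)


def group_into_episodes(alerts, window_seconds=600):
    """Group alerts (dicts with 'service', 'kpi', 'time') into per-service episodes.

    An alert joins the service's latest episode unless more than
    window_seconds have passed since that episode's last alert.
    """
    latest = {}    # service -> most recent Episode for that service
    episodes = []  # all episodes, in order of creation
    for alert in sorted(alerts, key=lambda a: a["time"]):
        current = latest.get(alert["service"])
        if current is None or alert["time"] - current.last_seen > window_seconds:
            current = Episode(alert["service"], alert["time"], alert["time"])
            latest[alert["service"]] = current
            episodes.append(current)
        current.last_seen = alert["time"]
        current.alerts.append(alert)
    return episodes


# Hypothetical alert stream: five raw alerts across two services.
alerts = [
    {"service": "web", "kpi": "cpu", "time": 0},
    {"service": "web", "kpi": "memory", "time": 120},
    {"service": "db", "kpi": "disk", "time": 200},
    {"service": "web", "kpi": "cpu", "time": 300},
    {"service": "web", "kpi": "cpu", "time": 2000},
]
episodes = group_into_episodes(alerts)

if __name__ == "__main__":
    for ep in episodes:
        print(ep.service, ep.first_seen, ep.last_seen, len(ep.alerts))
```

Here five raw alerts collapse into three episodes, so the team reviews three items instead of five.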
Splunk ITSI reduces our mean time to detect.
Splunk ITSI is resilient and highly capable of tracking issues, provided the necessary logs are configured. With proper configurations, metric values are obtained, allowing us to monitor KPIs and quickly identify any adverse effects. In such cases, we can seamlessly delve into the logs to pinpoint the exact root cause of the issue.
What is most valuable?
I enjoy designing glass tables, hierarchy dashboards, and the preview for ITSI. I particularly like the preview feature because it provides a prompt experience for impact analysis. We can directly track which specific service is impacted and identify the underlying affected entity. Also, we can quickly view the affected metrics. Overall, the Glass table preview is the most valuable feature.
What needs improvement?
Currently, Glass tables in ITSI only display metrics related to KPIs. I proposed adding an option to show metrics related to entities. This would eliminate the need for custom SPL to achieve this functionality. Since KPIs already have an entity split feature, extending this capability to dashboards makes sense.
For how long have I used the solution?
I have been using Splunk ITSI for five years.
What do I think about the stability of the solution?
I would rate the stability of Splunk ITSI nine out of ten.
What do I think about the scalability of the solution?
Splunk ITSI is scalable. It offers clustering for search heads and indexers, and we have the deployment server.
Which solution did I use previously and why did I switch?
I previously used AppDynamics but switched to Splunk after learning about it and finding it more interesting.
How was the initial setup?
The deployment is straightforward.
What's my experience with pricing, setup cost, and licensing?
Splunk ITSI is expensive compared to other tools.
What other advice do I have?
I would rate Splunk ITSI eight out of ten.
Other APM tools have limited features, so I recommend Splunk because it allows you to go beyond pre-built functionalities. With Splunk, you can create custom rules for application monitoring and tailor data visualization for enhanced visibility. Splunk's flexibility extends to designing personalized dashboards and metrics, providing a limitless monitoring experience.
Splunk ITSI requires maintenance for upgrades either annually or biennially.
Splunk is a comprehensive solution that offers log monitoring and the ITSI observability suite, eliminating the need for multiple tools and the associated complexities in maintenance and cross-team coordination. Splunk's flexibility allows for adopting features like APM as needed and seamlessly adding further monitoring capabilities in the future, such as user experience monitoring, synthetic monitoring, or additional log monitoring. This adaptability, along with Splunk's ability to correlate data across different monitoring areas, makes it an ideal unified platform for comprehensive monitoring and observability.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Sep 24, 2024
AIOPS Consultant at AIOPS Consultant
Good compatibility and end-to-end visibility with helpful support
Pros and Cons
- "Customers have noted the solution helps streamline incident management."
- "The license cost is expensive."
What is our primary use case?
We use the solution for intelligence. For example, if I have a website that sells games, it might have a lot of things like databases, servers, et cetera. I can see how many users have logged in, what purchases can be made, and so on. Splunk provides the logs to see all of the data for all actions on the site. I can see things on a technical level, like how CPUs are performing.
I can see things in real-time, and it's based on real data. This is the advantage Splunk has. There is complete visibility and I can monitor KPIs as well.
I can look at how my database looks, how my sales look, et cetera, and all metrics are in one place.
There's machine learning as well, including anomaly detection. You can look at and understand the data very easily. It helps us provide a complete understanding of the business so that I can understand anomalies better and watch the daily data. It gives me alerts that I can take a deeper dive into.
I have a ticketing system. If I have a Splunk power user, they can look at the data and create a ticket for future inspection. People can correlate and collaborate on the same ticket.
Basically, everything you need you can find on Splunk. You can also create custom actions.
We can do actions right on the Splunk UI.
What is most valuable?
The compatibility is good.
The end-to-end visibility is okay. The only thing that is lacking is the application monitoring. We struggled with one use case where payments were failing and they couldn't understand if it was the infrastructure or bandwidth. Recording individual transactions is not possible in Splunk out of the box. You have to write your own scripts, which is not as user-friendly.
The predictive analytics are pretty good. I've seen people using it. That said, I'd say the admin needs a deep understanding of the infrastructure. It has a tendency to create noise. If you have a noisy system, when there's an alert, people tend to miss issues.
Customers have noted the solution helps streamline incident management. At a single glance, there is a complete view of infrastructure. It's good for the customer on the technical side. Teams were able to map the availability of the system more accurately - up by 28%.
It's helped reduce alert noise. It can aggregate the alerts and just create an alert only when needed. From the UI, you can correlate the alerts using dynamic conditions (not just static ones).
We've been able to reduce the mean time to detect. It has a similar mean time to detect as Dynatrace. We've used it when there wasn't an existing system, and we would have had similar results with other tools in the market. It's helped with MTTR for sure. Before we implemented Splunk, the mean time was one hour or so. Once we implemented it, the alert notification was automatically sent to people, so it reduced the time to two to five minutes.
The mean time to resolve has been reduced thanks to Splunk.
What needs improvement?
If you are using Splunk ITSI and Splunk Enterprise Security, you have to run different searches. You cannot run both on the same server. That said, you can split them up however you want.
The license cost is expensive. When I want a premium application it's extra. I need to pay for this on top of my base license.
We'd like to see more use of artificial intelligence. There's no easy knowledge-base bot. It would help if they had a ChatGPT-like AI that could show them the knowledge base information they could use to address tickets.
For how long have I used the solution?
I've used Splunk as a product for about five years.
What do I think about the stability of the solution?
The solution is stable.
What do I think about the scalability of the solution?
The solution can scale. I'd rate it seven out of ten. There are some requirements on the backend in terms of scaling. If you want extra storage, it will cost more money. If you are adding a new server you will have to go and configure it and then you have to restart everything, so there may be downtime.
How are customer service and support?
I've contacted technical support. They were good in terms of experience. The cloud support is excellent.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
I did not previously use a different solution.
How was the initial setup?
You can install the solution on-premises or on the cloud. If you want to send the data to your own on-premises environment, you can do so.
I was involved in the initial deployment. The setup was very straightforward; however, the requirements gathering can be complex, as can gathering the KPIs and developing an understanding of requirements. You need someone who has a complete understanding and a holistic view of the environment.
How many people you need for the deployment depends on how big the infrastructure is, what you want to monitor, and the timeline you have.
The on-premises deployment requires maintenance as you have to monitor the server. The cloud requires less maintenance.
What about the implementation team?
We tend to implement the solution for our customers.
What's my experience with pricing, setup cost, and licensing?
The solution can be costly. You have to have a fixed license. It's very difficult for people to know beforehand how much they will be charged.
What other advice do I have?
We're Splunk partners.
For someone who already has an APM solution and is considering switching to ITSI, I'd advise them to look at the licensing and their budget and to consider where their APM is currently lacking. If you aren't getting the alerts you need or you can't see how your infrastructure looks, it might make sense to switch. They need to be aware, however, there will be an extra cost.
Secondly, if you can't see or fetch the logs in your application, for example, if you are on Dynatrace and it does not provide the log analysis you need, you can just go and write a query in Splunk. However, it depends on what your end customer needs as well. If they need good, flexible dashboards that you can add images to and customize the way you want, you may need something more robust, like Splunk. We were able to pull it off using Splunk ITSI as it gives you very easy-to-customize dashboards.
To someone who's considering a point monitoring system instead of ITSI, I'll say that, depending on your infrastructure, it might be a good idea. If you have less data, and you can manage with the manual alerts, you're fine. However, if you're wasting a lot of time with the alerts and get a lot of alert noise, that means you can be missing major alerts. For major infrastructure, it's a good idea to have ITSI.
You need a minimum of 14 days before seeing time to value. 14 days is required in order to be able to use the complete solution. That allows the system to get good at anomaly detection.
I'd rate the solution eight out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor. The reviewer's company has a business relationship with this vendor other than being a customer: Partner
Splunk Engineer at Prudent Technologies and Consulting, Inc.
Provides good visibility, reduces alert noise, and improves detection
Pros and Cons
- "The most valuable feature is event correlation, which ensures that only one ticket is generated per issue, eliminating duplicates and reducing noise from multiple alerts."
- "While integrating services and KPIs in ITSI is straightforward, I found it challenging to analyze them with the service analyzers; specifically, using the deep dive feature to pinpoint the exact source and time of an issue proved difficult."
What is our primary use case?
We used Splunk ITSI to monitor service health and key performance indicators, such as CPU, memory, and disk utilization, across various servers, with detection based on defined thresholds that triggered alerts. Splunk ITSI, integrated with ServiceNow, facilitated alert generation and management. Additionally, we leveraged ITSI for event analytics and created glass tables based on configuration items. We monitored specific KPIs and generated alerts via ServiceNow based on established thresholds to meet customer requirements.
Some clients have Splunk ITSI deployed in the cloud, and others are on-premises.
How has it helped my organization?
Using a client example, I'll explain the end-to-end visibility provided by Splunk ITSI. We have over a hundred clients in our environment. Once we onboard client data, such as cloud data, we subscribe to that cloud service and integrate the data into our Splunk environment. We then create data models and correlations integrated with the ITSI service. Within ITSI, we create correlation searches and schedule them to run regularly. Each time the Splunk schedule runs, it generates notable events and checks policies to determine if an event qualifies for a ticket. If it qualifies, an episode is created in ITSI, and a ticket is automatically generated in ServiceNow. This is the complete end-to-end process within Splunk ITSI.
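The flow described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Splunk ITSI's or ServiceNow's API: the class names, the severity scale, the `min_severity` policy, and the `open_ticket` callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NotableEvent:
    service: str
    kpi: str
    severity: int  # hypothetical scale: 1 (info) to 6 (critical)


@dataclass
class Episode:
    key: tuple
    events: list = field(default_factory=list)
    ticket_id: Optional[str] = None


def qualifies(event, min_severity=4):
    """Hypothetical policy check: only severe events qualify for a ticket."""
    return event.severity >= min_severity


def process(events, open_ticket):
    """Group qualifying events into episodes, opening one ticket per episode."""
    episodes = {}
    for event in events:
        if not qualifies(event):
            continue
        key = (event.service, event.kpi)
        episode = episodes.get(key)
        if episode is None:
            # First qualifying event for this issue: create the episode
            # and raise exactly one ticket for it.
            episode = Episode(key=key, ticket_id=open_ticket(key))
            episodes[key] = episode
        episode.events.append(event)
    return episodes
```

In this sketch, repeated events for the same service and KPI land in the existing episode, so the ticketing callback is called only once per issue.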
We use predictive analytics based on the threshold values to help prevent incidents before they occur.
It does not take long after deployment for our clients to realize the benefits of Splunk ITSI because it immediately reduces alert noise.
Both Splunk ITSI and Splunk Enterprise Security handle incident management, but Enterprise Security utilizes common data models for improved detection. ITSI employs an "episode review" concept to analyze incidents, examining their generation, root cause, trigger alert, and any alerting failures. This provides comprehensive observability of each episode. Similarly, when integrating Enterprise Security with customer systems, pre-built common data models generate alerts that require monitoring to determine their cause, priority, and severity.
By correlating events through event management, Splunk ITSI reduces our alert noise.
We can correlate information to receive only relevant alerts, allowing us to quickly respond to issues.
What is most valuable?
The most valuable feature is event correlation, which ensures that only one ticket is generated per issue, eliminating duplicates and reducing noise from multiple alerts. This significantly streamlines issue tracking and resolution. Additionally, the system analyzes service performance by identifying areas of impact and tracking key performance indicators. This deep-dive analysis allows for the precise identification of issues and facilitates data-driven improvements.
What needs improvement?
While integrating services and KPIs in ITSI is straightforward, I found it challenging to analyze them with the service analyzers; specifically, using the deep dive feature to pinpoint the exact source and time of an issue proved difficult. Although I'm proficient in service analytics management, the deep dive aspect requires further development.
For how long have I used the solution?
I have been using Splunk ITSI for two years.
What do I think about the stability of the solution?
Splunk ITSI is stable.
What do I think about the scalability of the solution?
Splunk ITSI is scalable. It is easy to scale on the cloud platform.
How are customer service and support?
The Splunk support team is adequate, but their response time is slow.
How would you rate customer service and support?
Positive
How was the initial setup?
The deployment is straightforward. We acquired a license and integrated it into our current Splunk environment.
What's my experience with pricing, setup cost, and licensing?
Splunk ITSI is a premium application and comes with a premium price tag.
What other advice do I have?
I would rate Splunk ITSI nine out of ten. Splunk ITSI is a valuable tool for IT and operations teams.
I recommend Splunk ITSI. It's an excellent tool for infrastructure monitoring, direct management, and service analytics, providing a clear, consolidated view of your IT environment.
Disclosure: My company has a business relationship with this vendor other than being a customer:
Last updated: Oct 7, 2024
Splunk Consultant at a financial services firm with 1,001-5,000 employees
An intelligent and scalable platform for operational excellence
Pros and Cons
- "The service analyzer view and automatic creation of incidents are valuable."
- "The biggest improvement area is making it open to developers. Right now, it is very closed. It can only be downloaded by people who have a license, not by everyone. If it is open to everybody, more people will use it."
What is our primary use case?
Splunk ITSI is a product for operations. I use it for detecting issues in the operations and generating alerts for them.
It is an intelligence platform for operational excellence.
How has it helped my organization?
The end-to-end visibility is a great thing about Splunk ITSI. It provides an end-to-end view to any user, from a normal engineer to a high-level manager.
We were able to realize the benefits of Splunk ITSI immediately.
Splunk ITSI helps to right-size resources to match the demand. It improves the quality. It is more organized. It can definitely help in rightsizing.
It helps to avoid duplicated alerts. If rightly implemented, it can reduce the duplication of alerts and provide more specific and accurate context.
Splunk ITSI has helped reduce incident volume. The reduction is implementation-dependent. If it is rightly implemented, we can reduce it to a very low percentage. Out of 100, we get only 10 alerts. If the context is correct, we only need one alert. This can be achieved with ITSI.
Splunk ITSI has helped reduce our alert noise, but I do not have the numbers because the initial implementation was not right. There were so many alerts, but when we corrected the implementation, it reduced them by a lot. I do not have the numbers, but thousands have become hundreds.
Splunk ITSI has helped reduce our mean time to detect (MTTD). It is at least five minutes. The mean time to resolve is dependent on the team. I do not have control over that because, in Splunk ITSI, we generate alerts for multiple teams, not just one team. It all depends on their SLAs.
Splunk ITSI helps us to automate alerting and automatically generate alerts or create incidents. It is not an automation tool to reduce mundane tasks.
Splunk ITSI helped us save costs by reducing downtime and manpower costs or avoiding SLA penalties.
What is most valuable?
The service analyzer view and automatic creation of incidents are valuable.
What needs improvement?
Better documentation would definitely help. Many people do not know about it, so better documentation and use case explanations would be helpful. There should be more YouTube videos about how to implement ITSI.
The biggest improvement area is making it open to developers. Right now, it is very closed. It can only be downloaded by people who have a license, not by everyone. If it is open to everybody, more people will use it.
For how long have I used the solution?
It has been quite a long time. It has been more than four or five years.
What do I think about the stability of the solution?
It is pretty stable. If we have the proper infrastructure, this tool is very stable. It does not crash.
What do I think about the scalability of the solution?
Its scalability is high. It can scale very well. You can increase the size of the cluster. You can increase the capacity vertically and horizontally. It is very scalable.
How are customer service and support?
They are good. They respond based on the SLAs. The quality of service depends on how informative you are when you provide the case details to them, but they have the ability to escalate it to higher levels and get help. They have the skills, but sometimes, the support is not in the UK. It sometimes comes from the US, so there may be time constraints when you set up a call. Otherwise, they are good.
How would you rate customer service and support?
Neutral
Which solution did I use previously and why did I switch?
I have used other solutions. In the old days, I used a BMC system. Splunk ITSI is a completely different type of alerting system.
The BMC solution is more monotonic. It does not have the intelligence that Splunk ITSI has to reduce the noise. It just picks up a metric and alerts based on that threshold, whereas, in ITSI, we have the control to reduce the number of alerts generated on the same threshold by adding some intelligence to it. It has the ability to do that intelligence part. That is why it is called ITSI.
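To make the contrast concrete, here is a minimal Python sketch. It is not how BMC or ITSI is implemented. It assumes one simple form of added "intelligence": alert only once when a metric stays above the threshold for several consecutive samples. The function names and the `min_consecutive` parameter are hypothetical.

```python
def naive_alerts(samples, threshold):
    """Plain threshold alerting: one alert for every sample above the threshold."""
    return [i for i, value in enumerate(samples) if value > threshold]


def sustained_alerts(samples, threshold, min_consecutive=3):
    """Alert once per sustained breach of at least min_consecutive samples."""
    alerts = []
    streak = 0
    for i, value in enumerate(samples):
        streak = streak + 1 if value > threshold else 0
        # Fire exactly when the streak reaches the required length,
        # so a long breach produces a single alert.
        if streak == min_consecutive:
            alerts.append(i)
    return alerts


cpu = [50, 95, 60, 92, 93, 96, 97, 40]
print(naive_alerts(cpu, 90))      # [1, 3, 4, 5, 6]
print(sustained_alerts(cpu, 90))  # [5]
```

On the same threshold, the first function raises five alerts and the second raises one, which is the kind of reduction described above.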
How was the initial setup?
We have both on-premises and cloud deployment models. Its deployment is difficult for a beginner user. You need a consultant or somebody experienced in Splunk ITSI to implement it properly. Splunk ITSI is a premium product. You need very good Splunk infrastructure initially to run this on top. To run it properly, you should have good knowledge. You should at least have Splunk Architect-level certification. Otherwise, you can implement it, but it will not work properly or as you expect.
It is mostly a clustered solution. It is not normally done on a single server. We need to build the entire cluster. The initial build probably can take two weeks. Configuring everything can take a long time. Six months can be considered a good time to make it run properly for enterprise usage.
It needs regular upgrades, backups, and time-to-time updates to the system configurations. It requires a dedicated team. Once it is properly set up, less than ten people can manage it.
What about the implementation team?
I am an ITSI consultant, so I am not a user. I set it up for customers.
The number of people required depends on how much data we need to bring in. If we have a lot of data and a variety of systems, more people are required. If we are just focusing on a singular system, one person can do the job.
In an enterprise environment, there are a multitude of systems and monitoring requirements. Usually, there is a team onboarding data and setting it up. 10-15 people are a good choice for a big enterprise, like a banking client.
What's my experience with pricing, setup cost, and licensing?
It is more of a premium product. I do not have much visibility into pricing because it is taken care of by high-level enterprise customers. I just ask for the license that I need and they negotiate. It all happens between Splunk and the company. I know that it is expensive, but I do not think there is another solution that can do similar things for that price.
What other advice do I have?
To someone who already has an IT alerting and incident management solution but is considering switching to Splunk ITSI, I would say that it will add value to their organization. It can reduce a lot of noise. I would suggest going for it, but it should be the right implementation. You should have knowledgeable people to implement it from the beginning.
It is not something that you just buy and switch on and it starts working. It needs a lot of configuration, done properly, to make it run well. That is an important part of Splunk ITSI. It is not just the product. The person who is implementing it should be very good. Only then can its value be seen. Otherwise, you have the application but may not get the right value out of it.
Overall, from my experience, I would rate Splunk ITSI an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Aug 19, 2024
Lead Solution Architect at an insurance company with 10,001+ employees
Correlates and aggregates all the information and improves resolution time
Pros and Cons
- "Splunk Episodes are valuable because they correlate and aggregate all the information, and you do not have one million events to look at and triage, so it is quite convenient."
- "It is pretty okay. I am not sure whether the current release has already moved to the new framework where instead of the glass tables, we can directly use the Dashboard Studio. It would be nice to have that integrated into the same framework."
What is our primary use case?
We have some business-oriented monitoring. The technical components are aggregated to business services up to a certain level. We could do a lot more, but this is what we are doing currently.
How has it helped my organization?
Splunk ITSI has improved our mean time to resolution. We can essentially notice things before somebody calls. We have better customer satisfaction. It is hard to say how much time it has saved, but if we do not use it, it will take quite a while until we notice something is down or until we find out what exactly is the issue.
We monitor multiple cloud environments with it. It is no more difficult than anything else.
Splunk ITSI has end-to-end visibility into our cloud-native environment. We also have SignalFx. We are an early adopter of SignalFx in Switzerland. It is integrated, and we have been beta-testing the integration. It is quite easy and workable. It is quite nice.
It provides business resilience by empowering staff. That is the core feature. You can tailor the solution and give the exact information in a certain context. This correlation and this presentation help the business, the users, or the person responsible for the application or the stack. That is the interesting part.
What is most valuable?
Splunk Episodes are valuable because they correlate and aggregate all the information, and you do not have one million events to look at and triage, so it is quite convenient.
What needs improvement?
The solution is okay. I am not sure whether the current release has already moved to the new framework where instead of the glass tables, we can directly use the Dashboard Studio. It would be nice to have that integrated into the same framework.
For how long have I used the solution?
We have been using Splunk ITSI for more than four years.
What do I think about the stability of the solution?
Its stability is excellent.
What do I think about the scalability of the solution?
Its scalability is excellent.
Which solution did I use previously and why did I switch?
They used different tools for different parts. For the service aggregation part, they used Netuitive. They still use Dynatrace for some of the things, but they have mostly moved to SignalFx. Dashboarding was one area for which they never had anything.
The guys with the container-based workload absolutely demanded SignalFx. That had the repercussions of finally moving to Splunk ITSI.
How was the initial setup?
I was not involved in its deployment.
What was our ROI?
I am not sure about the ROI of Splunk ITSI, but we have definitely got an ROI from Splunk. We have been using Splunk since version 3 and doing lots of things. We have hundreds of use cases. If you ask anybody in the business, they would say that it is essential and critical.
Splunk has improved our business resilience in combination with Splunk Enterprise. It is widely adopted by our developers, and we also have a fairly large number of dashboards where core services, such as managed file transfer, are transparent for the users that own a system that is connected as a sending or receiving device so that they can self-service and check if everything is working. There is also alerting on that. So, there are multitudes of use cases. It is more of a framework; it is more of a platform. There is wide adoption of it. 100% of the users in the company have access to it. Not everybody uses it, but everybody has access to it.
What's my experience with pricing, setup cost, and licensing?
It is interesting. I am not involved that much lately, but if I recall correctly, you license primarily on the volume of data that you are using in Splunk ITSI, but there is no way Splunk can ever check if that is true, so that is interesting. We are not doing it, but someone can pretend to just use 10%, and it would be super cheap. It is tricky, but it is more tricky for Splunk than for us.
Which other solutions did I evaluate?
There were quite a few solutions that we looked at. We were beta testing Splunk ITSI, but unfortunately, the adoption was not possible back then. They had a few market-leading products in the procurement. Due to SignalFx, we finally chose Splunk ITSI.
What other advice do I have?
I would rate Splunk ITSI an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Technical Associate at Positka
It gives our customer complete visibility from one dashboard, helping them to develop a proactive response
Pros and Cons
- "We save substantial time on monitoring tasks because we don't have to search for what we need. Everything is packed, so you can drill down to the end values by just doing the kit. We don't spend a lot of time on this. Splunk ITSI is easy to use and not time-consuming."
- "We're using predictive analytics, and there are three or four algorithms. It would be helpful if this process were more standardized and scalable."
What is our primary use case?
We use Splunk ITSI for IT monitoring. It helps us monitor all our servers for things like CPU utilization and other performance metrics. We can integrate complex architectures with the service and connect the core to multiple data sources. Our customers' environments vary. In the last project, they had around eight departments and 75 employees, so I needed a web server for each department.
How has it helped my organization?
Before we shifted our customers to Splunk ITSI, they had issues getting insights in some circumstances. Now they have complete visibility from one dashboard. It helps them monitor and develop a proactive response to address the problems before they cause trouble.
One issue we faced before implementing Splunk was that our customers couldn't predict how long it would take to reach their storage limit. Now we can categorize issues according to severity.
Splunk ITSI has enabled us to streamline incident management by adopting aggregation policies. Instead of getting rid of incidents, we place them into several groups and remove the duplicates so that we can draw insights from previous incidents.
We've been able to reduce alert noise using policies. By grouping the policies, we're able to avoid redundant alerts. When we used the other solution, we would sometimes get repeated warnings, but we eliminated that by implementing aggregate policies.
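The grouping idea can be sketched in Python as follows. This is an illustration under stated assumptions, not ITSI's aggregation policy engine: it assumes alerts are `(timestamp_seconds, host, message)` tuples and that repeats of the same host and message within a hypothetical `window_seconds` belong to the same group.

```python
def group_alerts(alerts, window_seconds=300):
    """Collapse repeated alerts for the same host and message into groups.

    alerts: iterable of (timestamp_seconds, host, message) tuples.
    Returns a list of group dicts in order of first occurrence.
    """
    groups = []
    latest = {}  # (host, message) -> most recent group for that key
    for ts, host, message in sorted(alerts):
        key = (host, message)
        group = latest.get(key)
        if group is not None and ts - group["last_seen"] <= window_seconds:
            # Same issue seen again within the window: count it, do not re-alert.
            group["count"] += 1
            group["last_seen"] = ts
        else:
            group = {
                "host": host,
                "message": message,
                "first_seen": ts,
                "last_seen": ts,
                "count": 1,
            }
            latest[key] = group
            groups.append(group)
    return groups
```

Each group would surface as a single alert with a repeat count, rather than one warning per occurrence.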
From ITSI, we can see the metrics and drill down. We can build a tool to check the metrics based on severity. Instead of taking every event's logs, we are directly getting the root cause of the issue. From there, we can see that it obviously reduces the rest of the time.
The solution has reduced our mean time to resolve issues. Before implementing it, we typically needed around six to eight hours to close a ticket. When we had an alert, we had to review all the native logs to find the correct server. With ITSI, I can see a score that tells me about potential issues before they arise. I can see if there is a critical problem with a server or application based on the data flows and resolve it.
What is most valuable?
I like ITSI's service analyzer. We can integrate and group the service, then create multiple KPIs in the service analyzer we can monitor. We can use multiple connectors to get end-to-end network visibility. Many organizations prefer appliances, and we can completely integrate the appliance with the source to gain complex insights throughout the network.
We are getting real-time insights from the service and the vendor and doing some projects using security analytics to check the path. We can monitor the behavior of an appliance or the organization and how they are using it. For example, you might see high usage on specific days and low usage on weekends. If we can identify patterns from this, it can help us predict the future.
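As a rough illustration of using such patterns, the sketch below averages historical usage per day of the week and uses that average as a naive forecast. It is not one of ITSI's predictive algorithms. The function names and the sample figures are made up for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def weekday_baseline(history):
    """history: iterable of (date, usage). Returns {weekday index: mean usage}."""
    buckets = defaultdict(list)
    for day, usage in history:
        buckets[day.weekday()].append(usage)  # Monday is 0, Sunday is 6
    return {weekday: mean(values) for weekday, values in buckets.items()}


def forecast(baseline, day):
    """Naive forecast: the historical mean for that weekday, or None if unseen."""
    return baseline.get(day.weekday())


# Made-up sample data: two Mondays with high usage, two Saturdays with low usage.
history = [
    (date(2024, 9, 30), 80),
    (date(2024, 10, 7), 90),
    (date(2024, 10, 5), 20),
    (date(2024, 10, 12), 30),
]
baseline = weekday_baseline(history)
print(forecast(baseline, date(2024, 10, 14)))  # next Monday
print(forecast(baseline, date(2024, 10, 19)))  # next Saturday
```

A real deployment would need far more history and a proper model. The point is only that a weekday-versus-weekend pattern can be turned into an expected value to compare against.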
What needs improvement?
We're using predictive analytics, and there are three or four algorithms. It would be helpful if this process were more standardized and scalable.
For how long have I used the solution?
I have used Splunk ITSI for nearly a year.
What do I think about the stability of the solution?
Splunk ITSI is stable. The latest version is more stable than the previous one.
What do I think about the scalability of the solution?
Splunk ITSI is scalable. We can compare multiple APIs and services, so everything is organized and manageable. We can drill down to the bottom of all the logs on events.
How are customer service and support?
I rate Splunk technical support a nine out of ten. If we work with cloud architecture, we usually need some help from Splunk, so we often need to contact support and ask for changes. We prepare the case, have a conversation with them, and get it done.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We were using service providers, but we had a log management solution and some other open source tools. We relied on custom builds of open source solutions.
How was the initial setup?
Splunk ITSI can be deployed in the cloud or on-prem depending on the customer's requirements. For example, if someone is running this in a closed environment, we can go with the on-prem deployment. Otherwise, customers will mostly go for a cloud deployment. We use AWS.
When I started the training, it seemed somewhat complicated, but once you learn a bit, it becomes straightforward. It isn't terribly complex. The deployment strategy depends on the scope of the project, such as whether you have a cluster or a distributed environment.
You can deploy it with a team of three or four. Someone needs to take care of the prerequisites like clustering and another person might take care of the integration. Another will configure the dashboards. The process takes about five days.
What was our ROI?
We save substantial time on monitoring tasks because we don't have to search for what we need. Everything is packed, so you can drill down to the end values by just doing the kit. We don't spend a lot of time on this. Splunk ITSI is easy to use and not time-consuming.
The time to value is fast. The implementation takes time, but the customer can see value immediately once everything is configured, permissions are set, and we're ready to move.
What other advice do I have?
I rate Splunk ITSI a 10 out of 10. We need our website up 24/7, or we'll lose business. Every minute that it's down we lose money. I would recommend this to anyone who runs a business online and needs to monitor their infrastructure.
If you're considering a point monitoring system instead of ITSI, I would say it depends on the information you are using. Generally, Splunk ITSI is the advanced option that gives you multiple features together with service intelligence and analytics. You can make wonderful dashboards. Comparatively, this is enough to monitor the company's infrastructure.
In ITSI, we can also integrate application and database logs, so the customer can get some insight that helps predict when the database might go down. ITSI can be helpful to manage the customer infrastructure and minimize the impact on their business.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Reseller
Buyer's Guide
Download our free Splunk ITSI (IT Service Intelligence) Report and get advice and tips from experienced pros
sharing their opinions.
Updated: October 2024