Tech Lead at a tech vendor with 1,001-5,000 employees
MSP
Top 20
Provides a unified view of alerts and supports heat maps and glass tables for visualization and monitoring
Pros and Cons
  • "I find the episode review, glass tables, and correlation search features very useful."
  • "Microservices is the only area where Splunk ITSI can be improved. When things come from one EC2 instance to another, there's a lack of exposure to microservices, so we can't know what's happening. Apart from that, it's doing pretty well."

What is our primary use case?

There are multiple use cases, which include heat maps, glass tables, and predictive analysis.

The first one is mainly related to heat maps. For example, if you want to monitor the health of a server, you can prepare heat maps for that. When you set up any kind of alert, it can get missed because people are too busy to check their emails. With heat maps, the color changes automatically. A cron job runs behind the scenes, so you don't need to run the checks manually.

You can also set up a glass table in ITSI for the architecture. For example, a setup like Amazon would have web services, databases, queues, and other things. For the purchase and other things, it has to connect to the external world, so you need to place the complete architecture over there, and you can assign the threshold value. If there is an issue with any of the points, for example, there is an issue with the connectivity of the database, the heat maps would change in color, which helps you to easily identify that there is an issue.
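
The threshold-to-color idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how ITSI implements it: the threshold values, the component names, and the functions `status_color` and `color_architecture` are all hypothetical.

```python
from typing import Dict

# Hypothetical thresholds for illustration only; real values depend on the service.
# Ordered from most to least severe so the first match wins.
THRESHOLDS = [(90.0, "red"), (70.0, "yellow")]


def status_color(value: float) -> str:
    """Return the color for a metric value, checking the highest threshold first."""
    for limit, color in THRESHOLDS:
        if value >= limit:
            return color
    return "green"


def color_architecture(metrics: Dict[str, float]) -> Dict[str, str]:
    """Map each component of the architecture to its current color."""
    return {component: status_color(value) for component, value in metrics.items()}


# Made-up utilization figures for three components of the architecture.
tiles = color_architecture({"web": 35.0, "database": 95.0, "queue": 72.5})
print(tiles)
```

In this sketch, the database tile turns red as soon as its value crosses the assigned threshold, which is the visual cue described above.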

It also has a concept called predictive analysis. For example, your WhatsApp chat backup happens every 24 hours or 7 hours, but you cannot predict how much bandwidth it's going to use during the backup. It might even use 100% of the bandwidth. You cannot set a proper threshold. In such cases, you can use predictive analysis. It'll analyze the data patterns, and based on the data pattern, it predicts if everything is good or if something is bad. It can predict if something is going to fail.
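
To show why a pattern-based check helps where a fixed threshold fails, here is a minimal Python sketch. It is an assumption-laden illustration, not ITSI's actual algorithm: it learns a mean and standard deviation per hour of day from past samples and flags values that fall far outside that range. The function names, the sample data, and the factor `k` are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple

Baseline = Dict[int, Tuple[float, float]]


def hourly_baseline(history: List[Tuple[int, float]]) -> Baseline:
    """Group past (hour_of_day, value) samples and return (mean, stdev) per hour."""
    by_hour: Dict[int, List[float]] = {}
    for hour, value in history:
        by_hour.setdefault(hour, []).append(value)
    # stdev needs at least two samples, so hours with less data are skipped.
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}


def is_anomalous(hour: int, value: float, baseline: Baseline, k: float = 3.0) -> bool:
    """Flag a value that deviates more than k standard deviations from its hour's mean."""
    if hour not in baseline:
        return False  # no history for this hour, so nothing to compare against
    avg, spread = baseline[hour]
    return abs(value - avg) > k * spread


# Made-up bandwidth usage (%): a backup runs at 02:00, the afternoon is quiet.
history = [(2, 94.0), (2, 96.0), (2, 95.0), (14, 20.0), (14, 22.0), (14, 21.0)]
baseline = hourly_baseline(history)

backup_window = is_anomalous(2, 96.5, baseline)   # high usage, but expected at 02:00
afternoon_spike = is_anomalous(14, 95.0, baseline)  # same level is unusual at 14:00
print(backup_window, afternoon_spike)
```

The same bandwidth level is treated as normal during the backup window and as a problem in the afternoon, which a single static threshold cannot express.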

You can have an integration with the ticketing tools. For example, if something happens on any server or PC and you've directly integrated the tickets from Splunk to ServiceNow, it's automatically going to create a ticket in ServiceNow.

There's also a concept of episode review wherein it groups the alerts so that there's no ticket spam in ServiceNow. For example, if you are monitoring a server and it's down, there might be 10 to 20 alerts, which would create 10 or 20 separate tickets and spam your ticketing system. In such cases, you can use the episode review feature. It will merge all those tickets into one and include all the details in that.
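
The grouping idea can be sketched as follows. This is a simplified Python illustration under stated assumptions, not ITSI's aggregation logic or the ServiceNow integration: alerts are grouped by a hypothetical `host` field, and each group becomes one ticket that carries all the details.

```python
from collections import defaultdict
from typing import Dict, List


def group_into_episodes(alerts: List[dict]) -> Dict[str, List[dict]]:
    """Group alerts that share the same host into one episode."""
    episodes: Dict[str, List[dict]] = defaultdict(list)
    for alert in alerts:
        episodes[alert["host"]].append(alert)
    return dict(episodes)


def build_tickets(episodes: Dict[str, List[dict]]) -> List[dict]:
    """Create one ticket per episode, including the details of every alert in it."""
    return [
        {
            "host": host,
            "alert_count": len(items),
            "details": [item["message"] for item in items],
        }
        for host, items in episodes.items()
    ]


# Made-up alerts: one server is down and raises several alerts at once.
alerts = [
    {"host": "srv01", "message": "ping failed"},
    {"host": "srv01", "message": "disk check timed out"},
    {"host": "srv01", "message": "service unreachable"},
    {"host": "srv02", "message": "high CPU"},
]

tickets = build_tickets(group_into_episodes(alerts))
for ticket in tickets:
    print(ticket)
```

Four alerts produce two tickets here instead of four, and the ticket for the failing server still lists every alert that fired.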

How has it helped my organization?

Splunk ITSI allowed us to monitor the health of servers. We can also completely monitor an application and identify data patterns. Automation of ticketing tools can also be done with this. We can also do log monitoring with Splunk ITSI.

It's also helpful for developers. When they create an application, if there is an issue in their code, based on the output data, a request is automatically triggered to the engineering team stating that there is an issue with the code.

The visibility into an application is very good if you configure everything properly. You first have to analyze the application by using any of the monitoring tools such as Elastic, Splunk, etc. You have to analyze the application in and out, and afterward, you have to place the monitors in particular places for end-to-end visibility. For example, in the case of a home security system, to completely secure the home, you have to place the devices in a proper place. Until and unless you place the devices in a proper place, you cannot say that it's completely secured. If you are not keeping the cameras at the main entrance and the windows, or you haven't placed them properly, you can't say that the home is properly secured.

Splunk ITSI is very good for predictive analytics for preventing incidents before they occur. For everything, there are patterns, and based on the algorithm, you are allowing the machine to analyze the data and predict whether the data patterns are coming in a proper way or not. Splunk analyzes the data patterns based on the historical information that we give it. After analyzing the historical information, it creates triggers. If the data that we are feeding into the machine is incorrect, it's not going to work the same way.

There's the accuracy of alerts. In Splunk, the data is almost in real-time, so we get tickets in real-time. If there's a failure, we can roll over to the backup applications immediately. It saved about a million euros for one of our clients. They were having an issue with the Symantec antivirus that blocked the complete Citrix environment, so the workers were not able to sign in and access the application, which led to an outage. Within a matter of minutes, Splunk triggered a ticket, and they identified that they were having an issue with this particular antivirus, and they blocked it.

Splunk ITSI has helped streamline our incident management. There is efficiency in terms of clubbing the tickets and sending tickets with meaningful information, so mainly with the alerting system, you can configure as much information as you want using the Splunk monitoring tools. You can send some links in the ticket, or you can send a separate set of guidelines for the engineers on what has to be done. The clubbing of tickets has also helped a lot to avoid spamming.

Splunk ITSI has reduced our mean time to detect. Based on my experience and the feedback from others who are using it, it has saved a lot of time. The time reduction is significant when compared to other tools in the market.

It has reduced our mean time to resolve. Glass tables have been very helpful. With the help of Splunk ITSI, you can place the heat maps and services in place based on the application architecture to easily identify where the issue is coming from.

What is most valuable?

I find the episode review, glass tables, and correlation search features very useful.

What needs improvement?

Microservices is the only area where Splunk ITSI can be improved. When things come from one EC2 instance to another, there's a lack of exposure to microservices, so we can't know what's happening. Apart from that, it's doing pretty well.

Buyer's Guide
Splunk ITSI (IT Service Intelligence)
October 2024
Learn what your peers think about Splunk ITSI (IT Service Intelligence). Get advice and tips from experienced pros sharing their opinions. Updated: October 2024.
824,053 professionals have used our research since 2012.

For how long have I used the solution?

I've been using Splunk ITSI for five or six years.

What do I think about the stability of the solution?

I'd rate it a nine out of ten in terms of stability.

What do I think about the scalability of the solution?

I'd rate it a nine out of ten in terms of scalability.

How are customer service and support?

It isn't 100% satisfactory for all the cases. About 80% of the time, they are good, and about 20% of the time, they aren't as good. They can be very slow. We also had an incident where we asked them to upgrade to a version, but in that latest update, Splunk had removed some concepts because of price issues. As a result of removing a particular module, our complete environment failed. It took us a day to roll back the version and go back to normal. Overall, I'd rate them a seven out of ten.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I used VMware vSphere and a CA Technologies tool. We switched to ITSI because they offered very little optimization. There is also a significant difference in data parsing. We also have real-time data.

How was the initial setup?

At the beginning of my career, I found it to be complex because you need to know a lot of areas, such as network and firewall rules, routing methodologies, and the cluster concept. I kept on learning along with my teammates, and it's pretty good now.

What about the implementation team?

In the beginning, my teammates helped me, but now I don't need any help. Depending on the load and the environment, I can build things.

What was our ROI?

One of our clients was paying two hundred thousand to three hundred thousand dollars for a report based on the complete data, whereas they could also get the data by running a couple of queries from the database. After the implementation of Splunk, we used something called DB Connect. It was a small tweak, and after that, the price was reduced to a hundred dollars or eighty dollars per annum. All they are doing now is creating or running SQL queries, getting the data back in Splunk, and based on that, triggering and sending a report. That's it. It was all about preparing proper monitoring. The data was already available. We prepared the alerts. Along with the alerts, we also prepared dashboards for the users to visually review the historical information for the past one or two years. They can even see the report month-wise. Two hundred thousand dollars to less than a hundred dollars is incomparable.

What's my experience with pricing, setup cost, and licensing?

Its pricing has been changed as per the market. You get a good support service with it as well. They have 24/7 customer support. There is a portal, and if you are having issues, they are available in order to resolve them. So, its pricing isn't too much.

What other advice do I have?

I'd advise learning the tool properly, understanding its capabilities, and utilizing it efficiently. One of our clients was paying hundreds of dollars towards the license, but they were utilizing it only for server monitoring. 

To someone who already has an APM solution but is considering switching to Splunk ITSI, I'd say that switching to ITSI is going to help them a little bit more. The grouping of tickets for users can be easily planned. It's not rocket science. It's easier than in other tools, where you need to create a lot of configuration for that. The configuration has been segregated, which makes it easy for the applications team to set up their own monitoring and group alerts to reduce the number of tickets generated. You also have predictive analysis along with heat maps and glass tables, which aren't available in other APM tools in the market right now.

Overall, I'd rate Splunk ITSI an eight out of ten.

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
PeerSpot user
Benjamin Agbanowe - PeerSpot reviewer
Splunk ENGINEER at a transportation company with 201-500 employees
Real User
Offers enhanced visibility, reduces costs, and minimizes the frequency of incidents
Pros and Cons
  • "Splunk ITSI offers a valuable visualization tree that allows us to map and analyze dependencies and co-dependency within our environment."
  • "ITSI currently lacks the capability for automated response, mitigation, and remediation."

What is our primary use case?

Splunk ITSI is a service intelligence platform that monitors services, availability, endpoints, and interactions within an environment. My experience with ITSI focuses on web application APIs. I installed and configured it for a telecommunications company to monitor web application API services, troubleshoot downtimes, and mitigate failures. ITSI offers a comprehensive view of the environment, enabling top-to-bottom visibility into services, endpoints, and performance. It provides correlation analysis, deep dives, and episode reviews, leveraging AI and machine learning algorithms to detect signals, predict issues, and prepare engineers for potential problems.

How has it helped my organization?

Splunk ITSI's dynamic and highly beneficial end-to-end visibility allows us to gain comprehensive and clear visibility once we configure our settings, services, and entities.

Splunk ITSI's machine learning and AI capabilities are powerful tools that help prevent incidents before they occur. As an engineer, I appreciate the ability to visualize potential future scenarios within my environment. This predictive forecasting feature provides valuable insights into our environment and services.

Due to its complex functionalities, Splunk ITSI requires significant learning. Proper training is essential to understand how these features operate effectively. While the benefits were not immediate, they became apparent over time as we configured, implemented, and utilized the various functionalities. It took several months before the full value of Splunk ITSI was realized.

For incident management and incident response, ITSI assists us by enabling us to create numerous knowledge objects as Splunk users. Whenever an issue arises, these objects can be centered around our services or entities, such as reminders, emails, or notables. Consequently, ITSI significantly aids our management and incident response efforts.

Splunk ITSI effectively reduces the volume of incidents by providing predictive capabilities, enhancing environmental visibility, and facilitating efficient troubleshooting. This deep-dive approach minimizes the occurrence of noisy alerts and consequently lowers the overall incident rate.

It helps reduce alert noise by allowing users to review and group notables. Through the episode review functionality, analysts can examine fired alerts, assign them to specific investigators or analysts, and group them to minimize the occurrence of noisy alerts.

Splunk ITSI has been instrumental in reducing the mean time to detect. While I have other tools as an engineer, ITSI, in conjunction with Splunk SOAR, offers preconfigured automation and quick responses that can further enhance our MTTD. ITSI provides the necessary visibility, and when integrated with SOAR, it aids in detecting and resolving issues more efficiently. These tools work seamlessly together, streamlining our incident response process and improving operational efficiency. Combined, our MTTD is under 30 seconds.

Splunk ITSI has helped reduce the mean time to resolve the issue because we can detect the incidents faster.

It is a valuable tool for cost savings. In a recent project involving web application APIs, ITSI's top-to-bottom visibility and machine learning capabilities enabled us to predict and prevent downtime, reducing losses significantly. By integrating ITSI with an automated tool like SOAR, we implemented automated responses that quickly resolved issues and minimized disruptions. This resulted in substantial savings, estimated to be between five and ten million dollars. Before ITSI, downtime in the web payment application APIs was frequent, leading to significant financial losses. ITSI's implementation has eliminated this issue.

What is most valuable?

Splunk ITSI offers a valuable visualization tree that allows us to map and analyze dependencies and co-dependency within our environment. We can quickly identify errors, failures, and cascading impacts from specific branches by inputting our services and entities into this diagram. I have found this feature particularly useful for clearly understanding my environment's dynamics. Additionally, ITSI's deep dive functionality enables detailed examination of service trends over time, providing valuable insights. Furthermore, its AI and machine learning capabilities, especially beneficial for users with relevant knowledge, offer powerful predictive and correlation analysis tools. Overall, ITSI's combination of visualization, deep dive, and AI and ML features makes it an indispensable tool for observability and understanding complex environments.

What needs improvement?

ITSI currently lacks the capability for automated response, mitigation, and remediation. To achieve this, it must be integrated with third-party applications. Adding these features to ITSI would significantly enhance its value. For example, the ability to define specific conditions and triggers for automated responses to alarms or incidents would enable proactive mitigation and detection. Incorporating automated response and detection functionalities into Splunk ITSI would make it a powerful tool for incident management.

For how long have I used the solution?

I have been using Splunk ITSI for seven years.

What do I think about the stability of the solution?

Splunk, as a platform and software, typically operates smoothly without significant lag or crashes. When such issues arise, they are often attributed to insufficient memory or hard drive space allocated for the Splunk installation. These factors are primarily dependent on the project owners and company's available resources and hardware capabilities. However, it's important to note that the Splunk platform itself rarely encounters stability problems.

What do I think about the scalability of the solution?

Splunk ITSI assists in optimizing resource allocation to align with demand. We can effectively manage our infrastructure by accurately predicting resource requirements based on factors such as the environment, project, and specific operations within our facility. Splunk ITSI's machine learning capabilities can also contribute to this predictive analysis or forecasting, further enhancing our ability to optimize resource utilization.

How are customer service and support?

The technical support responded quickly and provided high-quality assistance. They paid close attention to our issue, conducted a remote diagnosis of our environment, and clearly explained the problem and recommended solutions. Their service was exceptional.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial deployment of Splunk ITSI is straightforward. Assuming all other configurations are in place, a full deployment can be completed in approximately 30 minutes. The exact duration depends on the complexity of the environment, including the number of indexers, search heads, and overall workload. For a single installation on a standalone computer with minimal infrastructure and support requirements, the deployment can be completed in just a few seconds.

The number of Splunk ITSI consultants required for a deployment depends on the project's size, architecture, and specific monitoring needs. A small, single-deployment project may only need one consultant. However, larger projects involving clusters of indexers or searchers, or those requiring constant monitoring, may necessitate more consultants. Such complex deployments might require two or three consultants to manage the entire environment effectively.

What other advice do I have?

I would rate Splunk ITSI eight out of ten.

To anyone considering switching to Splunk, I highly recommend it. Splunk offers a wide range of applications, making it a versatile tool for various IT environments. Beyond ITSI, Splunk provides numerous tools and platforms that offer comprehensive insights into IT operations, security, and more. Whether dealing with payments, web application APIs, or any aspect of IT, Splunk can help. Splunk empowers you to gather, search, analyze, and visualize data to create knowledge objects and set endpoints. It enables you to secure, analyze, and query your IT environments, providing valuable insights. Splunk's powerful features, including AI and machine learning algorithms, help you detect issues, streamline alerts, and improve overall operations. Splunk's risk-based alerting and ITSI security features ensure data protection and compliance. It helps safeguard your data in transit, storage, and indexing, providing visibility into access and potential leaks. For compliance, vulnerability, and risk management, Splunk is a valuable asset. I strongly recommend installing Splunk for its ability to enhance IT operations, improve visibility, and ensure security. If observability is a priority, I also encourage exploring Splunk ITSI.

Splunk ITSI is available both in the cloud and on-premises.

For new users, consider hiring a Splunk consultant to provide initial guidance and training. The consultant can demonstrate key features, share best practices, and help you get started. Secondly, familiarize yourself with Splunk's extensive documentation, which is a valuable resource for learning and troubleshooting. It's essential for anyone involved in managing or using Splunk to stay updated on the latest information. Finally, having a consultant work directly with your team can accelerate the learning process. They can provide tailored training, assist with implementation, and ensure that your users are equipped to effectively utilize Splunk's capabilities.

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Nagendra Nekkala - PeerSpot reviewer
Senior Manager ICT & at Bangalore International Airport Limited
Real User
Top 5
Leaderboard
Empowers organizations to efficiently monitor, analyze, and optimize complex IT environments
Pros and Cons
  • "The most valuable aspect lies in its utilization of predictive analytics to anticipate and prevent incidents within a window of twenty to thirty minutes."
  • "It would be advantageous to enhance the dashboard by incorporating sections for monitoring, service health, and a filter for the KPIs."

What is our primary use case?

It has enabled effective monitoring, allowing for a comprehensive view of the growing complexity within the IT infrastructure.

How has it helped my organization?

The enhancement to our organization stems from its ability to consistently run rules, actively identifying significant events. This involves an ongoing process of aggregating and configuring notable events into a coherent resource. Additionally, the container version automates website functionalities, including tasks like email reception, providing a heightened level of control.

It has proven highly effective in real-time monitoring of service assistance and KPIs. There has been a noticeable enhancement in automated event clustering. Additionally, the platform facilitates comprehensive analysis for proactive incident prevention.

The end-to-end visibility provided into our network environment is a potent tool for real-time monitoring. It significantly contributes to the monitoring and analysis of complex multi-cloud IT solutions, playing a pivotal role in ensuring efficiency.

Leveraging predictive analytics to proactively prevent incidents before they manifest empowers operations to establish effective management and automation of information related to business processes.

It aids in minimizing alert noise, proving highly effective in incident management. Furthermore, it facilitates root cause analysis.

What is most valuable?

The most valuable aspect lies in its utilization of predictive analytics to anticipate and prevent incidents within a window of twenty to thirty minutes. It promptly raises a red flag, signaling an effective early warning system.

The resilience it provides is invaluable. It ensures continuous application of rules, specifically for identifying notable events, and utilizes revision policies to configure hardware solutions into edge servers. This is essential for my operations to seamlessly proceed.

What needs improvement?

It would be advantageous to enhance the dashboard by incorporating sections for monitoring, service health, and a filter for the KPIs.

For how long have I used the solution?

I have been using it for one year.

What do I think about the scalability of the solution?

It provides good scalability. Approximately, a hundred users use it effectively.

How are customer service and support?

I would rate the customer service and support eight out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

The installation involves developing a strategy to comprehend the essential services for proper monitoring. Additionally, it entails determining the specific type of intelligent alerts, clusters, and dashboards needed for effective planning. It was done in-house by one individual.

What was our ROI?

The implementation of this solution quickly demonstrated its value.
It resulted in a time reduction of six hours through its implementation.

It contributed to a six-hour reduction in the mean time to detect incidents.

It assisted in decreasing the mean time to resolve by four hours.

What other advice do I have?

Choosing IT Service Intelligence (ITSI) over other vendors is a superior option now, as it operates on a data platform capable of efficiently collecting and managing large volumes of machine-generated data. It would greatly support the utilization of proper predictive analytics due to the capability to preemptively prevent incidents ten to twenty minutes in advance. Overall, I would rate it eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Splunk admin/developer at Wipro Limited
Real User
Top 20
Reasonably priced with good monitoring and predictive analytics
Pros and Cons
  • "We can automate routine tasks. We're able to create alerts, reports, scheduled searches, et cetera. It's helping us to save time."
  • "When we check the service analyzer, and we have custom inputs, there are issues."

What is our primary use case?

We are using the solution for correlation searches. We've integrated Splunk with ServiceNow. We're creating aggregation policies to trigger actions in ServiceNow. We use it with the ServiceNow add-on. When something happens in ServiceNow, it's correlated to ITSI as well. 

How has it helped my organization?

We can check to see if dependent services are aligned. The service analyzer allows us to see the health of the services. 

It's been very good for noise reduction. We have alerts that trigger visually and it helps us prioritize. We can create performance-related dashboards so teams will have a clear overview according to their unique requirements. 

What is most valuable?

The infrastructure monitoring is very useful. In our scenario, we can see the performance of logs across parameters like memory or security. We can analyze the data. We can create our own logic and alerts to send to the correlated teams to take care of incidents. 

The end-to-end visibility is very good. With the service analyzer, we're able to see if something goes down. It's inspecting the health of services. It's color-coded, so we can check to see if there are any serious issues. We can do deep dives if something is red. 

We use the predictive analytics on offer. We have some use cases in which we create forecasts around CPU and memory-related alerts. We can use it to predict costs based on the past 30 or 40 days. We're also trying to use this for anomaly detection. We can make good predictions on the basis of data and trends. As long as we have past data, we can use it to build some predictions for the future. We can use this to create and send predictive reports to our teams to help them take pre-emptive action.
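
As a rough illustration of forecasting from past data, the following Python sketch fits a straight line to 30 days of CPU readings and extends it a week ahead. It is an assumption for illustration only and not the method Splunk uses: the least-squares trend, the function names, and the sample data are all hypothetical.

```python
from typing import List, Tuple


def fit_line(values: List[float]) -> Tuple[float, float]:
    """Fit y = intercept + slope * day by ordinary least squares (day = 0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast(values: List[float], days_ahead: int) -> float:
    """Extend the fitted trend days_ahead days past the last observation."""
    slope, intercept = fit_line(values)
    return intercept + slope * (len(values) - 1 + days_ahead)


# Made-up history: average CPU usage (%) rising by half a point per day for 30 days.
cpu_history = [40.0 + 0.5 * day for day in range(30)]
predicted = forecast(cpu_history, 7)
print(f"Forecast CPU in 7 days: {predicted:.1f}%")
```

A value like this could feed a predictive report so that a team can act before the resource runs out. Real data is noisier and often seasonal, so a straight line is only a starting point.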

It's helped us to right-size resources to match demand. 

The solution has helped us streamline our incident management. We've been able to increase efficiencies through automation.

We've been able to reduce incident volume. If a host is generating frequent tickets, for example, we're able to see it and work on it directly to help us reduce incident counts. 

We've been able to effectively reduce alert noise. We can create logic to create tickets. It will create one ticket per episode so that multiple tickets are not created for one single episode - and this helps us reduce noise. 

We can automate routine tasks. We're able to create alerts, reports, scheduled searches, et cetera. It's helping us to save time.

What needs improvement?

When we check the service analyzer, and we have custom inputs, there are issues. Sometimes our inputs are not taken or recognized. Alerts are not being automatically generated. Also, if someone comes and creates a maintenance window, we can't properly identify who created it. We have to create our own queries before we can identify anything. 

For how long have I used the solution?

I've been using the solution for three years. 

What do I think about the stability of the solution?

The solution is very stable. 

What do I think about the scalability of the solution?

The solution is scalable. Depending on your infrastructure, it can be a bit tricky. 

How are customer service and support?

I haven't had to escalate any issues to technical support. 

Which solution did I use previously and why did I switch?

We're using SolarWinds and Splunk in our current environment. 

How was the initial setup?

I helped with the initial deployment. We have multiple servers sending data to Splunk. The process is straightforward. For the setup, we had three people involved in the process. 

It's not a difficult solution to maintain. 

What's my experience with pricing, setup cost, and licensing?

The licensing is based on data ingestion. However, they do have multiple licensing options.

The pricing is reasonable. 

What other advice do I have?

Splunk is good. It gives good customized options. Any logic or Python script we need to add, we have the freedom to do so. Most solutions aren't as customizable. 

I'd recommend the solution to others. I'd rate it eight out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Dishank Saxena - PeerSpot reviewer
Site Reliability Engineering Manager & DevOps Lead Global at a tech vendor with 10,001+ employees
MSP
Top 10
Reduces time to resolve and alert noise but is missing a release comparison feature
Pros and Cons
  • "The root cause analysis is very helpful for us."
  • "Predictive analytics, in terms of preventing incidents before they occur, still needs time to mature."

What is our primary use case?

We use the solution for event management, observability, application management, application performance management, anomaly detection, problem detection, and creating different rules for the anomalies for different events. It's application performance monitoring. ITSI manages the entire service area and offers automated detection.

What is most valuable?

The root cause analysis is very helpful for us. 

There's one feature which is a prediction and detection feature that we have gone through. We are not thoroughly using it. However, for us, I would say that root cause analysis, problem detection, and anomaly detection are the most helpful features.

The end-to-end visibility that ITSI provides into our network environment is great. It is definitely helpful, mainly for the application team. We can take a deep dive into an incident. In our everyday work, we don't really use end-to-end visibility since it isn't required for normal, general use cases. That said, during an outage, end-to-end visibility helps us drill down to find the root cause and work out how to make the platform better for the future.

The product has helped to streamline our incident management with end-to-end visibility. It helps in streamlining the incidents that are coming in. For example, for the authentication service that we have, users for certain regions are not able to authenticate completely. That likely means there's an issue with that region. That is an incident. In that case, I would look at endpoint visibility from the infrastructure to the end of the service call, including all the scans, tracing, and everything. Looking at it helps provide a resolution.

Our alert noise has been reduced.

Our mean time to detect has been reduced as well. Previously, we used to take a lot of time getting to the root cause of what happened. We've been able to resolve this quicker, and our mean time to detect has been drastically reduced.

In addition, we've been able to reduce the time to resolve.

What needs improvement?

Predictive analytics, in terms of preventing incidents before they occur, still needs time to mature. I am not very convinced of the prediction feature's capabilities.

It does not have a release comparison or server comparison feature. For example, if you have an application and you deploy a new feature, a release comparison should automatically show, or generate a report on, the impact of that feature on the overall application. It should also show what you can do to optimize it.

For how long have I used the solution?

I've used the solution for around five years.

What do I think about the stability of the solution?

The stability has been good. 

What do I think about the scalability of the solution?

The solution is highly scalable and flexible. 

How are customer service and support?

I've contacted support multiple times. Their service is average. They are not very quick. 

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I've used a few different solutions, like Dynatrace and Datadog. I've used Elasticsearch and Moogsoft as well.

Dynatrace is an overall package. I'd choose it over ITSI. Splunk is never a package. It does not provide application performance monitoring. Dynatrace is a full-fledged APM tool that includes infrastructure, APM, synthetic monitoring, and user monitoring alongside AIOps, which are very strong. It's a mature platform.

How was the initial setup?

I was involved in the initial setup. It's a very straightforward process. Deploying the platform takes a couple of hours at a maximum. The configuration is more subjective in terms of how long it takes. For example, how many applications do you have? How many environments? We have three environments in the US, and with approvals, it took us around 20 days.

It's a SaaS solution and does not require maintenance. It's a one-click upgrade if you want to upgrade anything. 

What about the implementation team?

Once you buy a license, Splunk is involved and can help with the deployment. They have three or four free consulting sessions initially. They are very involved in the pilot phase. Post-pilot, you have regular support.

What's my experience with pricing, setup cost, and licensing?

The product is expensive. It's one of the most expensive options, although maybe not as expensive as Datadog.

What other advice do I have?

We might be partners with Splunk. 

It's readily available. You don't have to wait very long to witness the benefits of the solution. 

I'd rate the solution seven out of ten. 

If you are looking for an AI solution alongside APM, use a platform with everything in place. If you still want to go for a dedicated AIOps platform, make sure it integrates with your existing logging and APM tools. However, my position is that it's better to use one platform for the entire opportunity.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor. The reviewer's company has a business relationship with this vendor other than being a customer: Partner
Works at a comms service provider with 1-10 employees
Real User
No other tool provides you with the same level of observability and enterprise security or the search and reporting applications
Pros and Cons
  • "The most valuable feature of ITSI is the service KPIs. No other tool provides you with the same level of observability and enterprise security or the search and reporting applications."
  • "ITSI is an almost perfect tool, but there is room for improvement in a few features like the deep dive and multi-KPI alerts. We're using most of the features like service KPIs, correlation searches, and aggregation, but our team members hardly use multi-KPI and deep dive. We don't use the multi-KPI or deep dive because everything is available in the service KPI. I don't think this feature is necessary."

What is our primary use case?

In my recent projects, we have used ITSI to monitor the entire infrastructure using multiple features, such as service KPIs, aggregation policies, base searches, correlation searches, notable events, dashboards, glass tables, service analyzers, and drill-downs.

How has it helped my organization?

It helps in every respect, including performance, monitoring, or visualization of the important indicators. It improves the quality of service to the clients. It is crucial that the clients have no website failures because that means the loss of business. ITSI helps us track those issues. We've seen fewer environmental failures since we started using ITSI.

We saw immediate benefits from Splunk ITSI. For example, let's say you have a project for monitoring hybrid Linux servers running JBoss, SAP, and any server containing a client's critical data. It isn't easy to monitor each of these through the back end. 

Splunk ITSI shows you all the data on the screen and lets you visualize the data from various applications. We can see all the applications running on the server and issues with CPU or memory utilization. We have that data in Splunk and can immediately see the alerts triggered. If there are any failures in the environment, we can fix them in seconds. 

The solution has helped us streamline our incident management. We can monitor server KPIs, which trigger an alert if the server is impacted. We can track all the notable events and integrate ServiceNow with Splunk. ITSI is integrated with the ticketing tool, so when an alert triggers, it automatically creates a ticket on ServiceNow. 

ITSI has also reduced the alert volume. Before ITSI, we were unsure why an issue happened. We would see the alerts triggered in bulk and log them one by one for every server. ITSI gives you a feature that lets you drill down to find the precise issues on the server. 

It has a service KPI feature that allows you to monitor exceptions that may lead to server failure. For example, we might be in trouble if the value exceeds 10. We set values of five or eight in the threshold field with a high criticality, so it triggers an alert whenever the threshold is breached.
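As a rough illustration of how a static KPI threshold maps a value to a severity, here is a minimal Python sketch. It is not ITSI's implementation or configuration format; the `Threshold` class, the `severity_for` function, and the severity names are hypothetical, and the cut-off values of 5, 8, and 10 are simply borrowed from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    minimum: float  # KPI value at or above which this severity applies
    severity: str


# Hypothetical levels, loosely following the reviewer's "five or eight" and "10".
THRESHOLDS = [
    Threshold(10, "critical"),
    Threshold(8, "high"),
    Threshold(5, "medium"),
]


def severity_for(value, thresholds=THRESHOLDS):
    """Return the severity of the highest threshold that the value reaches."""
    for t in sorted(thresholds, key=lambda t: t.minimum, reverse=True):
        if value >= t.minimum:
            return t.severity
    return "normal"
```

An alerting rule would then fire whenever `severity_for(count)` returns anything other than `"normal"`, or only for the severities the team cares about.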

ITSI reduced our alert noise because it was very hard to monitor every aspect when we used search and reporting. After running the query, we needed more insights, and ITSI gave us a clearer picture of the incident. That helps you reduce issues.

Many use cases can be automated through ITSI because we previously built our reports manually. After introducing ITSI, we sent all the data via the forwarders to Splunk. Once we have the data, we create and schedule all those queries and reports so that the management can see them without any IT involvement. It previously took us two or three hours daily to create all those reports, so automating reports saves almost 60 hours each month. We're automating 10 to 15 reports daily.
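The "almost 60 hours each month" figure can be sanity-checked with simple arithmetic. The sketch below assumes a few combinations of hours per day and reporting days per month; the review does not say which combination applies, and the function name is hypothetical.

```python
def monthly_hours_saved(hours_per_day, days_per_month):
    """Hours saved per month if daily manual reporting is fully automated."""
    return hours_per_day * days_per_month


# Assumed combinations; the review does not say which one applies.
for hours, days in [(2, 30), (3, 20), (2.5, 22)]:
    print(hours, "h/day x", days, "days =", monthly_hours_saved(hours, days), "h/month")
```

Two hours a day over a 30-day month, or three hours a day over 20 working days, both come to 60 hours, which is consistent with the reviewer's estimate.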

What is most valuable?

The most valuable feature of ITSI is the service KPIs. No other tool provides you with the same level of observability and enterprise security or the search and reporting applications. 

ITSI has everything. We can create searches, email alerts, and dashboards. It's the only application that offers the KPI concept where we can monitor different KPI parameters. We can configure the KPIs to trigger alerts when they breach a set threshold.

You can use the core concepts to optimize performance. You can also create a lot of correlations and onboard the data from every project application. You can play with the data to create those KPI services and crash modes. It's possible to establish service health using the KPIs through the service analyzer. On a single screen, you have a lot of tiles showing you the service KPIs and high-level insights.

When I started working on ITSI, there was some lag in releasing predictive analysis. Since then, there have been several updates, and we see that it works. We can predict any fluctuation in the data that might lead to failure. Using the historical data, we can set up the adaptive threshold. ITSI analyzes the historical data and sets an analysis for the future.
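To make the idea of thresholds derived from historical data concrete, here is a minimal Python sketch. It is not ITSI's adaptive thresholding algorithm. It assumes one simple approach: build a per-hour-of-day baseline and flag values more than `k` standard deviations above the mean. The function names and the `k=2.0` default are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def adaptive_thresholds(history, k=2.0):
    """Build an upper threshold per hour of day from (hour, value) samples.

    Assumed rule: threshold = mean + k * population standard deviation.
    """
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    return {hour: mean(vals) + k * pstdev(vals) for hour, vals in by_hour.items()}


def is_anomalous(hour, value, thresholds):
    """True if the value exceeds the learned threshold for that hour."""
    limit = thresholds.get(hour)
    return limit is not None and value > limit


# Hypothetical bandwidth samples: heavy at 02:00 (backup window), light at 14:00.
history = [(2, 90), (2, 100), (2, 110), (14, 10), (14, 12), (14, 14)]
thresholds = adaptive_thresholds(history)
```

With this baseline, a reading of 50 at 14:00 is flagged while the same reading at 02:00 is not, which is the behavior a single fixed threshold cannot give you.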

What needs improvement?

ITSI is an almost perfect tool, but there is room for improvement in a few features like the deep dive and multi-KPI alerts. We're using most of the features like service KPIs, correlation searches, and aggregation, but our team members hardly use multi-KPI and deep dive. We don't use the multi-KPI or deep dive because everything is available in the service KPI. I don't think this feature is necessary.

People mostly use ITSI to monitor alerts. The most important features are within the service KPI. When we configure the alerts in service KPI, we don't need to do any deep dives because the client is more interested in the raw data, so we run the queries on the raw data instead of going into the deep dive. 

For how long have I used the solution?

I have used Splunk ITSI for seven years.

How are customer service and support?

I rate Splunk support nine out of 10. It is very helpful. Whether you are connected to priority one, two, or three depends on the issue and its impact. You can also get help from the Splunk community. If you create a P2 ticket, they will reach out to you within an hour and resolve the problem in eight hours. They have different SLAs. 

They might take one or two days to resolve issues. We need to upload the diags from the server to the portal. After that, they will start working on it. They have solved all the issues in the last four or five months within two to three days maximum.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We had Dynatrace. It was integrated to onboard the data and create correlation searches to monitor those parameters.

How was the initial setup?

Setting up Splunk ITSI wasn't difficult. A few files needed to be placed over the indexers, and a few more needed to be placed over the license master. I didn't have any issues installing ITSI from scratch. It takes 15 to 20 minutes, depending on the project. It can be set up with one to three people. When service KPIs are installed, we need to validate them after the installation and upgrade ITSI. 

Which other solutions did I evaluate?

My friend works with OpenSearch. They are moving from Splunk to Cribl and OpenSearch. Splunk is pretty expensive, but it gives you a decent insight into the data. It is easy to learn, and ITSI has a great interface. You can run those queries and pass the data. I don't find any product attractive, and we need to put more thought into it. 

What other advice do I have?

I rate Splunk ITSI nine out of 10. I have worked on multiple projects in the last seven years, and I've never found any product like ITSI. We can monitor everything through that. It's an excellent product.

Setting up and mapping the searches with the aggregation policies can be a little complex. Once you've mastered that, you can do anything with the ITSI. You can monitor the whole project infrastructure. You don't need any other tool to monitor and visualize the data. ITSI is enough.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Senior Application Consultant at IBM
Real User
Top 20
Helps reduce alert volume, streamline our incident management, and adds reliability
Pros and Cons
  • "I particularly like the preview feature because it provides a prompt experience for impact analysis."
  • "Currently, Glass tables in ITSI only display metrics related to KPIs."

What is our primary use case?

I worked on multiple projects using Splunk ITSI for log monitoring, including monitoring mobile data usage for a telecom company, working with an insurance company and a retail application, and monitoring payment applications for a bank.

How has it helped my organization?

The integration with Splunk ITSI allowed us to monitor and track issues through alerts. This integration also reduces the Mean Time to Identify as the team is quickly made aware of problems through the ITSM tool, and respective incidents are raised to the application team. Depending on the issue's type, we can prioritize the incident, even giving it a P1 priority. With this, the team is made aware, and since we track our issues in ServiceNow, related incidents can be deployed, which also helps reduce the Mean Time to Resolve. The application team then knows what actions to take.

Event management utilizes event correlation and event aggregation instead of generating numerous alerts that cause panic within the team because multiple areas might be affected by a single issue. This can be achieved through Splunk's native capabilities, like notable event aggregation policies and episode reviews for ITSM, or by utilizing third-party tools such as Netcool. By employing event management tools like Netcool and then sending aggregated incidents to ServiceNow or using ServiceNow's item model for implementation, the number of alerts is reduced, and the troubleshooting team receives relevant information instead of being overloaded. This approach helps mitigate panic and provides the team with the resources to effectively address issues.

End-to-end visibility for application monitoring in our use case required us to consider all involved components. We addressed this by creating hierarchical dashboards. This approach provided everyone, from business stakeholders to operations, with visibility into application health through relevant metrics. Business stakeholders, for instance, focus on high-level metrics like application health, user experience, revenue, and performance rather than technical details like CPU usage. Therefore, we tailored the dashboard hierarchy for different roles: business executives, operation leads, project managers, and operations staff. The operations dashboard provided end-to-end visibility by configuring all components of the application's functioning. Leveraging the familiar network architecture, we utilized the same topology to present metrics, creating a comfortable and easily understandable dashboard layout. By plotting all entities with their availability and performance metrics, we achieved comprehensive end-to-end visibility.

We have set up the environment correctly for the predictive analytics, and our metrics are flowing continuously. We have the required data, so we can configure at least 30 minutes of lead time to predict the metrics and their thresholds for potential impact. I can set this up, but I only had the opportunity to work on the project until anomaly detection. Predictive analytics was not a requirement, so I did not implement it. However, I understand it entirely and have explored and learned about it in their documentation.

For our telecom project, we focused on promotions as a use case. We aimed to identify the most popular promotions among users, especially during festivals and special occasions. Analyzing business metrics revealed that Promo Code 350 was the most frequently used, generating significant revenue. We presented these findings to the business team, showcasing how different promotions performed during various events. This information empowered them to design more effective offers and strategies, ultimately improving the customer experience. The business team appreciated our contribution, recognizing the value of data-driven insights in shaping their marketing efforts.

Splunk ITSI is a tool that helps our clients streamline their incident management. By integrating Splunk ITSI with ServiceNow and Netcool, we can reduce the burden of keeping up with the number of incidents and ensure they're updated.

Splunk ITSI helps reduce alert noise. We receive multiple alerts for each event when using any APM tool, Splunk, or log monitoring tool. Aggregating these alerts has always been helpful, and we've utilized Splunk's notable event aggregation policy to reduce alerting for each KPI to a single episode review.
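The aggregation idea described above can be sketched in a few lines of Python. This is not how Splunk's notable event aggregation policies are implemented; it assumes one simple rule, namely that alerts for the same service arriving within a fixed window of each other belong to the same episode. The function name, the tuple layout, and the 300-second window are illustrative assumptions.

```python
def group_into_episodes(alerts, window_seconds=300):
    """Group (timestamp_seconds, service, message) alerts into episodes.

    Assumed rule: an alert joins the open episode for its service if it
    arrives within window_seconds of that episode's latest alert.
    """
    episodes = []
    open_by_service = {}
    for ts, service, message in sorted(alerts):
        episode = open_by_service.get(service)
        if episode is None or ts - episode["last_seen"] > window_seconds:
            episode = {"service": service, "first_seen": ts, "last_seen": ts, "alerts": []}
            episodes.append(episode)
            open_by_service[service] = episode
        episode["last_seen"] = ts
        episode["alerts"].append(message)
    return episodes


# Hypothetical alerts: two close together for "db", one for "web", one late for "db".
alerts = [
    (0, "db", "cpu high"),
    (60, "db", "connection errors"),
    (90, "web", "5xx rate"),
    (1000, "db", "cpu high"),
]
episodes = group_into_episodes(alerts)
```

Here four raw alerts collapse into three episodes, and only the episodes would be forwarded to the ticketing tool.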

Splunk ITSI reduces our mean time to detect.

Splunk ITSI is resilient and highly capable of tracking issues, provided the necessary logs are configured. With proper configurations, metric values are obtained, allowing us to monitor KPIs and quickly identify any adverse effects. In such cases, we can seamlessly delve into the logs to pinpoint the exact root cause of the issue.

What is most valuable?

I enjoy designing glass tables, hierarchy dashboards, and the preview for ITSI. I particularly like the preview feature because it provides a prompt experience for impact analysis. We can directly track which specific service is impacted and identify the underlying affected entity. Also, we can quickly view the affected metrics. Overall, the Glass table preview is the most valuable feature.

What needs improvement?

Currently, Glass tables in ITSI only display metrics related to KPIs. I proposed adding an option to show metrics related to entities. This would eliminate the need for custom SPL to achieve this functionality. Since KPIs already have an entity split feature, extending this capability to dashboards makes sense.

For how long have I used the solution?

I have been using Splunk ITSI for five years.

What do I think about the stability of the solution?

I would rate the stability of Splunk ITSI nine out of ten.

What do I think about the scalability of the solution?

Splunk ITSI is scalable. It offers clustering for search heads and indexers, and we have the deployment server.

Which solution did I use previously and why did I switch?

I previously used AppDynamics but switched to Splunk after learning about it and finding it more interesting.

How was the initial setup?

The deployment is straightforward.

What's my experience with pricing, setup cost, and licensing?

Splunk ITSI is expensive compared to other tools.

What other advice do I have?

I would rate Splunk ITSI eight out of ten.

Other APM tools have limited features, so I recommend Splunk because it allows you to go beyond pre-built functionalities. With Splunk, you can create custom rules for application monitoring and tailor data visualization for enhanced visibility. Splunk's flexibility extends to designing personalized dashboards and metrics, providing a limitless monitoring experience.

Splunk ITSI requires maintenance for upgrades either annually or biennially.

Splunk is a comprehensive solution that offers log monitoring and the ITSI observability suite, eliminating the need for multiple tools and the associated complexities in maintenance and cross-team coordination. Splunk's flexibility allows for adopting features like APM as needed and seamlessly adding further monitoring capabilities in the future, such as user experience monitoring, synthetic monitoring, or additional log monitoring. This adaptability, along with Splunk's ability to correlate data across different monitoring areas, makes it an ideal unified platform for comprehensive monitoring and observability.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Officer at State Street
Real User
Enables us to consolidate tools but it should improve its pricing
Pros and Cons
  • "Alerts and episodes are valuable to me."
  • "The solution should integrate more features in NEAP."

What is our primary use case?

We use the solution to monitor our own internal applications. We monitor analogs and various other DB Connect sources.

How has it helped my organization?

The tool has replaced some other products in our organization. It’s coming in very handy.

What is most valuable?

Alerts and episodes are valuable to me. These features put all notable events together and give us an opportunity to take action.

What needs improvement?

We can take actions based on NEAPs, like sending emails and creating ServiceNow tickets. It is pretty basic at the moment. The solution should integrate more features in NEAP.

For how long have I used the solution?

I have been using the solution for about a year.

What do I think about the stability of the solution?

The solution is pretty stable.

What do I think about the scalability of the solution?

The product is extremely scalable.

How are customer service and support?

I work with a lot of Splunk’s support people. I like them. They're all good.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We were using a software called Genius. We use Splunk IT Service Intelligence now, and it's more cost-effective overall.

What about the implementation team?

I have been maintaining the solution. The product is straightforward to maintain. We just need to follow the best practices, and it works. We have a lot of users, so it's difficult controlling what the users do in the environment.

What was our ROI?

The tool is a centralized place to collect all our data and compute against it. It has the potential for an ROI.

What's my experience with pricing, setup cost, and licensing?

Pricing has some room for improvement.

Which other solutions did I evaluate?

We evaluated other options, but Splunk seemed to be the best. It is the industry leader, so it was a no-brainer.

What other advice do I have?

We have an on-prem instance. Everything's pretty much on-prem. We work with cloud logs. Monitoring multiple cloud environments using the solution is pretty straightforward and easy. It is extremely important to us that the solution has end-to-end visibility into our cloud-native environment.

The solution has helped reduce our mean time to resolve. The product has helped improve our organization’s business resilience. Its ability to predict, identify, and solve problems in real-time is pretty good as long as the source is good and we use it well.

The tool’s ability to provide business resilience by empowering staff is alright. We have experienced cost efficiencies by switching to Splunk IT Service Intelligence. I know the licensing used to be based on ingestion, and now it's more like CPU-based. It's always evolving. I was not involved in the initial setup. The solution still has some room for improvement.

Overall, I rate the product a six or seven out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.