Director at a computer software company with 1,001-5,000 employees
The product is very stable and scalable, but it should improve its price
Pros and Cons
- "It is a very stable product."
- "The solution could improve its price."
What is our primary use case?
We use the solution for infrastructure monitoring.
What is most valuable?
It is a very stable product.
What needs improvement?
The solution could improve its price.
For how long have I used the solution?
I have been using the solution for some time now.
Buyer's Guide
BMC TrueSight Operations Management
November 2024
Learn what your peers think about BMC TrueSight Operations Management. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
816,406 professionals have used our research since 2012.
What do I think about the stability of the solution?
I do not see any complaints about the tool’s stability.
What do I think about the scalability of the solution?
The tool is scalable. A small team of two to five people manages the product in the organization. The number of users in operations depends upon the size of the environment. When we operate, we need a team to monitor and take action. It is a completely managed service.
How are customer service and support?
I haven’t received any complaints from our customers regarding the technical support team.
How was the initial setup?
The setup is okay. It is not complex. The time taken for deployment depends on the project and the size of the customer.
What about the implementation team?
To deploy the solution, we need to install agents, install the software, ensure that the servers and the prerequisites are ready, and ensure that the setup related to integration to the endpoints and security is ready. The problem is not in the setup and installation of the product. Usually, the problem is related to the infrastructure and facilitating access to the product through security procedures. The administration of the solution is easy. A couple of people can manage it.
What's my experience with pricing, setup cost, and licensing?
The pricing depends on the requirements of the customers and the additional functionalities they need. The product is suitable for enterprise customers, not for SMEs. The tool is very flexible with its pricing. I rate the pricing a six out of ten.
What other advice do I have?
We require market standards to understand the improvements needed in the product. Overall, I rate the solution a seven out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Service Delivery Manager at a financial services firm with 1,001-5,000 employees
Knowledge Modules are what make implementation possible across our varied infrastructure, but RBAC controls need some work
Pros and Cons
- "From an administrative standpoint, what stands out in TrueSight is the ability to implement quickly. When they have a requirement to monitor something, we're able to turn that on quickly in their environment. We're able to set up new apps within a day."
- "We were somewhat limited in TrueSight due to some of the RBAC controls not quite being what we wanted as far as delegating out administrative privileges for implementation. But because we were able to turn requests around pretty well, that burden wasn't too heavy."
What is our primary use case?
We use it for business service and infrastructure monitoring. We use the full gamut of utilities from them and monitoring in the platform.
How has it helped my organization?
We don't use APM. We used to, but we line-item nixed it for various reasons a few years ago. We also don't use ITDA, their next-gen log monitoring tool. So we're truly just within the TSOM interface, as well as doing synthetics. That being said, the Knowledge Modules that BMC brings to the market are what make implementation possible across our varied infrastructure and applications. It's critical to have those Knowledge Modules. If we had to write things ourselves, or use a more generic monitoring environment and then build additional scripts on top of it to monitor the Kubernetes of the world, or the WebLogics of the world, or the Oracles and SQLs of the world - if we had to write scripts ourselves to bring back particular monitoring components and performance metrics and so on - that would be a heavy burden that would keep us from implementing. We don't often run into something that we haven't been able to monitor. It's just a matter of getting people to the table to tell us what they need.
When it comes to incident management, we get most of our data from TrueSight, with the exception of log data, because we don't use the ITDA interface. It would be an effective interface, but for logging we go to our SIEMs, since we're already pumping data to another system there. But TrueSight definitely gives us a view into the health of our business services, which is our primary goal for implementing monitoring.
We try very hard not to use event management. What I mean by that is that we do not have a typical NOC. We don't have ten people staring at screens and then escalating as necessary. Along those same lines, we don't spam our incident management environment with events from TrueSight. With a lot of customers I've met over the years, that's essentially the old school way of doing things. Instead, we create events that are truly actionable. If we don't have an actionable event, we don't create it. We use their baseline technology to ensure that we're only sending items that are either about to have a problem or have passed the threshold of having a problem. If you're talking about typical event management, where you create an event and it gets forwarded to some other system, there's a notification about it somewhere else - the whole ITSM cycle - we don't use it for that. We use it for creating smart events that create alerts directly to the teams responsible. As I described before, we have many distributed teams rather than a centralized NOC.
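The idea of raising an event only when a metric has crossed a hard threshold or has drifted away from its normal range can be sketched as follows. This is a minimal illustration and not TrueSight's baseline calculation: the function name `classify_sample`, the mean-plus-standard-deviation band, and the sample numbers are all assumptions made for the example.

```python
from statistics import mean, pstdev


def classify_sample(value, history, hard_threshold, band_width=3.0):
    """Return 'critical', 'abnormal', or None when no event should be raised.

    Hypothetical helper: 'history' holds recent samples of the same metric,
    and the baseline band is mean +/- band_width * standard deviation.
    """
    # A sample at or past the hard threshold is always actionable.
    if value >= hard_threshold:
        return "critical"

    baseline = mean(history)
    spread = pstdev(history)

    # A sample far outside the normal band may be about to become a problem.
    if spread > 0 and abs(value - baseline) > band_width * spread:
        return "abnormal"

    # Inside the normal band: stay silent instead of spamming the teams.
    return None


# Illustrative CPU-utilization samples (percent); not real data.
history = [40, 42, 41, 39, 43, 40, 41, 42]

print(classify_sample(42, history, hard_threshold=90))  # None
print(classify_sample(60, history, hard_threshold=90))  # abnormal
print(classify_sample(95, history, hard_threshold=90))  # critical
```

Only the second and third samples would produce an alert to the owning team; the first stays silent, which is the point of keeping events actionable.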
In terms of TrueSight helping to maintain the availability of our infrastructure, it's an interesting question because of our distributed systems. We have 8,000 hosts across about 40 different teams, and we have 600 different applications that we run. For those critical tier-one apps, teams are highly involved in their day-to-day operations and watching them very closely. Having those two things - the actionable alerts and the ability to see what the health of their system is at any given time, and to be able to check it against what normal looks like for those applications - gives the teams that use it in such a manner the information they need to be confident that their availability is as it needs to be, or better. As far as a hybrid environment goes, we have our own hosting environment because we are the cloud to our clients. So we're not necessarily in that situation. We don't use assets other than what's in our hosting environment.
If, in the past, one of our biggest problems was just plain old infrastructure incidents, basic availability incidents where a server or an application, an interface or an endpoint, may not have been available and no one noticed it until some downstream, business end-result brought it to our attention, we've essentially eliminated 90 percent or more of those. It has been at least three years since we've done any numbers. But at the time, we might have had ten to 15 Sev-One incidents a month. When we last measured it, we were down to one. That was within a couple of years of implementing an enterprise monitoring strategy.
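As a rough check on the figures above, the drop from ten to 15 Sev-One incidents a month down to one can be turned into a percentage. The helper name `reduction_pct` is hypothetical, and the only inputs are the numbers quoted in the review.

```python
def reduction_pct(before, after):
    """Percentage reduction from 'before' to 'after' incidents per month."""
    return 100.0 * (before - after) / before


low = reduction_pct(10, 1)   # starting from ten incidents a month
high = reduction_pct(15, 1)  # starting from 15 incidents a month

print(f"{low:.1f}% to {high:.1f}%")  # 90.0% to 93.3%
```

Both ends of the range are consistent with the "90 percent or more" estimate.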
As for root cause, when a team is engaged in monitoring to its full extent, we're usually able to get to root cause pretty darn quick. For example, if a team has many servers that could potentially be impacting an application or a business service, tracking something down across those multiple servers and multiple owners could be really tedious and time-consuming. It would be on the order of hours, or at least many minutes, depending on the scope of the issue. With well-implemented monitoring, for our Sev-One apps, they're able to get to the solution almost immediately. If we have monitoring set up properly, the actionable event will tell them precisely where a critical component has failed and they can resolve it. Where it's a different type of incident that we might not have a particular monitor for, they're able to use the performance data, availability data, and other related alerts to get to their issue much faster than they used to. Having a good monitoring implementation has made a world of difference to our operations teams. It's so much so, that if you think back five years, which is an eternity in the IT world, when there was a Sev-One incident back then, someone would walk around tapping people on the shoulder all over the floor. That was very time-consuming. But now they're able to collaborate quickly and say, "It looks like this is the problem right here," in a well-monitored environment, and get right to the root cause.
It's helped our mean time to remediation, and I'm being conservative here, by about 70 to 80 percent. That's an absolutely huge impact.
What is most valuable?
We have many operational teams, and for any given team their requirements are different. One team is more reliant on infrastructure monitoring, because they are processing-heavy. Another team might be more reliant on endpoint monitoring where we're ensuring that the third-party endpoints they rely on are up and available. Another team may have fairly immature applications, so that they would rely heavily on log monitoring to catch all the errors that may come up. From a consumer-function standpoint, there isn't any feature that stands out. They're all important because all of our consumers are important.
From an administrative standpoint, what stands out in TrueSight is the ability to implement quickly. When they have a requirement to monitor something, we're able to turn that on quickly in their environment. We're able to set up new apps within a day. Most of the work in monitoring is working with the teams, evangelizing, educating, and making sure that they're bringing their smart requests to the table so that they get visibility into their business service. If the implementation wasn't as easy as it is, it would hinder and probably decrease the adoption of monitoring. But because we can turn requests around pretty quickly and adjust things as teams need adjustment for their different release schedules, administratively, we're able to respond and keep pace with the business and the technology that they're implementing. That is a critical function for us.
For how long have I used the solution?
We've been using TrueSight Operations Management for almost six years.
What do I think about the stability of the solution?
Stability is one of the areas where we have identified challenges with TrueSight, but I'm not entitled to share the details at this point.
What do I think about the scalability of the solution?
We've been able to implement all the hosts that we care to implement on a couple of servers, with minimal maintenance. We don't use their high-availability solution. We don't really require it because the underlying infrastructure is relatively robust. We haven't had any problems with the scalability. Had we been a couple of times larger, there would've been more to implement server-wise.
The other thing about our implementation is that we send a lot more performance data to our implementation of TrueSight than the typical BMC environment might. We send everything server-side for analysis rather than keeping everything agent-side or emphasizing agent-side, as I've seen a lot of other clients do. I think the tide is turning. I think more people are doing what we're doing where we just push all the data for potential analysis. But we've been able to accomplish what we need without too much infrastructure.
How are customer service and technical support?
They had an advisory board. We, as a group, and even I specifically, had been asked by them what they needed to continue doing. One of those was continuing to build out Knowledge Modules in various technologies. Some of the ones BMC has made available, we've implemented, and some of the ones BMC has made available don't impact us and we haven't implemented. But I've been in discussions where they say, "What do we need to do," and Knowledge Modules is one of those areas where they've made a commitment to continue adding to them, and we appreciate that.
Which solution did I use previously and why did I switch?
When we first started, we did not have a monitoring program at anything resembling an enterprise-type level. We were at about 4,000 hosts and we were really not monitoring anything except for a few services. At that, it was bare-bones monitoring. We monitored, maybe, half of our environment at bare-bones.
We went on this journey six-plus years ago to have an enterprise monitoring solution that focuses on business services. One of the reasons we did that is the number of incidents we had that really should never have happened. Now that we're a number of years in, and we've implemented monitoring and brought teams around in the direction of business service rather than just an executable's use of a CPU, we have far fewer incidents.
As a general trend, we're much more capable of seeing what's out there and monitoring what our issues are and taking care of it before the business incident occurs. I don't have any particularly recent examples where our monitoring was able to resolve an incident after it happened. Of course, I don't get notified when people say, "Oh, look, I resolved this," because it's part of their daily operations to find an issue and resolve it. So it's not necessarily a newsflash anymore for us.
It doesn't happen quite as frequently as it used to, but they continue to build Knowledge Modules, every time there are new products on the market. They need to create Knowledge Modules for the implementation to be enhanced. That's one of the key features of the Operations Management. That's definitely something that helps us take advantage of everything BMC has. They're not sitting on their laurels. They're building things out.
How was the initial setup?
The complexity of our environment demanded the complexity of the implementation. More than half of the effort that we had in implementing monitoring was based on the way we did our program. We were basically starting at zero and bringing teams up to speed, evangelizing, educating, getting people onboard.
The implementation of TrueSight itself was just a software implementation. It had its bumps and bruises. None of us were versed in BMC software. There were some learning curves as would typically be expected for any application of this scope, magnitude, and impact.
We had an overall strategy of doing proofs of concept for various, widespread technologies. We took that success and did a wide-to-narrow type of advertisement. We told everybody what was going on and then we brought more specific people into the room and said, "These are good targets for you to implement." During and after that evangelizing and advertising, we started implementing tier-one applications as an onboarding effort. We did that in a deep-dive fashion where we would sit down and interview these teams and really come to understand what makes their business service tick. A lot of our evangelization effort was actually in changing the focus of operations teams to think from a business service perspective. That paid off in dividends later when people were more interested in monitoring the actual functions of their applications rather than just the infrastructure of their application. We've been able to change mindsets over the course of a number of years. The first two or three years we were doing implementations. That was when we did most of that work.
From there, we worked as much as possible to allow folks to implement their own where possible, rather than centralizing it, so that people could keep up with their own demands. We were somewhat limited in TrueSight due to some of the RBAC controls not quite being what we wanted as far as delegating out administrative privileges for implementation. But because we were able to turn requests around pretty well, that burden wasn't too heavy.
From tier-one apps, we kept going and kept educating, bringing people to the table. When new applications come to our company, we still reach out and educate new teams, bring them to the table and use the onboarding process we built and solidified over the course of the first couple of years.
During the first three years, we had two-and-a-half FTEs for implementation. That was for the full program, not just the TrueSight component. It included all those interviewees, all those educational components, all the training, etc. The full program. The actual pressing of the buttons was about half of that. Once you stand it up and start connecting things, it's a matter of administratively using the tool to execute.
What about the implementation team?
Typically, our company builds knowledge for implementing infrastructure/operations activities like this from the ground up. We did not use a third-party. BMC was instrumental in our success in that they made resources available to us, implementation-wise as well as development- and support-wise.
What was our ROI?
The solution hasn't helped reduce costs in a measurable fashion. That's a measure that we wouldn't undertake. There might be soft-cost benefits, such as:
- impact on the quality of life for operations folks
- our ability to show our clients that the services we provide to them are healthy
- giving the business teams, our relationship teams, the ability to speak intelligently, rather than just colloquially, about how our systems are running.
Life at our company as an operations person is nicer now because you have confidence that what you're doing makes a difference, that the business service that you're working on is healthy. The business is happier when we're able to talk to them intelligently and say, "I can actually show you that we've been up and successful."
It has helped in our ability to work on smarter things rather than silly incidents. If we eliminate incidents, then we're doing better work. We're able to do the good work of business rather than the sad work of recovery. That's not only quality of life but it's also the ability to get things done. So I know that, at some level, we're doing more with less because of our monitoring. But we don't have any hard numbers from a monitoring perspective.
What's my experience with pricing, setup cost, and licensing?
We're end-of-lifeing it now. Overall, the licensing costs of BMC are a challenge for us in that they're hard costs, whereas open-source monitoring has soft costs, where it's harder to line-item. It's harder to see the cost of implementation for other things. So that change of direction is taking place. It doesn't mean the cost isn't there; it's just soft dollars rather than hard dollars.
Which other solutions did I evaluate?
We looked at Microsoft SCCM. And, because we had a partnership with CA, we looked at their tools. There were a couple of other minor players we looked at which just didn't have the scope of what we needed to do, because of the breadth of technologies that we use. In the bakeoff, we came down to BMC and Microsoft.
It was a long time ago, so I don't know that it's fair to judge at this point, but from a monitoring perspective, the whole Microsoft suite really wasn't there. There was a lot of scripting. It was easy to identify that the administrative burden was going to be high in that implementation. Conversely, with the BMC stuff, out-of-the-box, administratively, you click and implement. That is one of our components of success, our ability to implement quickly.
On the soft side, BMC as a partner was much more interested in our success than the Microsoft folks were at the time. It's very hard to quantify unless you're there sitting in front of them at the table and working with them, consuming their knowledge. It really is a great partnership.
What other advice do I have?
BMC is at a critical point in redefining TSOM, how it's built. Anybody looking at BMC now needs to jump on the new version of TSOM and skip the current versions. I would wait until their new environment is ready. It will be containerized. Anyone implementing BMC can get used to the environment in a PoC but they shouldn't implement until their new stuff is out. I expect it to be that much different.
Make sure that you have stakeholder buy-in and that they are able to provide the resources with the correct knowledge to implement in a smart fashion. Everybody's definition of "smart" is going to be slightly different. We really hone in on the business service side to make sure that our business functions are healthy and that we're able to understand what's normal and what is out of normal. We work with the teams, even from the point that they're in development of projects, to make sure we're ahead of what's going on rather than reactive. But that means the buy-in of multiple teams: development, operations, support. That amount of effort requires stakeholders with decision-making capabilities to say that it's a priority for them.
We knew up front - and we've been able to validate our assumption - that monitoring doesn't do any good unless you are analyzing your business service for what are the critical components to observe. That's an educational effort and an implementation project. It's that upfront effort that will make your monitoring successful. Where we've been able to engage teams and teams have remained engaged, we've been the most successful in that. We took that to heart upfront, we made that part of our route to success, and we put the effort in. Our monitoring's been successful because of that. If we didn't do that, and we didn't constantly engage teams to make sure that they were aware of capabilities including the ability to give us feedback, and that we can implement quickly, we wouldn't be here. We wouldn't have advanced as far as we have. Most of that advancement was in the first two or three years, and we've just been riding that wave of success since then.
Keep in mind that most companies don't go from nothing to an enterprise monitoring solution; they go from one monitoring solution to another. But if there's anyone in the boat that we were in, where they are the size we were with no monitoring solution, they'll be in the pain that we were in. Implementing a good monitoring program, not just the tool, but a program around it, can make a world of difference to the operations teams, and subsequently to the business as well.
For those teams that are utilizing TrueSight, they don't rely on other monitoring environments. Some of those teams rely on those actionable alerts almost exclusively, and don't really use TrueSight's single pane of glass. We do have some teams that consume TrueSight and use it on a daily basis to ensure that they don't have any events, whether or not they've risen to the level of action. They'll also proactively look at some components, either business function components or infrastructure components, to ensure that they're working as designed and within the parameters of normal.
I don't think the functionality of Operations Management helps to support our business innovation. Our business runs forward and headlong into innovation, regardless of whether or not IT can keep up. We were never an impediment, other than cost. The way we run our overall IT environment is very open and flexible. Monitoring is a way for us to give business the confidence that what we're implementing is healthy, but it doesn't impact their interest in being able to implement what's new. They've always been able to do that and continue to be able to do that.
In terms of machine-learning, I mentioned above the baselining which, depending on how it's implemented, might be called machine-learning, but in TrueSight they just have a straight calculation-type of activity. We have other monitoring solutions that we're implementing as well, and that topic may be more applicable to them, but not in the TrueSight world. The TrueSight world is a straight application implementation. It's nothing exciting on that end.
I have to give our BMC partners a lot of credit for where they're planning to take TrueSight based on their roadmap, although it is speculative. I don't think the areas for improvement from us would be any different than anything they've already heard.
If someone were to implement the full suite of BMC products, you'd have to give it a nine out of ten. TSOM by itself, I have to give it a seven out of ten.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Vice President & Advisor - Compliance at a financial services firm with 5,001-10,000 employees
Excellent standalone solution with high availability
Pros and Cons
- "I like everything about this tool. I recommend this solution to anyone looking for a standalone solution with high availability that can be adapted to the customer's requirements."
- "There are some small limitations with this tool in terms of reporting dashboards that fit all of the requirements of the individual customer."
What is our primary use case?
As a certified TrueSight Operations Administrator, I monitor and implement BMC products. This solution is used to monitor various kinds of infrastructure (e.g., servers, databases, and hardware).
What is most valuable?
I like everything about this tool.
What needs improvement?
There are some small limitations with this tool in terms of reporting dashboards that fit all of the requirements of the individual customer.
For how long have I used the solution?
I have been using this solution for the last ten years.
What do I think about the stability of the solution?
This is a stable solution.
What do I think about the scalability of the solution?
This is a scalable solution.
How are customer service and support?
There are some troubleshooting steps that we are able to resolve ourselves. If we are unable to resolve an issue, we simply raise a case with BMC support, and they are always there to help if necessary.
How was the initial setup?
This is a straightforward solution all around and there are three ways that you are able to install it: a silent installer, a command-line installer, plus Linux OS and Windows installers.
Depending on the project requirements, a basic installation takes about five days for a standalone setup. If an HA setup is required, add roughly five more days to that time.
We have implemented this for a bank that has their entire infrastructure monitored by BMC and they have about five thousand users.
What about the implementation team?
We use our in-house team to implement the solution for our clients. We have a team of three people for maintenance of the tool.
What's my experience with pricing, setup cost, and licensing?
The annual licensing amount depends on the customer's requirements. Support is an additional fee, and there are options for three- and five-year support.
What other advice do I have?
I recommend this solution to anyone looking for a standalone solution with high availability that can be adapted to the customer's requirements.
I would rate this solution a nine out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
IT Operations Monitoring Specialist at a tech services company with 51-200 employees
Provides visibility to our infrastructure, how it is, the resources we are monitoring, and quick updates when it has any problems
Pros and Cons
- "The solution provides visibility to our infrastructure, how it is, the resources we are monitoring, and quick updates when it has any problems. We have integrated it with ServiceNow to open instances."
- "The dashboards are not good. We have a limited dashboard, and if we want better dashboards, we need to use other solutions like Grafana because the TrueSight dashboards are not good."
What is our primary use case?
We use the solution to monitor a vast infrastructure, including operating systems, maintenance, Windows, services, processes, applications, and databases. Therefore, we have integrations with monitoring products such as Microsoft, Cisco, and SaaS solutions for management.
How has it helped my organization?
The solution provides visibility to our infrastructure, how it is, the resources we are monitoring, and quick updates when it has any problems. We have integrated it with ServiceNow to open instances.
What is most valuable?
The important feature is device management.
What needs improvement?
The dashboards are not good. We have a limited dashboard, and if we want better dashboards, we need to use other solutions like Grafana because the TrueSight dashboards are not good.
TrueSight is unlikely to add new resources, because everything is moving to BMC Helix and TrueSight will be discontinued.
Some points didn't evolve. We are still using the node architecture, a node type of agent, and a decent cell, which was created many years ago.
For how long have I used the solution?
I have been using BMC TrueSight Operations Management since 2009. We are using V11.3.05 of the solution.
What do I think about the stability of the solution?
The solution is a very stable solution. After you get everything done, take some time to get the recording done.
I rate the solution’s stability a nine out of ten.
What do I think about the scalability of the solution?
Scalability depends on the number of servers dedicated to the solution and infrastructure management service; you install it from any integration server.
I rate the solution’s scalability a seven out of ten.
How was the initial setup?
The initial setup is not that easy. It can depend on the environment you are deploying. It could be tough to do. It took a few months to get everything ready.
It takes many servers. One server has a central console that will present the data for all the components.
I rate the initial setup a five or six out of ten, where one is difficult and ten is easy.
What about the implementation team?
Deployment was done in-house.
What's my experience with pricing, setup cost, and licensing?
The product is expensive, depending on the types of monitoring you have. You need to acquire more licenses.
I rate the product’s pricing an eight out of ten, where one is cheap, and ten is expensive.
Which other solutions did I evaluate?
We like the features that are presented to us, such as remediation and remote actions. It is possible to customize the agents and consult the graphs from the devices.
What other advice do I have?
I advise you to find someone with experience before entering the space.
Overall, I rate the solution a seven out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: May 11, 2024
Solution Architect at Tech Mahindra Limited
Incident tracking solution that allows service providers to log tickets for their products
Pros and Cons
- "Helix Innovation Studio is a very good feature. It allows us to develop our own enterprise applications and make them available for the customers."
- "The UI for the end users could be improved and more flexible than it is now."
What is our primary use case?
ITSM is used for incident tracking. We have deployed it for one of our customers that is in the telecom domain. Service providers are using this tool to log tickets related to the product they are using from the main organization. They log tickets in case they're facing some issue with the services they're using or providing to their customers.
Apart from that, we have implemented the end-to-end change management life cycle and the knowledge management module as a self-service tool. Basically in all ITSM modules, we are delivering solutions to the customer.
I'm currently working with product version 21.3. It's deployed on-prem.
What is most valuable?
Helix Innovation Studio is a very good feature. It allows us to develop our own enterprise applications and make them available for the customers.
What needs improvement?
The UI for end users could be improved and made more flexible than it is now. If it becomes more flexible, it will be the best interface among all ITSM centers.
For how long have I used the solution?
I have been using this solution for about 12 and a half years.
What do I think about the stability of the solution?
It's very stable.
What do I think about the scalability of the solution?
It's highly scalable.
How are customer service and support?
Technical support is very good.
How was the initial setup?
It's straightforward. For complexity, I would rate it 1 out of 5.
What other advice do I have?
I would rate this solution 10 out of 10.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer:
Enterprise Monitoring Automation Engineer at a healthcare company with 10,001+ employees
Allows our operations team to have one single application to reference when investigating issues in our environment
Pros and Cons
- "It allows our operations team to have one single application to reference when investigating issues in our environment."
- "Signature baselines, which have allowed us to fine tune many of our events and significantly reduce the number of events generated."
- "I would really like to see out-of-the-box support for monitoring uninterruptible power supplies."
What is our primary use case?
We utilize BMC TrueSight Operations Management to proactively monitor all of our physical and virtual server environments. Coupled with Entuity for TrueSight Operations Management, it gives us a holistic view of the health of our network and server environments in a single pane of glass.
How has it helped my organization?
It allows our operations team to have one single application to reference when investigating issues in our environment.
What is most valuable?
Signature baselines, which have allowed us to fine tune many of our events and significantly reduce the number of events generated.
What needs improvement?
I would really like to see out-of-the-box support for monitoring uninterruptible power supplies.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Performance Management Consultant with 51-200 employees
Introducing the BMC BPPM 9.5 Central Monitoring Admin Policy Console
BMC Patrol Agent Configuration Automation using the (TrueSight) BPPM Central Monitoring Administration Console (CMA)
Have you ever been frustrated to discover that your monitoring failed because one of your Patrol agents isn’t configured correctly? After you investigated, you were told that someone had sent you an email or called and left a voice mail telling you that some set of systems was ready for monitoring, and you never got the message. Everyone knows how adequate email and phone messages are, right?
Communication breakdowns involving your Patrol Agent infrastructure are nothing new. They’ve been around for many, many years. I know them very well. Everyone is very busy, and that only compounds the problem. There are so many things that can go wrong with keeping all your agent configurations in sync and up to date. Wouldn’t it be nice if this could all be automated somehow?
There is a new capability you need to be aware of: the BPPM 9.5 Central Monitoring Administration (CMA) Console. The CMA was introduced with BPPM 9.0, but it wasn’t flexible enough to be useful in very many situations. One of the key features in that release was the Policy Management interface. Although useful, its ability to truly manage your Patrol Agent infrastructure outside of Patrol Configuration Manager (PCM) was very limited. Well, that all changes with CMA 9.5.
With the release of the 9.5 BPPM CMA Console, and the greatly expanded Policy capabilities, you’ve never been so close to real-time Patrol Agent configuration automation. Say hello to your new little friend, the BPPM CMA Configuration Policy.
http://advantisms.wistia.com/medias/nvn9c6862k?emb...
BPPM Agent Configuration Policies – A Brief History of the BPPM 9.0 CMA Introduction
BPPM 9.0 introduced configuration policies for the first time with the CMA. A CMA Policy is supposed to replace the need for manually deploying configuration settings using Patrol Configuration Manager (PCM). Unfortunately, with the 9.0 policies you had little choice with respect to the policy “selector criteria”. The selector criterion is the mechanism that engages the CMA Policy.
You were able to specify the use of one item, the BPPM Tag, as the policy selector, which meant that you had to create a separate Policy and BPPM Tag for every possible scenario.
If you worked with the CMA in version 9.0, you know firsthand how limited that was. Chances are you looked at it, scratched your head, and moved on.
The 9.0 CMA release allowed you to deploy a simple Policy with three configuration options: Monitor, Threshold and Server Policy Configurations. CMA 9.0 made these three administrative options available for the first time but the overall policy capabilities were limited and ultimately became more work to manage than continuing to use PCM. They’ve been greatly expanded with version 9.5.
The BPPM CMA 9.5 Brings Patrol Agent Configuration Automation
With the release of the 9.5 BPPM CMA Console, the number of available Policy features grew from three in version 9.0 to a total of nine.
The features now include seven monitoring Configuration Policy options, one blackout option, and one staging Policy option: nine in all, compared to only three before. The Policy “Selector Criteria”, the item(s) that engage the Policy, have gone from one, the BPPM Tag, to eight. The new, more diverse selectors allow you to create simple or very complex activation conditions. With all of those new features, CMA 9.5 allows for dynamic automation of your Patrol Agent configurations like never before.
Here are the 7 New BPPM 9.5 CMA Policies and a description of how they can be used.
Monitoring Configuration – You can use this feature for filtering or turning the monitoring configurations off or on, based on your selectors. In the associated webinar, I construct one of these policies as an example, showing how they can be used to disable a specific monitor, for a specific OS, running in a specific environment.
Filter Configuration – This is a helpful addition to CMA 9.5. Filter Configuration allows you to specify what monitoring data is not meant to go into the BPPM database. With this new feature, you can specify the attributes and parameters that you want to stream into the BPPM console and see, without storage in the database.
Agent Threshold – This policy allows for setting traditional monitoring thresholds at the Patrol Agent level. It allows you to specify the alert threshold settings you used to set and deploy within PCM or from the Patrol Console, down to the agents. These can now be set, and take effect as soon as the agent checks into the BPPM infrastructure.
Server Thresholds – These thresholds are set at the BPPM server level. You can set Absolute, Signature and Intelligent thresholds within a policy based on the same selectors as the lower agent level.
Agent Configuration – This new policy has several capabilities. It allows for setting up agent-specific settings like the Default Monitoring account. You can also use this feature to specify Polling Intervals for the Patrol Knowledge Module (KM) Collectors. The KM Collector gathers the information at polling intervals, and depending on how you construct the selectors, you can now change these intervals within the CMA console, outside of PCM.
Server Configuration – This feature is ideal for the policy options in Groups within the BPPM Operations Console. For example, if you have servers associated with an application named “NewApp,” you can use this policy to group all the servers in one location within the Operations Console. Once you deploy a tag, “NewApp,” to all the involved systems, the Patrol Agents check into BPPM, see the policy, and automatically add the servers to the group you specify. If the group doesn’t exist, the policy will create it and place all the NewApp systems within that group for viewing, automatically.
Configuration Variables – This last option allows for the manual creation of any agent configuration variable you want or need that can be used by the agent. But the key feature of this one is in the ability to import your existing PCM configurations.
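To make the selector idea concrete, here is a minimal Python sketch of how a tag-driven grouping policy like the “NewApp” example could be modeled. This is not the CMA API. The `Agent` and `Policy` classes, the selector names (`tag`, `os`, `environment`), and the group name are all hypothetical and exist only to illustrate the matching logic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    """Hypothetical view of a Patrol Agent as it checks in."""
    hostname: str
    tags: set = field(default_factory=set)
    os: str = ""
    environment: str = ""


@dataclass
class Policy:
    """Hypothetical policy: every selector that is set must match."""
    name: str
    group: str
    tag: Optional[str] = None
    os: Optional[str] = None
    environment: Optional[str] = None

    def matches(self, agent: Agent) -> bool:
        if self.tag is not None and self.tag not in agent.tags:
            return False
        if self.os is not None and self.os != agent.os:
            return False
        if self.environment is not None and self.environment != agent.environment:
            return False
        return True


def assign_groups(agents, policies):
    """Return {group: [hostnames]}; a group is created on its first match."""
    groups = {}
    for agent in agents:
        for policy in policies:
            if policy.matches(agent):
                groups.setdefault(policy.group, []).append(agent.hostname)
    return groups
```

With one policy selecting on the tag `NewApp`, only the agents carrying that tag land in the group, and the group appears only once at least one agent matches.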
This new CMA brings real automation into the daily maintenance associated with your Patrol Agent infrastructure. Quit playing phone and email tag with your system and application administrators and see how to put this to work right now.
To see this new CMA Policy in action, be sure to check out this hands-on video introduction.
http://advantisms.wistia.com/medias/nvn9c6862k?emb...
To read about and see the CMA put a Patrol Agent Blackout into action, check this out.
Putting the BMC Blackout Policy to Work
To read about and see the CMA handle the Patrol Agent event streams and give you a brand new, centrally focused Event Management mechanism, check this out.
Simplified Patrol Agent Event Management
New Update!!
How to automate New Patrol Agent Package Deployments with CMA Policies. I'll show you step by step how to use a CMA Policy to automatically baseline your new Patrol Agents the moment they come up on the network, using your existing PCM configurations.
Automating The Configuration Deployment of Your New Patrol Agent Builds
To read more about (TrueSight) BPPM 9.5, be sure to check out the blog on the topic located here.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
BMC TrueSight & PATROL Consultant at World Opus Technologies
Before implementing consider: Scalability, High Availability, Implementation Repeatability and Standardization
BPPM Implementation Considerations
Part 1: Meet your business requirements
Three years after BMC ProactiveNet Performance Management (BPPM) was
released, most BPPM customers have reached the conclusion that BPPM
implementation is more than just software installation. But what makes a
BPPM implementation a successful one? What do you need to consider
before diving into installation details?
This "BPPM Implementation Consideration" blog series will try to address
several important considerations at requirement level and architecture
level. Implementing BPPM is a lot like building a house. Many
considerations at requirement level and architecture level are like the
foundation of the house. They need to be determined at the very
beginning.
The most important consideration in BPPM implementation is your business
requirements. The management of your organization, your entire
implementation team, and other stakeholders should have a clear
understanding of the list of business requirements that your BPPM
implementation is expected to meet. Then you will need to translate
this list of business requirements into a list of technical requirements
with a category assignment such as mandatory, strategic, cost-saver,
and nice-to-have.
Only now can you map each technical requirement into a list of detailed
BPPM features and prioritize the implementation of each feature. This
will become your project scope. Based on your project scope, you can
plan your project timeline and budget. If you outsource your BPPM
implementation to a consulting company, it is critical that you do your
homework on your business requirements and technical requirements first.
Then work closely with the architect (not just the project manager) of
the consulting company to determine the project scope.
However, many new BPPM customers I have talked to seem to do it
backwards. They came up with a budget first without knowing exactly
what BPPM features to implement and how long the implementation would
take. Then they picked a list of BPPM features to implement from the
product datasheet without knowing how each feature related to their
business bottom line.
As an example, here is the process taken at one of my past clients. One
of the top business requirements was to cut down the cost on Remedy
Gateway licenses from multiple monitoring software vendors. This was
translated into a technical requirement like this: Alerts from multiple
monitoring software must be integrated into one alert management tool to
communicate with Remedy for ticket creation. This requirement was
categorized as cost-saver. This technical requirement was mapped into
these BPPM features: Event to BPPM cell integration through API and SNMP
traps, msend API installation, SNMP trap adapter high-availability
implementation, custom BPPM cell MRL rules to process events from
multiple vendors, IBRSD high-availability implementation, and event to
ticket categorization in BPPM cell. The return was a 6-figure annual
license saving, year after year, for an investment of a 5-figure consulting
fee. This ROI went straight to the business bottom line.
Part 2: Keep the total cost of ownership in mind
When you build a house for yourself, you don't just consider the cost of
building, you also consider the cost of maintaining the house and
utility bills when you live there. Similarly when you implement BPPM,
in addition to implementation cost, you also need to keep the total cost
of ownership in mind.
After talking to several BPPM customers, I noticed that they all have
operations teams at least twice the size of the teams at my
clients just to keep BPPM operations going. What is worse is that
their operations teams also need the implementation skill set to
constantly patch up the implementation.
Before you even start implementation, consider the following aspects:
1) Scalability: When your environment grows with more servers, more
applications, or more integration, will your architecture still work?
How easy would it be to split horizontally (based on processing steps)
and vertically (based on incoming traffic)?
2) Upgrade: What can you do right now to make future upgrades easier?
You may want to consider having a naming convention, saving configuration
in a separate repository, and documenting everything consistently.
3) High Availability: High availability not only helps with business
continuity, it also keeps your team from constantly fighting fires. You
have several options in high availability: application level failover,
OS based failover, active/active load balancing, or duplication. Which
option would best fit your needs for each BPPM component and how much
would it cost? For example, a native application level failover might
be your best choice for BPPM cells if your business cannot afford to
miss a server down alert. But a simple duplication of PATROL 7 console
is probably sufficient for you compared to OS based failover, which
would cost nearly twice as much.
4) Implementation Repeatability: Do you keep an accurate implementation
document so that installation and configuration of each BPPM component
is repeatable? You need to implement everything on a test system first
and carefully document everything as you go. Production deployment
should be a straightforward 'follow the doc' process. It also gives you a
perfect opportunity to update the implementation document for anything
you have missed.
A common mistake I have seen is to start the implementation directly on a
production system. After several months of figuring things out, it
finally goes live with many junk files sitting under the implementation
directory. Then you realize that you actually need a test system
because you won't be able to make and test changes otherwise. Now you
don't know how to configure your test system to make it identical to
your production system since you have lost track of what made the
production system work and what did not.
5) Operations Standardization: Do you have a standard operations
procedure document? For example, if a new server is added into your
PeopleSoft Payroll application, do you have a document containing the
steps for the operations team to add that server to PATROL, BPPM
integration service, BPPM cell, BPPM server, BPPM GUI, and automated
Remedy ticketing?
Part 3: Achieve the highest ROI through integration
In addition to monitoring solutions from BMC, most enterprises nowadays
also use monitoring software from other vendors, open source, and even
home-grown scripts scheduled by cron job. Having a group of NOC
operators watching the GUIs of all monitoring software in a NASA-like
environment is simply not efficient. What is worse is when you have to
pay the license fee for each monitoring software to connect with the
back-end ticketing system.
BPPM/BEM cell provides extremely flexible and robust API and adapters to
integrate with just about any monitoring software out there. Whether
you are running monitoring tools from other commercial vendors such as
IBM and Microsoft, or you use open source tools like Nagios, it is
fairly straightforward to integrate alerts from these tools into
BPPM/BEM cell using either its OS API or SNMP adapter. If you use
home-grown scripts, all you need to do is to add an API call at the end.
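As an illustration of adding an API call at the end of a home-grown script, the Python sketch below wraps the `msend` command-line client that ships with the cell. It assumes `msend` is on the PATH and uses the commonly documented options (`-n` cell, `-a` event class, `-r` severity, `-m` message, `-b` extra slots). The cell name, event class, and slot values are placeholders, so check the options against the msend documentation for your version before relying on this.

```python
import subprocess


def build_msend_command(cell, event_class, severity, message, slots=None):
    """Build the msend argument list; nothing is executed here."""
    cmd = ["msend", "-n", cell, "-a", event_class, "-r", severity, "-m", message]
    if slots:
        # Extra slots are passed as a single "name=value;name=value" string.
        slot_string = ";".join(f"{name}={value}" for name, value in slots.items())
        cmd += ["-b", slot_string]
    return cmd


def send_event(cell, event_class, severity, message, slots=None):
    """Run msend and return its exit code (assumes msend is on the PATH)."""
    cmd = build_msend_command(cell, event_class, severity, message, slots)
    return subprocess.run(cmd, check=False).returncode
```

A script would call `send_event(...)` as its last step, for example after a failed backup check, and treat a non-zero return code as "the event did not reach the cell".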
If your back-end ticketing system is Remedy, the out-of-box 2-way
integration (IBRSD) between BPPM/BEM cell and Remedy is more efficient
than Remedy gateways for other monitoring tools. It is fairly
straightforward to configure two instances of IBRSD as active/active failover,
so your chance of waking up at 3am to fight fires is very slim. Since the
license of IBRSD is included in the price of BPPM/BEM, you instantly
cut down the cost when you stop paying for the Remedy gateway license
for other monitoring tools.
Other added benefits include reduced maintenance effort for other
monitoring software, less customization in Remedy, consistent ticket
information for all monitoring tools, and possible event correlation
between events from different monitoring tools. You will also make your
NOC team's job easier.
I understand that it is not always easy to convince people who work on
other monitoring software to integrate into BPPM/BEM due to
organizational silos and technical complexity. It is important to pick
the right candidate for the first BPPM/BEM integration. Once the ROI
is obvious, people will become more supportive of BPPM/BEM
integration. In addition, it is also important to set up a consistent
framework for all integrations since BMC does not provide a standard for
integration. Once you have set up a consistent framework for one-way
and two-way integration, your next integration will become much easier.
At one of my past clients, it took our BPPM/BEM team three months to
work with the other team to finish our first integration because the
integration project had the lowest priority with the other team. Once
everyone saw how well the integration worked and how much license fee it
saved, our second integration took only 4 weeks to finish.
Subsequently our third integration took only three days to finish.
Part 4: Monitor the monitors
The purpose of BPPM is to monitor your IT infrastructure. It is
important that the monitors themselves are up and running all the time.
A good BPPM implementation not only monitors your IT infrastructure, it
also monitors each and every BPPM component, including BPPM server, BPPM
agent, BPPM cell, PATROL agent, PATROL adapter service/process, SNMP
adapter service/process, IIWS service/process, IBRSD service/process,
etc. The self-monitoring metrics include component status and
connection status.
The events alerting that a BPPM component or a BPPM connection is down
are mostly sent to the connected BPPM cell automatically. Some of the
self-monitoring events require quick activation. You need to identify
those events as they have different event classes and message formats.
And you need to notify the right people about those events.
Some components can be monitored in multiple ways, and you just need
to pick the one that works best in your environment. For
example, when a PATROL agent loses its connection with PATROL Integration
Service, you can see an event sent directly from the PATROL agent, another
event from PATROL LOG KM if you configured it to monitor the IS connection
down log entry, and yet a third event from PATROL Integration Service if
you activated it in BPPM GUI.
You may need to reword the message of a self-monitoring event for better
readability as some messages are not clear at all. For example, by
default, PATROL agent connection down event contains the following
slots:
cell='PatrolAgent@server1@172.118.2.12:3181';
msg='Monitored Cell is no longer responding';
You may want to reword the message to look like this:
msg='PatrolAgent@server1@172.118.2.12:3181 is no longer responding';
because it is the PATROL agent that is no longer responding, not the cell.
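In a real cell this rewording would be done in the knowledge base (for example with an MRL rule). The Python sketch below only illustrates the logic, treating the event as a plain dictionary of slots. The function name and the dictionary representation are assumptions made for the example.

```python
GENERIC_MSG = "Monitored Cell is no longer responding"


def reword_agent_down(event):
    """Return a copy of the event with the agent name placed in msg.

    Events with a different message, or without a cell slot, are
    returned unchanged.
    """
    if event.get("msg") == GENERIC_MSG and event.get("cell"):
        return {**event, "msg": f"{event['cell']} is no longer responding"}
    return event
```

Applied to the slots shown above, the result carries `PatrolAgent@server1@172.118.2.12:3181 is no longer responding` in `msg`, while the `cell` slot is left as it was.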
For the notification method, the most reliable way is local email fired
from the cell that receives the self-monitoring events. Since your path
to the ticketing system may be down when your BPPM components are
experiencing problems, your back-end ticketing system should not be the
only way to send notifications for your self-monitoring alerts. It
should be used in addition to your local email notification.
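The "local email plus ticketing" idea can be sketched as a small fan-out in which every channel is tried and a failure in one never blocks the others. This is a generic Python illustration, not BPPM functionality. The function names, the addresses, and the assumption of an SMTP relay on `localhost` are all hypothetical.

```python
import smtplib
from email.message import EmailMessage


def make_local_email_sender(to_addr, from_addr, host="localhost"):
    """Return a channel that mails the event through a local SMTP relay."""
    def send(event):
        msg = EmailMessage()
        msg["Subject"] = f"BPPM self-monitoring: {event.get('msg', 'event')}"
        msg["From"] = from_addr
        msg["To"] = to_addr
        msg.set_content("\n".join(f"{k}={v}" for k, v in event.items()))
        with smtplib.SMTP(host) as smtp:
            smtp.send_message(msg)
    return send


def notify_all(event, channels):
    """Try every (name, send) channel; return (delivered, failed) names."""
    delivered, failed = [], []
    for name, send in channels:
        try:
            send(event)
            delivered.append(name)
        except Exception:
            # A broken path (e.g. ticketing is down) must not stop the rest.
            failed.append(name)
    return delivered, failed
```

Listing the local email channel first and the ticketing channel second means the alert still gets out by email when the path to the ticketing system is the thing that is broken.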
Part 5: Customize at the right place
Unless you are a very small business, you will need to customize BMC
out-of-box solutions to address the particular issues in your IT
environment. It is unrealistic to expect a one-size-fits-all solution
from BMC. Fortunately BPPM was developed with customization in mind. It
provides extensive tools to help you develop your own solutions that
seamlessly extend BMC out-of-box solutions.
BPPM suite has three major components: BMC ProactiveNet, BPPM Cell
(BEM), and PATROL. Both BPPM Cell and PATROL are more than 10 years old.
One of the primary reasons that they are still going strong today is
because they both allow you to add your own solutions to them
seamlessly.
Before you start developing your own custom solutions, take a step back
to think about what options you have and where you should place your
customization. What would be the impact on accessibility and resource
consumption on the underlying servers? What would be the impact on
deployment of your custom solutions? What would be the impact on future
maintenance and upgrade?
In PATROL, you can develop custom knowledge modules and you can also
plug in your own PSL code as a recovery action into a parameter. In
BPPM Cell, you can develop your own event classes, MRL code, dynamic
tables, and action scripts to extend the out-of-box knowledge base.
In general, if you have a choice between customizing PATROL and
customizing BPPM Cell to manage events, customizing BPPM Cell would
require less effort and result in less impact to the servers that are
being monitored. Here are a few reasons:
1) PATROL is running on servers that you don't own, have limited access to,
and may not be familiar with. For example, I was recently helping a
client debug a custom KM running on AS400. I had to get help from AS400
sysadmin just to add one line in its PSL code.
2) PATROL is often sharing the server with mission critical
applications. Poorly written PSL code could potentially impact the
mission critical applications negatively.
3) The same custom knowledge module may need to be running on more than
one server, thus requiring more time to deploy and upgrade.
4) BPPM Cell is running on your own infrastructure server. It is
infinitely scalable as a peer-to-peer architecture. If resources ever
become an issue, you can add more cells either on the same server or on
a different server (even with a different operating system). You can
split a cell horizontally by processing phases, or you can split a cell
vertically by event sources.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Hi Wila,
Great blog. Many thanks...!!