Performance and Availability monitoring.
Putting all the infrastructure, application, and other monitoring into a Service Context for Service Monitoring.
Faster, more efficient, and better views for operators; a more centralized approach to managing the infrastructure; and improved application visibility features.
TrueSight Operations Manager is a combination of different components (applications) such as Presentation Server, Impact Manager, App Visibility Manager, and IT Data Analytics, but it provides seamless integration and a holistic view with Application and Infrastructure Health views.
It provides common administration and a single sign-on platform with RBAC, which eases cross-launching between multiple tools, removes the need to configure users separately for each component, and improves the monitoring views.
There are no broad areas for improvement; it varies from environment to environment. As such, there are no outstanding bugs or defects that are not already documented.
No. TSOM 10.7 is quite stable provided it is installed following the vendor's recommendations, which are drawn from customer experience and the complexity of the environment.
No issues with scalability. The customers where I have implemented this have ranged from small to very large, and I have never faced any deployment challenges in any of these cases.
Excellent support from the vendor. Support technicians and developers are all available to help if there is an issue. Support cases are tracked, and resolution is pushed to happen faster.
No, I have always worked with BMC Solutions for infrastructure and application monitoring.
Yes, the setup after the design of the solution was pretty straightforward. The vendor has a lot of free Webinars where they will explain the best practices to design a solution and the best ways to implement it. These guidelines can be used to build custom guidelines for the customer.
Implemented with an in-house team; we have also been interacting with the vendor team, which has excellent expertise in TrueSight Operations Manager.
I have not dealt with the pricing or licensing, so I cannot comment.
Not applicable.
It is quite an efficient tool. There are continuous improvements being performed to satisfy the customer needs, but like any other tool or automation, it has some issues.
TrueSight offers a global solution with the possibility of end-to-end integration.
We use it to scan and monitor our server environment. This allows us to monitor devices as they are spun up, verify that there are no unknown devices, and then check uptimes as well as patching as another way of keeping devices in compliance.
Allows reliable access to server hardware information, uptime status, current patching, and much more. This helps us keep an updated inventory, as we feed this into our inventory system along with information from Atrium CMDB.
The ability to pull hosts together to show what processes are running, so it can be used for change management.
More modules for less popular applications and better documentation. Documentation can be great at times, but lacking in other areas.
I have used the BMC product in two separate instances. One was as a monitor of monitors for an ops bridge, giving a single view of all monitoring tools reporting into one source; this worked extremely well.
The other instance was as a managed service looking after multiple different customers across South Africa.
I believe that the ease of use and UI is great. The ability to fulfill the role as a manager of managers is fantastic. We integrated a number of other monitoring tools into BMC.
I think the ease of deployment needs to be looked at. It would be great if the deployment was faster and easier.
We experienced no issues with stability on both BMC and HP.
The only issue we experienced with scalability was that the maximum growth needs to be catered for in the initial build, so planning needs to be done carefully.
Technical support from BMC was good, although we sometimes had to wait a little longer for a response, which complicated things with the client.
The companies I worked for were BMC shops from start to finish and made use of Remedy, BCO, Control-M, etc. The companies wanted best of breed.
The setups were not complex, but a large amount of pre-deployment work and planning went into the solutions.
The solutions are not the cheapest but are robust and stable. The license model is rather complex, and BMC do often change the model.
Other products were evaluated, such as HP and IBM, as well as various open-source solutions.
My advice would be: do not cut the planning time or the testing time (UAT, SIT, and FIT).
Also make sure that you have the correct infrastructure in place and cater for the intended growth.
This article is a review of BMC ProactiveNet Performance Manager (BPPM) version 8.6 and its key sub-components.
The main key sub-components include:
> ProactiveNet Analytics
> ProactiveNet Event Management (formerly Mastercell)
> ProactiveNet Performance Manager (i.e. PATROL)
Component | Version
BPPM Event Manager | 8.6
BPPM Analytics | 8.6
PATROL Central | 7.8.10
PATROL Central Operator – Web Edition | 7.8.10
PATROL Agent | 3.9.00.1i
PATROL for UNIX Servers | 9.10.00.02
BPPM Event Management (previously known as Mastercell or BEM) is the component that replaces PATROL Enterprise Manager or PEM (previously known as CommandPost).
BPPM introduces a programming language called MRL. MRL is not as flexible as PERL or REX which can both be used in PEM, but MRL does include many in-built features such as policies that make the design of rules slightly easier.
PEM used to perform event management using up to 5 transformers or scripts written in PERL. PEM was effectively a tool box whereby all the intelligence is provided by the PERL scripts which enrich the events using a number of lookup files.
Which product is better, PEM or BPPM? BPPM is arguably the better event management platform. Although MRL is frustrating to work with, the in-built capabilities mean that you don’t have to develop everything from scratch. BPPM is generally a good event management platform.
PATROL Configuration Manager (PCM) is one of the best threshold management tools in the industry. The threshold management capabilities on BPPM (aka ProactiveNet) are poor in comparison. BMC state that they will include PCM functionality on the next release of BPPM.
The limitations of Threshold management in BPPM are numerous:
On the plus side, the different types of thresholds in BPPM are very powerful. BPPM has Absolute, Intelligent, Signature and Predictive thresholds. These thresholds are statistically based and will generate events when a statistical anomaly is detected. The product will automatically calculate trends using linear regression and variations based upon hourly, daily or weekly patterns. However, the statistics will not eliminate threshold management as BMC have sometimes claimed. Many thresholds are Boolean in nature – either good or bad – and are therefore not appropriate for statistical analysis. Statistical analysis is only appropriate for about 20% to 30% of thresholds, and the analysis consumes a lot of CPU cycles.
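As a rough illustration of what an hourly baseline threshold does (this is not BMC's algorithm, just a minimal sketch with made-up CPU samples), the following Python snippet flags a value as anomalous when it falls outside a band of three standard deviations around the mean observed for that hour of day:

```python
from statistics import mean, stdev

def build_baseline(history):
    """history: list of (hour_of_day, value) samples collected over several days."""
    by_hour = {}
    for hour, value in history:
        by_hour.setdefault(hour, []).append(value)
    # Baseline per hour: mean and standard deviation of the past samples.
    return {h: (mean(vals), stdev(vals)) for h, vals in by_hour.items() if len(vals) > 1}

def is_anomaly(baseline, hour, value, sigmas=3.0):
    """Flag a value that falls outside mean +/- sigmas * stdev for that hour."""
    if hour not in baseline:
        return False          # no history for this hour, so no judgement possible
    mu, sd = baseline[hour]
    return abs(value - mu) > sigmas * sd

# Example: CPU utilisation samples for hour 9 over five days, then a spike.
history = [(9, v) for v in (42, 45, 40, 44, 43)]
baseline = build_baseline(history)
print(is_anomaly(baseline, 9, 44))   # False - within the normal band
print(is_anomaly(baseline, 9, 95))   # True  - statistical anomaly
```

The sketch also shows why this approach suits continuously varying metrics such as CPU utilisation or response time, and adds nothing for Boolean checks such as process up or process down.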
BPPM is undeniably a complex product. Far too complex in my opinion. There are many other much simpler solutions such as HP SiteScope or CA Nimsoft which can be implemented much faster. In addition, the BMC Product Set has gradually got more and more complex over the years. The solution is really three products bundled together:
MasterCell is a great event management product. ProactiveNet has perhaps been oversold by BMC – and the value is overstated. The autonomous thresholds can only be applied to 20% to 30% of parameters anyway. PATROL was originally a great product – but has become bloated and complex after years of poor product management.
As an illustration of how complex the BPPM solution has become, consider the following table:
Component / Feature | Old Solution with PEM | New BPPM Solution (version 8.6)
Number of Servers | 3 (DEV, DR and PROD) | 11 (3 DEV, 3 TEST, 5 PROD)
Number of Connections to the Agents | 2 (PEM and RT Server) | 3 (BIIP3, BPPM Adaptor, RT Server)
Number of Adaptors | 1 (RT Server) | 3 (RT Server, BPPM Adaptor, BIIP3)
Dynamic Policy Files (for Rules) | 5 Rule Files | 12 Rule Files
Forms for Threshold Management | 1 (PCM) | 2 (TEST and PROD BPPM Servers)
The PATROL agent has always been very extensible. There is a rich API and many different ways to write an interface. PATROL Central has no API and therefore cannot be extended. Both BPPM and PEM are very extensible and can be extended through a variety of scripting languages such as PHP or PERL.
BMC has never provided a web form that allows staff in the Operations Bridge to black out servers or services for upcoming outages due to planned maintenance. The customer mentioned in this review had to write its own web GUI for blackout: an Apache and PHP solution that allows the shift operators to configure blackouts. It required 25 days of development to alter the blackout web form and migrate this functionality from PEM to BPPM.
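The blackout check itself is trivial; the effort goes into the operator-facing form around it. As a minimal sketch (in Python rather than the PHP actually used, and with a hypothetical record layout), suppressing an event during a blackout window amounts to a lookup like this:

```python
from datetime import datetime

# Hypothetical blackout records as an operator form might store them:
# one entry per host with the start and end of the planned maintenance window.
blackouts = [
    {"host": "dbhost01", "start": datetime(2012, 6, 28, 22, 0), "end": datetime(2012, 6, 29, 2, 0)},
]

def in_blackout(host, event_time, records=blackouts):
    """Return True if an event from `host` at `event_time` falls inside a blackout window."""
    return any(r["host"] == host and r["start"] <= event_time <= r["end"] for r in records)

# An event from dbhost01 at 23:30 on 28 June would be suppressed:
print(in_blackout("dbhost01", datetime(2012, 6, 28, 23, 30)))   # True
print(in_blackout("dbhost01", datetime(2012, 6, 29, 8, 0)))     # False
```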
For an environment of 500 Agents, BPPM requires from 0.5 to 1 FTE to keep the lights on - depending on the experience of the person. Typical daily tasks include the following:
The Agent commissioning process for configuring monitoring for a new server consists of the steps shown below:
Step Number | Step | Description
1 | Ping Host | Ping the host to verify that the hostname is correct.
2 | Install Agent | Install the Agent using the Solaris package.
3 | Update Event Rules | Edit the BPPM enrichment file abc_host.csv (a sketch of this step follows below).
4 | Apply to PROD Cell | Import abc_host.csv into the PROD cell.
5 | Apply to TEST Cell | Import abc_host.csv into the TEST cell.
6 | Update PING Test (primary) | Update the PING test configuration on the primary server to ensure the host is up.
7 | Update PING Test (secondary) | Update the PING test configuration on the secondary server to ensure the host is up.
8 | Configure UNIX KM | Use PCM to give the Agent the standard configuration for the UNIX KM.
9 | Update BIIP3 | Update the BIIP3 configuration so that the Agent can talk to the Event Management Cell.
10 | Agent Restart | Restart the Agent to ensure that the Agent configuration takes effect.
11 | Update PCO Web Console | Update the PCO Web Console so that the Agent appears in the PATROL console.
12 | Update Work Request | Update the work request to indicate the job is complete.
If additional monitoring is required for ORACLE, WEBLOGIC, or some other application, then additional configuration steps are required.
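Steps 3 to 5 of the commissioning table above (edit abc_host.csv, then import it into the PROD and TEST cells) lend themselves to a small helper script. The sketch below assumes a simple HostName, Location, HostType column layout, similar to the Host.csv enrichment file described later; the real abc_host.csv may differ.

```python
import csv
import os

def add_host(csv_path, hostname, location, host_type):
    """Append a new host entry to the enrichment CSV, skipping duplicates."""
    existing = set()
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as f:
            existing = {row[0] for row in csv.reader(f) if row}
    if hostname in existing:
        return False                      # host already commissioned
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerow([hostname, location, host_type])
    return True

# Example: commission a new production host, then import the updated file
# into the PROD and TEST cells (steps 4 and 5) using the normal import procedure.
add_host("abc_host.csv", "apphost042", "LondonDC", "PROD")
```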
There are two languages to learn with BPPM.
Administration of BPPM is overly complex. The product has evolved over the course of the last 20 years. As new components have been added via acquisition, the product has become increasingly complex and time-consuming to administer.
Any Solution Design for BPPM should consider the following key questions:
Question | Details
How does the design allow for rule tracing? | Using the trace log is not practical due to the volume of events. A good solution is to assign a unique ID to each rule and then configure each rule to add an entry to a new slot called "matching_rules" (see the sketch after this table).
How does the design specify rule execution order? | It is often difficult to design rules because of confusion about rule execution order. It is good practice to split all mrl files into mrl files for new rules and mrl files for refine rules, so you get new_mcxp.mrl and refine_mcxp.mrl. The files should then be grouped in the .load file by stage, so you have refine rules followed by new rules, and so on.
Does the DEV environment have the same number of cells as the TEST and OAT environments? | Don’t be tempted to have fewer cells in the DEV environment in order to limit the number of zones (servers) required. This is a mistake. Rule execution order is greatly affected by the propagation (or not) of slots between cells and the configuration of mcell.propagate.
Does the design specify the configuration of mcell.propagate? | The design should specify the configuration of all mcell config files, including mcell.propagate, mcell.dir, etc.
Is BIIP3 included in the design? | BIIP3 is essential in order to forward PATROL events to the cells for any events that are not event class 11 or 39. These events are explicitly generated by the PSL event_trigger() function. It is impossible for BPPM Analytics (ProactiveNet) to collect these events because they have no associated metric.
Threshold Management | If thresholds are being migrated from PCM to BPPM, how will the thresholds be migrated from one BPPM server to another? Has the export/import process been thoroughly tested (because it has serious issues)? I would advise migrating the thresholds to BPPM as a Phase II activity, or waiting for BPPM v9.
Export Thresholds from PCM | Does the design specify a tool for extracting all the thresholds from PCM into a spreadsheet? (I have a PERL tool to do this.)
Testing | Does the design provide for at least a month of end-to-end testing once the rules have been completed?
Monitoring the Monitoring | Does the design incorporate monitoring of the monitoring? Will an event be generated if the BIIP3 Adapter fails?
Event Storm | If the BIIP3 Adaptor loses connection to multiple agents every half an hour and then regains the connection 30 seconds later, this will create 200 new AGENT_DOWN events (mc_adapter_control). The de-dup rule will not work because the AGENT_UP event closes the AGENT_DOWN event. What rule is going to prevent this event storm?
Time-out Policies | Does the design specify timeout policies for all the main top-level event classes, such as the MC_CELL classes and EVENT? Does the cell start reasonably quickly with 2,000 events? What about 20,000 events?
DDE Enrichment | Does the design fully specify the enrichment files that will be used?
DDE Synchronization | Are the DDE config files pulled or pushed into the cells? How are the DDE cfg files synchronized between cells?
Blackout | Has a web site been included in the design for blackout by the Operations Bridge? BPPM does have a "Schedule downtime" facility, but this is entirely inappropriate for operators and does not account for BIIP3 events.
Blackout Dev | If a blackout GUI is a requirement, has a month of development been allocated (using something like Apache and PHP)?
BPPM Analytics | Does the design discuss the possibility of implementing BPPM Analytics as a second phase?
Reporting | Does the design include event reporting to drive continuous improvement? Key reports show total events grouped by the main enrichment fields (host, service, application, and so on).
Reporting Dev | If reporting is a requirement, does the design include time to implement the BMC reporting tool, or two weeks of development using PHP and mquery?
AIG | Does the design include Automatic Incident Generation (AIG)? Semi-automatic incident generation is an option, whereby an operator creates a ticket by right-clicking on an event. Is this option considered and discussed in the design?
Failover | Is failover considered? How is the configuration replicated? Replicated disk?
Training | Does the project plan include time for training the staff in the Operations Bridge? What about 2nd-level support?
Go-live | Is the go-live big bang or phased? Phased is preferred for risk mitigation but will require operators to run two consoles in parallel.
Audible Alarm | Is an audible alarm a requirement? If so, then this will require a few days of development to configure a web page that uses a sound file and "mquery -s COUNT".
BPPM Classes
BPPM has a number of event classes, as shown below, which all inherit from the CORE_EVENT class.
CORE_EVENT
Mastercell Rule Language (MRL)
Mastercell Rule Language (or MRL) is the language used to develop event management rules within BPPM. The administrator can develop 11 different types of rules, as shown in the table in the section "Rule Phases" below. The language is simple and relatively easy to learn in terms of both the syntax and the in-built functions. The most difficult concept to grasp is the execution order, as explained below. One of the most common problems with the rules is to misunderstand the execution order and find that the rules are not executing in the desired sequence. The other cause of frustration is the lack of common statements such as looping structures (do, while, for, until) which one takes for granted in other languages. It is possible to iterate over a list structure using the listwalk() function call, and the New rule phase also has limited capability to loop over events using the Updates clause. Fortunately, the need to loop is fairly rare, but at times the lack of standard statements can be a cause of frustration.
The biggest problem with MRL is the slow cycling speed when debugging code. Compared to PHP or PERL, it takes at least ten times as long to stop, compile and restart, so debugging cycles are ten times as long and productivity is similarly affected. True, it is not necessary to write pages and pages of code – but typically one will write about 8-15 pages of MRL for each project. Eight pages of PHP (tested and debugged) takes 1 to 2 days; eight pages of MRL (tested and debugged) takes 2-4 weeks. In addition, one should allow for an additional month of end-to-end testing before production go-live to test the rules with real events, to allow for all possible scenarios to play out and for all the bugs to emerge. These rules of thumb apply for companies of 5,000 to 10,000 employees. For larger organizations, you should allow more time.
Rules are executed in the order shown below.
Execution Order | Rule Phase | Description
1 | Refine | A Refine rule verifies the validity of incoming events and collects additional data for an event before it is sent through the remaining rule phases where further processing takes place.
2 | Filter | Filter rules limit the number of incoming events by discarding those events that need no additional processing or analysis. Filter rules compare incoming events to the event condition formulas (ECFs) contained in the rule to determine if an event is discarded or proceeds to further processing. An incoming event is processed through each Filter rule until a Filter rule discards the event, or all Filter rules are exhausted. An event must match all the Filter rules to be accepted.
3 | Regulate | Use Regulate rules to handle time-frequency accumulations of events or repetitive occurrences of events. An event is considered a repetition of another if the event has the same values for all the slots that are defined with the dup_detect=yes facet in the BAROC definition of its event class.
4 | New | Use New rules to execute an action when a new event is received, for example increasing the severity level for an event or updating an existing event with new event data. New rules determine if an event becomes permanent and is placed in the repository.
5 | Abstract | Abstract rules create high-level, or abstract, events based on low-level events. A new event starts at the New rules phase, skipping the Filter and Regulate rule phases. With Abstract rules, you can keep low-level events with cells in the lower level of the cell hierarchy, abstract the data from low-level events into high-level events, and propagate them to a higher-level cell. A high-level cell in the hierarchy can consolidate abstract events from several low-level cells and prevent a large number of abstracted technical events for which no consolidating rules apply.
6 | Correlate | Correlate rules build an effect-to-cause relationship between an event that occurs as a result of another event. Correlate rules execute whenever a cause or an effect event is received. The relationship between correlated events can be broken.
7 | Execute | The Execute rule performs a specified action when a slot value has changed in the repository. The specified action, which is either internal to the cell or running an external executable, is based on the characteristics of one or more events.
8 | Threshold | The Threshold rule counts the number of events that match the criteria you specify; if the number of these events exceeds the amount allowed within a time frame, the Threshold rule executes. An event is considered a repetition of another if the event has the same values for all the slots that are defined with the dup_detect=yes facet in the BAROC definition of its event class.
9 | Propagate | A cell uses Propagate rules to forward events or messages to one or more destination cells or gateways. For example, a Propagate rule can escalate an event from a lower-level cell to a higher-level cell in an environment.
10 | Timer | Use Timer rules to create timed triggers to call a rule. Timer rules are evaluated when a timer expires.
11 | Delete | The purpose of Delete rules is to perform actions before an event is discarded from the repository, such as a rule that suppresses data that has no meaning without an event instance. Delete rules are evaluated whenever an event is deleted from the repository or when events are deleted using the Delete flag in the mposter command.
PATROL Configuration Manager (PCM) is a configuration tool used for PATROL agents. The tool is mainly used for configuring Thresholds and is very effective at this task.
PCM is similar in concept to the Windows registry editor. The main form consists of two TreeView panes, as shown below. The left TreeView is used to configure hosts, which are arranged in groups such as ORACLE (shown below). The right-hand TreeView is used to manage the rules, which can also be arranged into groups. The RuleSets are linked to the hosts by dragging RuleSets from right to left: the RuleSets are dragged and dropped onto the leaves marked "LinkedRuleSets", and the user then invokes a command called "Apply RuleSets". The RuleSets are applied to each Agent in the same order as they appear in the hierarchy on the left. RuleSets linked to lower-level nodes take precedence and "override" higher-level group RuleSets.
The use of PCM typically follows a three step process. Administrators must perform the following:
The key weaknesses of this configuration process are the following:
The key benefit of PCM is that it can be used to manage a desired state for each Agent: if you apply the configuration once or a thousand times, the result is exactly the same. The hierarchy allows one to set global or default configuration using the higher nodes in the left TreeView and then to override the configuration with local (host-specific) configuration using the lower nodes. This hierarchy works extremely well.
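The desired-state behaviour is essentially an ordered merge, with host-specific RuleSets overriding group-level ones. The following Python sketch is not PCM itself, and the threshold variable names are invented, but it shows the precedence model and why applying the configuration once or a thousand times gives the same result:

```python
def effective_config(*rulesets):
    """Merge RuleSets in hierarchy order: later (more specific) entries override earlier ones."""
    merged = {}
    for ruleset in rulesets:           # e.g. global group first, host-specific last
        merged.update(ruleset)
    return merged

# Hypothetical threshold variables for an ORACLE group and one host-specific override.
global_oracle = {"/FSCapacity/alarm": 90, "/CPUUtil/alarm": 95}
dbhost01_only = {"/FSCapacity/alarm": 80}        # tighter threshold for one host

# Idempotent: applying the merge any number of times yields the same desired state.
print(effective_config(global_oracle, dbhost01_only))
# {'/FSCapacity/alarm': 80, '/CPUUtil/alarm': 95}
```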
Policies
The Policies feature within BPPM Event Management is generally a well-executed feature within the product and has sufficient flexibility to meet most customers' needs. The Dynamic Data Enrichment (DDE) policies allow the user to manage the rules externally using Comma-Separated Value (CSV) files.
The key thing to keep in mind is that the DDE policies match based on Best Fit and not First Match. So, for example, if you have an entry for the hostname pattern "fred*" (the star is a wildcard), an exact entry for frederick will match before fred*, even if fred* appears first in the CSV file. The rules are loaded into a hash memory structure within the product. The benefit of Best Fit is that the execution time for finding a match is predictable, irrespective of the number of lines in the CSV file (and there could be thousands). The disadvantage of Best Fit is that the matching can be out of sequence and counter-intuitive. Best practice in this case is to keep the CSV files simple, and each enrichment file should have only one purpose. For example, the customer used in this review originally started with 5 enrichment files in their old PATROL Enterprise Manager (PEM) environment. After implementing BPPM, the customer ended up with 11 DDE enrichment files. The total number of lines was lower, but the number of files was higher.
When migrating from PEM to BPPM, the enrichment files should be "Normalized" - by minimizing the number of lookup columns in order to reduce the probability of out-of-order rule matching.
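The difference between First Match and Best Fit is worth a concrete example. In the sketch below (plain Python with fnmatch wildcards, not the cell's actual hashing scheme), First Match returns whichever row appears first in the CSV, while Best Fit prefers the most specific pattern regardless of file order, which is why a host called frederick can hit its exact entry even when fred* is listed first:

```python
from fnmatch import fnmatch

# Rows as they might appear in a DDE enrichment CSV: (hostname pattern, location).
rows = [
    ("fred*",     "LondonDC"),
    ("frederick", "ParisDC"),
]

def first_match(host):
    """PEM-style matching: take the first row whose pattern matches."""
    return next((loc for pat, loc in rows if fnmatch(host, pat)), None)

def best_fit(host):
    """DDE-style matching: prefer the most specific (fewest wildcards, longest) pattern."""
    candidates = [(pat, loc) for pat, loc in rows if fnmatch(host, pat)]
    if not candidates:
        return None
    pat, loc = max(candidates, key=lambda c: (-c[0].count("*"), len(c[0])))
    return loc

print(first_match("frederick"))   # LondonDC - file order wins
print(best_fit("frederick"))      # ParisDC  - the exact entry wins, regardless of order
```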
Policy | Description
Closure | A Closure policy closes a specified event when a separate specified event is received.
Blackout Policy | A Blackout policy might be used during a maintenance window or holiday period.
Component Based Enrichment | Enriches the definition of an event associated with a component by assigning selected component slot definitions to the event slots.
Enrichment | Enriches the definition of an event associated with a component by assigning selected component slot definitions to the event slots.
Correlation | Correlation relates one or more cause events to an effect event, and can close the effect event. The cell maintains the association between these cause-and-effect events.
Escalation | Escalation raises or lowers the priority level of an event after a specified period of time. A specified number of event recurrences can also trigger escalation of an event. For example, if the abnormally high temperature of a storage device goes unchecked for 10 minutes or if a cell receives more than five high-temperature warning events in 25 minutes, an escalation event management policy might increase the priority level of the event to critical.
Notification | Notification sends a request to an external service to notify a user or group of users of the event. A notification event management policy might notify a system administrator by means of a pager about the imminent unavailability of a mission-critical piece of storage hardware.
Propagation | Propagation forwards events to other cells or to integrations with other products.
Recurrence | Recurrence combines duplicate events into one event that maintains a counter of the number of duplicates.
Remote | Remote action automatically calls a specified action rule provided the incoming event satisfies the remote execution policy’s event criteria.
Suppression | Suppression specifies which events the receiving cell should delete. Unlike a blackout event management policy, the suppression event management policy maintains no record of the deleted event.
Threshold | Threshold specifies a minimum number of duplicate events that must occur within a specific period of time before the cell accepts the event. For events allowed to pass through to the cell, the event severity can be escalated or de-escalated a relative number of levels or set to a specific level. If the event occurrence rate falls below a specified level, the cell can take action against the event, such as changing the event to closed or acknowledged status.
Timeout | Timeout changes an event status to closed after a specified period of time elapses.
Component Based Blackout | Specifies which events the receiving cell should classify as unimportant and therefore not process. The events are logged for reporting purposes. A Component Based Blackout event management policy might specify that the cell ignore events generated from a component or device based on the component selection criteria for this policy.
CSV File Name | Description | Lookup Columns | Data Columns
Host.csv | Assign Location and HostType (DEV, TEST or PROD) based on host name | HostName | Location, Physical Server, HostType
HostSuppress.csv | Filter out events based on hostname (e.g. when a new Agent is installed) | HostName | HostSuppress (YES, NO)
Application.csv | Assign an application name to each event | ApplicationClass, Parameter | Application
ObjectSuppress.csv | Filter out troublesome parameters based on event class | ApplicationClass, Parameter, EventClass | ObjectSuppress (YES, NO)
ApplicationSupress.csv | Filter out events based on application | Application | ApplicationSuppress (YES, NO)
HostBlackout.csv | Blackout hosts for planned outages based on timeframe | HostName, PhysicalServer, Location | TimeFrame
Service.csv | Assign a service name to all events | Host, Instance, HostType | Service, SupportGroup
ServiceSuppress.csv | Filter out events based on service | Service | ServiceSuppress (YES, NO)
ServiceBlackout.csv | Blackout services for planned outages during a particular time frame | Service | TimeFrame
ServiceDowngrade.csv | Downgrade severity for particular services | Service | SeverityCode (e.g. 12333)
TextMessage.csv | Change message text for certain parameters | ApplicationName, Parameter, EventClass | NewMessage
Note: Severitycode of 12333 downgrades MAJOR (4) and CRITICAL (5) to MINOR (3).
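My reading of the SeverityCode is that it is a positional mapping: digit N of the code gives the new severity for original severity N. A minimal sketch of that interpretation (illustrative only, not the cell's internal logic):

```python
# Severity numbering as used in the review: 3 = MINOR, 4 = MAJOR, 5 = CRITICAL.
def apply_severity_code(severity, code="12333"):
    """Digit N of the code (1-based) is the new severity for original severity N."""
    return int(code[severity - 1])

print(apply_severity_code(4))   # 3 - MAJOR downgraded to MINOR
print(apply_severity_code(5))   # 3 - CRITICAL downgraded to MINOR
print(apply_severity_code(3))   # 3 - MINOR unchanged
```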
Issues
PATROL Agent Restart
If the PATROL agent's configuration is changed, the agent usually requires a restart. Unfortunately, the PATROL Agent regenerates all active events (any parameter that exceeds a threshold) when it is restarted. This means that an agent must be blacked out whenever it is restarted.
PATROL Agent History Corruption
The Agent history file will always get corrupted if it exceeds 4 GB; there is a 4 GB file size limit on Solaris. The history file will frequently exceed this limit on busy servers running messaging services such as Tuxedo or MQ (simply because there is a lot to monitor). The history file may also get corrupted for other reasons. When the history gets corrupted, the Agent will generate an event for every attempt to store a parameter value. This problem can generate hundreds of events every few minutes from just one host. This number of events can easily overload a cell and a BIIP3 Adaptor (see BIIP3 Corruption below).
With 500 UNIX Agents, you should expect one agent to get corrupt history about every 2 weeks.
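A cheap mitigation is to watch the size of the history files and trim them before they reach the limit. The sketch below uses a hypothetical history directory; the actual path depends on the PATROL installation:

```python
import os

LIMIT = 4 * 1024 ** 3          # the 4 GB ceiling that triggers corruption on Solaris

def oversized_history_files(root="/opt/patrol/history"):   # hypothetical path
    """Yield (path, size) for any history file approaching the 4 GB limit."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > 0.9 * LIMIT:          # warn at 90% so the file can be trimmed in time
                yield path, size

for path, size in oversized_history_files():
    print(f"WARNING: {path} is {size / 1024 ** 3:.1f} GB - trim or reset before it corrupts")
```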
BIIP3 Corruption
If the BIIP3 cache file is corrupted, the BIIP3 can get stuck on one event and keep generating it. I have seen 4 million repeated events in a cell due to this problem.
BIIP3 cache file corruption may be caused by overload (see PATROL Agent History Corruption above).
I have seen this problem occur twice within 3 months.
The workaround is to clear the cache file and restart the BIIP3 Adaptor.
In certain situations, the BIIP3 Adaptor may lose connection with all the agents every half an hour, and the agents then regain the connection almost immediately. This causes a flapping AGENT_DOWN and AGENT_UP condition that is not de-duplicated, because the AGENT_UP clears the AGENT_DOWN event. This issue can generate thousands of events and thousands of new incidents (assuming Automatic Incident Generation is implemented).
The best workaround is to create a new rule for MC_ADAPTER_CONTROL (AGENT_DOWN) events and set them initially to severity INFO. If the agent is truly down, then the second agent-down event (which occurs 3 minutes later) should be configured in the rule to set the severity back to WARNING or ALARM.
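The intent of that rule is a simple debounce: the first AGENT_DOWN is parked at INFO, and only a repeat that arrives while the first is still unresolved is escalated. A rough Python model of the logic (the real fix is an MRL rule on MC_ADAPTER_CONTROL; this just shows the state machine):

```python
open_down = {}   # host -> time of an unresolved AGENT_DOWN, if any

def on_agent_down(host, now):
    """First AGENT_DOWN stays INFO; a second one while the first is still open escalates."""
    if host in open_down:
        return "ALARM"          # the agent did not come back - treat it as really down
    open_down[host] = now
    return "INFO"               # probably just the flapping adaptor; wait for confirmation

def on_agent_up(host):
    open_down.pop(host, None)   # the flap resolved itself; nothing to escalate

print(on_agent_down("apphost042", 0.0))     # INFO  - first sighting, parked
on_agent_up("apphost042")                   # flap: agent reconnected 30 seconds later
print(on_agent_down("apphost042", 1800.0))  # INFO  - a new, independent first sighting
print(on_agent_down("apphost042", 1980.0))  # ALARM - still down three minutes on
```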
The problem is also solved by restarting the BIIP3 Adapter. I therefore suggest that all customers schedule a restart of the BIIP3 adaptors once per day. No events are lost, because the BIIP3 adapter (and the PATROL Agent) caches all events.
I have seen this problem about once per month with a population of 500 agents.
The migration of both global and local thresholds from one BPPM Analytics instance to another must be performed by hand. There is an export/import mechanism for global thresholds, but as of July 2012 this mechanism is unreliable. There is no import/export mechanism for local (host-specific) thresholds.
BPPM Analytics does not support instance-specific thresholds. In other words, you cannot set a default threshold for FSCapacity across all file systems, then set an instance-specific threshold that applies only to the root file system, and then apply this instance-specific threshold to all hosts. The instance-specific threshold must be individually defined on every host. If there are 500 hosts, this becomes unfeasible. There is no script or API that can be used to automate this task.
With this release of BPPM, the PATROL Agents are connected to BPPM Analytics using the BPPM Adaptor. When you use the graphing facility to graph parameters in BPPM, some of the hosts do not appear, even though they are connected via the Adapter. At the time of this writing, this case is open with BMC and is unresolved.
PATROL Events that are triggered using the event_trigger() PSL function are not supported by BPPM Analytics (ProactiveNet). This forces all customers (who use PATROL agents) to implement both the BIIP3 Adapter (for event_trigger() events) and the BPPM Adapter for all standard PATROL metrics (that have an underlying parameter).
This means that the adapter layer with a BPPM implementation is quite complex. There are three Adapters attached to every agent on three separate ports. The Adapters are the RTServer, the BIIP3 Adapter, and the BPPM Adapter.
This complexity means that the implementation becomes fragile, complex to administer and fundamentally unreliable.
It is difficult to define catch-all rules using the standard BMC log monitoring KM. For example, it is possible to create a catch-all rule that triggers on the search string "ALARM". You then give this definition a custom origin, which might be something like "LOG.BANKING_app_log.alarm". You then create a custom event message that inserts the line from the log file into the text of the message; this can be done with the syntax "%1-". The problem occurs at the event management layer: all events that match this rule get rolled up into one event as duplicates, despite the fact that each event represents a different line from the log file and a different problem.
The workaround is to change the de-duplication rules at the event management layer. Be careful: if the rules are improperly defined, you can make the product vulnerable to an event storm, which may only manifest itself a month or two later.
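One way to see both the problem and the workaround is to look at what goes into the duplicate-detection key. The sketch below is a simplified model, not the cell's actual dup_detect machinery: keying only on the custom origin collapses every ALARM line from the log into one event, while adding the message text keeps genuinely different problems separate:

```python
from collections import Counter

events = [
    {"origin": "LOG.BANKING_app_log.alarm", "msg": "ALARM: payment queue depth 9000"},
    {"origin": "LOG.BANKING_app_log.alarm", "msg": "ALARM: settlement feed timed out"},
    {"origin": "LOG.BANKING_app_log.alarm", "msg": "ALARM: payment queue depth 9000"},
]

def dedup(events, key_slots):
    """Count events per duplicate-detection key built from the chosen slots."""
    return Counter(tuple(e[s] for s in key_slots) for e in events)

print(len(dedup(events, ["origin"])))          # 1 - different problems rolled into one event
print(len(dedup(events, ["origin", "msg"])))   # 2 - distinct log lines stay distinct
```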
Monitoring of the monitoring is insufficient.
Typical Project
The review was conducted after an upgrade project in which every component within an old PATROL environment was upgraded. The project was driven by the customer's internal audit organization, which reviewed the company's products and determined that PATROL Enterprise Manager (PEM) was no longer supported and that the whole environment should therefore be upgraded.
The project consisted of a number of separate projects which could have been undertaken individually. The customer chose to perform all three projects simultaneously, which increased the risk, complexity and length of the overall project.
Phase | Description
Phase 1 | Solution Design
Phase 2 | Upgrade of the PATROL Agents and Knowledge Modules
Phase 3 | Replacement of PEM with BPPM Event Manager
Phase 4 | Introduction of BPPM Analytics
The Solution Design phase was conducted in late 2011 and the implementation was started immediately after the New Year in 2012. Phase 3 of the solution was finally put into production on Thursday 28th June 2012.
Phase 4 of the project has not yet been completed. Phase 4 was removed from the project scope when the customer fell behind on delivery. Currently, there are no plans to complete this phase of the project.
The customer contracted several months of consultancy from BMC Software. BMC performed the initial solution Design and much of the initial configuration of the event management rules.
The resources assigned to the project, consisted of the following:
Resource | Time Allocation
BMC Consultant | ~3 months
Customer SME | 7 months full time
Independent Consultant | 4 months
Customer UNIX Engineers (2 engineers) | 4 months
Customer Infrastructure Architect | 1 month
Customer Project Manager | 2 months
Customer Delivery Manager | 2 months
Management Involvement (Project Sponsor + Resource Manager) | 1 month
Total | 24 months
Lessons Learned
The project overran initial estimates, in terms of both time and cost. The following issues were encountered:
Issue | Description
Solution Design | The event management rules had to be completely redesigned, which delayed the project by about a month. The customer's old rules used First Match, whereas BPPM only supports Best Fit. The complexity of the customer's rules was not properly analysed or understood during the design phase.
Documentation | The design of the event management rules was not properly documented. When it became evident that the design had to be changed, the lack of documentation slowed understanding and meant that some thinking had to be repeated and the design documented properly.
Thresholds | The customer spent over a month trying to migrate their thresholds from PATROL to BPPM. This task was complex due to the different format of the thresholds. The customer also experienced many issues with the migration tools, which did not work properly. Managing thresholds in BPPM is not as easy as managing thresholds in PATROL (using PATROL Configuration Manager). In the end, the customer abandoned the attempt to introduce BPPM Analytics. The autonomous alerts only covered 20% of the thresholds anyway, so the benefit of BPPM Analytics was not compelling.
Testing | The customer underestimated the time required for comprehensive testing. Testing should have been planned earlier, started earlier and resourced appropriately. At least a full month of end-to-end testing was required.
Technical Lead | Technical leadership was lacking through some parts of the project. Initially, the BMC Consultant was the technical lead; towards the end, an independent consultant was the technical lead. There were issues of continuity.
Project Phases | The project consisted of four project phases. Phase 2 and Phase 4 were optional and were not required in order for the customer to meet its audit deadline. In the end, Phase 4 was abandoned.
Summary and Conclusion
BMC ProactiveNet Performance Manager (BPPM) is really three products bundled into one suite, so it still makes sense to rate each component individually.
Product | Summary | Score (1-5)
BMC BPPM v8.6 Analytics (formerly ProactiveNet) | The product appears to have reasonably good quality control. The graphing is good. The threshold management features are poor, but BMC says this is being fixed in the next release. I am not convinced by the whole concept of using statistics. Statistical analysis uses a lot of CPU, which makes scalability an issue. Only about 30% of monitored metrics are appropriate for statistical analysis. BMC's claim that this product removes the need for threshold management is an exaggeration, and 70% of thresholds will still need to be managed using absolute-value (i.e. standard) thresholds. | 3
BMC BPPM v8.6 Event Mgmt (formerly Mastercell) | This product is one of the strongest event management products around. There are challenges with using the MRL rule language, but generally this product works well. I question BMC's bundling of this product with ProactiveNet and would like to see the product available as a stand-alone component. Developing and debugging rules is time-consuming and difficult. Only time will tell if this product continues to be a good event management platform. | 3
BMC PATROL 7.8.10 | Twenty years ago, PATROL was the best monitoring solution of its type. Since then the product has become bloated and overly complex. PCM was a great addition and makes the management of thresholds relatively easy and repeatable. The product has not changed much in about 8 years. Four years ago, BMC were going to retire the product; today PATROL is an integral part of BMC's BPPM strategy. The KMs and the breadth of monitoring save this product from a lower rating. | 3
Component/Capability | Previous Version (with PEM) | Latest Version (BPPM v8.6)
Event Management | 3 | 4
Threshold Management | 5 | 2
Analytics / Graphs | 3 | 5
Ease of Implementation | 3 | 2
Extensibility / Interfaces | 4 | 4
Operator Form for Blackout | 1 | 1
Average Score | 3.2 | 3.0
Components | PATROL and associated KMs, PATROL Central Operator, PATROL Enterprise Manager (PEM) | PATROL and associated KMs, PATROL Central Operator, BPPM Event Management, BPPM Analytics (ProactiveNet)
The score for BPPM has not improved with this revision. The product is more complex and more difficult to implement, and thresholds are more difficult to administer. The improvement in capability associated with anomaly detection is not convincing, was not proven for this customer, and is only relevant for 30% of parameters. BMC must work hard to improve administration and ease of implementation.
The combination of BPPM Analytics (ProactiveNet), BPPM Event Management (Mastercell) and PATROL has the potential to be a market beating product. However, the investment required is significant. Time will tell if BMC delivers on this vision.
It helps to minimize downtime of applications by enabling proactive monitoring.
Deployment requires lots of resources (servers). It has too many consoles. Pricing is very high.
No stability issues. We can make it stable by allocating enough resources.
No issues with scalability. We can increase resources vertically, according to growth in infrastructure.
Support quality is good.
Have not used any other product.
It's a bit complex.
Its use depends on the speed of installation.
It's good if you understand the systems and issues well.
The pricing could be better.
It is a stable solution.
It is a scalable solution. We have about five administrators using BMC TrueSight Operations Management.
The technical support is satisfactory.
The initial setup was straightforward. It took us about eight months to implement.
I rate this solution a seven out of ten, and I would recommend it to others. In addition, the solution is deployed on-premises.
I like the deep-dive detail and end-user metrics data. The synthetic monitor is the best one. The best point of the new one is that there's no need for configuration: you can inject the JavaScript and start to track major developments in the application. This is a good approach, and we received all the data using this.
I would like them to improve the deep-dive details, tracing, and data agents in this product. We have EUEM, an end-user experience monitoring appliance. It is quicker than the current one, and the reporting and filtering sides of the current one are very weak.
There are many details we can look at and explain from the information we receive in the current one, but we cannot get historical data like we can with EUEM. We do not have a powerful way to look for specific traffic from a specific application and a specific browser; we don't have that in the new one. The current BMC also needs to add version control.
I have been using BMC TrueSight Operations Management for six years.
BMC support is very good, and they always find solutions. They can give you a release or a patch if somebody needs assistance, which is a good thing about BMC support. You can get the full journey if you have the full solution.
BMC TrueSight Operations Management has a module covering all the applications and search, the website hardware utilization, and the traffic and storage. You can get all the details, and it is even better if you have different websites. You have the full journey with no need to go outside the tool itself.
The initial setup is straightforward and easy to implement quickly. You can receive all the data, and you can have a lot of dashboards covering all the details, like how much traffic, the OS, the federal ID, client ID, and location. With all of this in one place, you can create tables and dashboards by server ID and client ID.
I would advise potential users to get TrueSight Infrastructure Monitoring and the synthetic monitor when implementing BMC; then they will have a very powerful solution. The main point is that it is a manager of managers, and that is worth highlighting: BMC can integrate with all monitoring solutions, so they will have only one screen showing everything from multiple monitoring applications.
On a scale from one to ten, I would give BMC TrueSight Operations Management an eight.
Benefits
Key Capabilities
I believe BMC is going to lead in the Enterprise customer base. As mentioned above, BMC has the ability to simply integrate everything and view it in the Manager of Managers. This is critical in environments where there are multiple existing and legacy toolsets. A single view is key to empowering the resources that need only critical information displayed, so that a fast and effective response is guaranteed.