VCEM Profile Failover and Profile Moves
Before relying on event-initiated failover, you must be confident that the selected failure
detection events are accurate enough to result in acceptable service level ratings, given your
local installation, configurations, workloads, service level objectives, and operations policies.
How HP selected the set of installed failover events
The collection of recommended events includes those events HP believes will be consistently
useful to customers. You may customize this collection or add further collections for your
local installation.
HP used the following factors and rationale to select the recommended events:
• Exclude events that do not indicate a server hardware failure.
• Exclude events related to components that are usually configured redundantly. (For
example, the HP BladeSystem c7000 Enclosure accommodates redundant power supplies
and fans. Enclosure events for power and fan failures are reported to HP SIM, and
immediate action should be taken to hot-fix failed components. In most data centers, timely
replacement can be effected without impacting service delivery.)
• Exclude events related to components on which the system does not depend to deliver its
services. Because Failover does not support configurations with local storage, it is assumed
that system configurations do not depend on local storage and therefore that related
events, if they occur at all, will not impact service delivery capabilities. Errors relating to
the operation of the HP SIM agents on the managed system also fall into this category.
• Exclude events that are not likely to be received by HP SIM because the reporting system is
not healthy enough to transmit. These events include HBA failures and most CPU and NIC
failures.
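The selection rationale above can be read as four exclusion filters applied in turn to a list of candidate events. The following is a minimal sketch of that reasoning only. The `CandidateEvent` fields and the sample entries are hypothetical illustrations and do not correspond to any HP SIM or VCEM data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateEvent:
    """Hypothetical description of an event; not an HP SIM data structure."""
    name: str
    indicates_hardware_failure: bool
    component_usually_redundant: bool
    service_depends_on_component: bool
    likely_to_reach_hp_sim: bool


def is_recommended(event: CandidateEvent) -> bool:
    """Apply the four exclusion rules from the list above, in order."""
    if not event.indicates_hardware_failure:
        return False  # rule 1: does not indicate a server hardware failure
    if event.component_usually_redundant:
        return False  # rule 2: component is usually configured redundantly
    if not event.service_depends_on_component:
        return False  # rule 3: system does not depend on the component
    if not event.likely_to_reach_hp_sim:
        return False  # rule 4: reporting system too unhealthy to transmit
    return True


# Illustrative sample data only; the flag values are assumptions.
candidates = [
    CandidateEvent("Correctable memory error", True, False, True, True),
    CandidateEvent("Enclosure fan failure", True, True, True, True),
    CandidateEvent("Local disk failure", True, False, False, True),
    CandidateEvent("HBA failure", True, False, True, False),
]

recommended = [e.name for e in candidates if is_recommended(e)]
print(recommended)
```

With this sample data, only the memory event passes all four filters, which mirrors why the recommended set consists largely of correctable-error events.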
The set of failover events installed with VCEM 1.10
The following recommended events are pre-configured in the VCEM Failover Events collection.
All of these events can be characterized as “pre-failure” events, meaning that a failure has
not yet occurred but should be expected at any time.
Table n. List of recommended, “consistently useful” Failover events
Event number  Event type                                                  Event category
1001          (SNMP) CPU error threshold passed                           ProLiant System and Environmental Events
6001          (SNMP) Correctable Memory Error Occurred                    ProLiant System and Environmental Events
6015          (SNMP) Correctable Memory Error Occurred                    ProLiant System and Environmental Events
6029          (SNMP) Corr Mem Errors Require a Replacement Memory Module  ProLiant System and Environmental Events
6056          Corrected Memory Errors Replace Memory Module               ProLiant System and Environmental Events
See the section “Configuring HP SIM Action on Events to initiate Failover” below.
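For local tooling, for example a script that audits trap logs against the recommended set, the table can be captured as a simple lookup. This is a sketch under stated assumptions: the event numbers and names are copied from the table above, while the names `RECOMMENDED_FAILOVER_EVENTS`, `EVENT_CATEGORY`, and `is_failover_event` are hypothetical and are not part of HP SIM or VCEM. HP SIM itself matches events through its event collections, not through code like this.

```python
# Event numbers and types as listed in the table above.
RECOMMENDED_FAILOVER_EVENTS = {
    1001: "(SNMP) CPU error threshold passed",
    6001: "(SNMP) Correctable Memory Error Occurred",
    6015: "(SNMP) Correctable Memory Error Occurred",
    6029: "(SNMP) Corr Mem Errors Require a Replacement Memory Module",
    6056: "Corrected Memory Errors Replace Memory Module",
}

# All listed events share the same category in the table.
EVENT_CATEGORY = "ProLiant System and Environmental Events"


def is_failover_event(event_number: int) -> bool:
    """Return True if the event number is in the recommended set."""
    return event_number in RECOMMENDED_FAILOVER_EVENTS


# Hypothetical usage: classify event numbers seen in a local log.
for number in (6029, 9999):
    label = RECOMMENDED_FAILOVER_EVENTS.get(number, "not a recommended failover event")
    print(number, "->", label)
```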
Using HP SIM Action on Events with Failover
HP Systems Insight Manager can monitor SNMP traps and WBEM events.
These in turn can be used to notify administrators or to perform more complex operations, such as
triggering executables on the Central Management System or on remote systems. This