EMS Hardware Monitors: Overview

If you find that the default monitoring needs to be customized, you can return later and add or modify monitoring requests as needed.

EMS Hardware Monitors: How It Works

Hardware monitors are implemented as special processes (daemons) running on the computer system. The typical hardware monitoring process works as follows:

1. A hardware event monitor detects abnormal behavior in one of the hardware resources (devices) it is monitoring.

2. The hardware event monitor creates the appropriate event message, which includes suggested corrective action, and passes it to the Event Monitoring Service (EMS).

3. EMS sends the event message to the system administrator using the notification method specified in the monitoring request (for example: email, message to the console, entry in a system log).

4. The system administrator (or Hewlett-Packard service provider) receives the message, corrects the problem, and returns the hardware to its normal operating condition.

5. If the Peripheral Status Monitor (PSM) has been properly configured, events are also processed by the PSM. The PSM changes the device status to DOWN if the event is serious enough. The change in device status is passed to EMS, which in turn alerts MC/ServiceGuard. The DOWN status causes MC/ServiceGuard to fail over any package associated with the failed hardware resource.
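
Steps 1 through 3 can be pictured with a small Python model. This is only an illustrative sketch, not the EMS interface: the class names (EventMessage, EventService, HardwareEventMonitor), the resource name "disk0", and the two notification targets are all hypothetical, chosen to mirror the roles described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class EventMessage:
    """Hypothetical event message; includes the suggested corrective action."""
    resource: str           # hypothetical resource name, e.g. "disk0"
    description: str
    corrective_action: str


class EventService:
    """Hypothetical stand-in for EMS: routes messages per monitoring request."""

    def __init__(self) -> None:
        self._requests: Dict[str, List[Callable[[EventMessage], None]]] = {}

    def add_request(self, resource: str,
                    notify: Callable[[EventMessage], None]) -> None:
        # A monitoring request pairs a resource with a notification method.
        self._requests.setdefault(resource, []).append(notify)

    def publish(self, message: EventMessage) -> int:
        # Step 3: deliver the message using every requested method.
        targets = self._requests.get(message.resource, [])
        for notify in targets:
            notify(message)
        return len(targets)


class HardwareEventMonitor:
    """Hypothetical monitor for a single hardware resource."""

    def __init__(self, resource: str, service: EventService) -> None:
        self.resource = resource
        self.service = service

    def check(self, healthy: bool, detail: str,
              action: str) -> Optional[EventMessage]:
        # Step 1: detect abnormal behavior.
        if healthy:
            return None
        # Step 2: create the event message and pass it to the service.
        message = EventMessage(self.resource, detail, action)
        self.service.publish(message)
        return message


# Two notification methods, modeled as in-memory lists.
console: List[str] = []
syslog: List[str] = []

ems = EventService()
ems.add_request("disk0", lambda m: console.append(f"{m.resource}: {m.description}"))
ems.add_request("disk0", lambda m: syslog.append(f"{m.resource}: {m.corrective_action}"))

monitor = HardwareEventMonitor("disk0", ems)
monitor.check(healthy=True, detail="", action="")   # normal: no message
monitor.check(healthy=False, detail="read failure", action="replace the disk")
```

In this model the monitor keeps no record of the event once it has been published, which matches the description of events as temporary.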

The Difference Between Hardware Event Monitoring and Hardware Status Monitoring

Hardware event monitoring is the detection of events experienced by a hardware resource. It is the task of the EMS Hardware Monitors to detect hardware events. Events are temporary in the sense that the monitor detects them but does not remember them. Of course, the event itself may not be temporary: a failed disk will likely remain failed until it is replaced.

Hardware status monitoring is an extension of event monitoring that converts an event to a change in device status. This conversion, performed by the Peripheral Status Monitor, provides a mechanism for remembering the occurrence of an event by storing the resultant status. This persistence provides compatibility with applications such as MC/ServiceGuard, which require a change in device status to manage high availability packages.
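
The distinction can be sketched in Python as follows. This is a rough model under stated assumptions, not the PSM implementation: the class name PeripheralStatusSketch, the numeric severity scale, and the threshold value are hypothetical. The text says only that the status changes to DOWN "if the event is serious enough."

```python
from typing import Callable, Dict, List, Tuple

UP, DOWN = "UP", "DOWN"


class PeripheralStatusSketch:
    """Hypothetical model of converting events into stored device status."""

    def __init__(self, down_threshold: int,
                 on_change: Callable[[str, str], None]) -> None:
        self.down_threshold = down_threshold   # assumed numeric severity cutoff
        self.on_change = on_change             # called only when status changes
        self._status: Dict[str, str] = {}      # the persistent part

    def status(self, resource: str) -> str:
        # Resources with no recorded status are assumed to be UP.
        return self._status.get(resource, UP)

    def handle_event(self, resource: str, severity: int) -> None:
        # The event itself is not stored; only the resulting status is.
        if severity >= self.down_threshold and self.status(resource) != DOWN:
            self._status[resource] = DOWN
            self.on_change(resource, DOWN)


# Status changes are collected in a list that stands in for the consumer
# (for example, a high availability manager).
changes: List[Tuple[str, str]] = []
psm = PeripheralStatusSketch(down_threshold=3,
                             on_change=lambda r, s: changes.append((r, s)))

psm.handle_event("disk0", severity=1)   # not serious enough: status stays UP
psm.handle_event("disk0", severity=4)   # serious: status becomes DOWN
psm.handle_event("disk0", severity=4)   # already DOWN: no further change
```

After the events have passed, the status can still be queried, which is the persistence that status-based applications rely on.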