Managing Serviceguard Nineteenth Edition, Reprinted June 2011

since the applications are still running normally. But at this point, there is no redundant path if

another failover occurs, so the mass storage configuration is vulnerable.

Using Event Monitoring Service

Event Monitoring Service (EMS) allows you to configure monitors of specific devices and system

resources. You can direct alerts to an administrative workstation, where operators are notified when

a problem requires further action. For example, you could configure a disk monitor to report

when a mirror is lost from a mirrored volume group being used in the cluster.

See the manual Using High Availability Monitors at the address given in the preface to this manual.

Using EMS (Event Monitoring Service) Hardware Monitors

A set of hardware monitors is available for monitoring and reporting on memory, CPU, and many

other system values. Some of these monitors are supplied with specific hardware products.

Hardware Monitors and Persistence Requests

When hardware monitors are disabled using the monconfig tool, associated hardware monitor

persistent requests are removed from the persistence files. When hardware monitoring is re-enabled,

the monitor requests that were initialized using the monconfig tool are re-created.

However, hardware monitor requests created using Serviceguard Manager, or established when

Serviceguard is started, are not re-created. These requests are related to the psmmon hardware

monitor.

To re-create these persistence monitor requests, halt Serviceguard on the node, and then restart it.
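
The halt-and-restart step can be scripted. The following is a minimal sketch, not a supported procedure: it assumes the standard Serviceguard commands cmhaltnode and cmrunnode are on the PATH, and the node name node1 and the function names are hypothetical. The optional -f flag, which halts the node even when packages are running on it, is off by default. The sketch defaults to a dry run that only prints the commands.

```python
import subprocess


def restart_node_commands(node_name, force=False):
    """Build the commands that halt and then restart Serviceguard on one node.

    node_name is a placeholder; force adds cmhaltnode's -f option, which
    halts the node even if packages are currently running on it.
    """
    halt = ["cmhaltnode"]
    if force:
        halt.append("-f")
    halt.append(node_name)
    return [halt, ["cmrunnode", node_name]]


def restart_node(node_name, force=False, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = restart_node_commands(node_name, force=force)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the sequence if the halt fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    restart_node("node1")  # hypothetical node name; dry run only
```

Run the script as-is to review the commands before executing them with dry_run=False on the affected node.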

Using HP ISEE (HP Instant Support Enterprise Edition)

In addition to messages reporting actual device failure, the logs may accumulate messages of

lesser severity which, over time, can indicate that a failure may happen soon. One product that

provides a degree of automation in monitoring is called HP ISEE, which gathers information from

the status queues of a monitored system to see what errors are accumulating. This tool will report

failures and will also predict failures based on statistics for devices that are experiencing specific

non-fatal errors over time. In a Serviceguard cluster, HP ISEE should be run on all nodes.

HP ISEE also reports error conditions directly to an HP Response Center, alerting support personnel

to the potential problem. HP ISEE is available through various support contracts. For more

information, contact your HP representative.
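
To illustrate the idea of predicting failures from accumulating non-fatal errors, here is a minimal sketch. It is not HP ISEE's actual algorithm, which this manual does not describe. The event format, the severity label "non-fatal", the time window, and the threshold are all assumptions chosen for illustration.

```python
from collections import Counter


def devices_at_risk(events, now, window_seconds, threshold):
    """Return devices whose recent non-fatal error count reaches a threshold.

    events is an iterable of (timestamp, device, severity) tuples; this
    layout is hypothetical. Only events marked "non-fatal" that fall inside
    the last window_seconds before now are counted.
    """
    recent = Counter(
        device
        for timestamp, device, severity in events
        if severity == "non-fatal" and now - window_seconds <= timestamp <= now
    )
    return sorted(device for device, count in recent.items() if count >= threshold)


if __name__ == "__main__":
    sample = [
        (100, "disk0", "non-fatal"),
        (200, "disk0", "non-fatal"),
        (300, "disk0", "non-fatal"),
        (10, "disk1", "non-fatal"),   # outside the window, not counted
        (250, "disk1", "non-fatal"),
        (290, "disk2", "fatal"),      # actual failures are reported separately
    ]
    print(devices_at_risk(sample, now=300, window_seconds=250, threshold=3))
```

A device that logs several non-fatal errors within a short period is flagged before it fails outright, which is the behavior the text above attributes to this kind of monitoring.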

Replacing Disks

The procedure for replacing a faulty disk mechanism depends on the type of disk configuration

you are using. Separate descriptions are provided for replacing an array mechanism and a disk

in a high availability enclosure.

For more information, see the section Replacing a Bad Disk in the Logical Volume Management

volume of the HP-UX System Administrator’s Guide, at http://www.hp.com/go/hpux-core-docs.

Replacing a Faulty Array Mechanism

With any HA disk array configured in RAID 1 or RAID 5, refer to the array’s documentation for

instructions on how to replace a faulty mechanism. After the replacement, the device itself

automatically rebuilds the missing data on the new disk. No LVM or VxVM activity is needed. This

process is known as hot swapping the disk.

310 Troubleshooting Your Cluster