Specifications

Communication Status Information
• Failover Recovery. If a resource fails on a system, the LCM notifies LifeKeeper to recover
the resource on a backup system.
In addition to the LifeKeeper services provided by the LCM, applications on different systems can
communicate reliably through a set of shell commands: snd_msg, rcv_msg, and can_talk. These
commands are described in the LCMI_mailboxes(1M) manual pages. The LCM runs as a real-time
process on the system, ensuring that critical communications such as the system heartbeat are
transmitted.
Communication Status Information
The communication status information section of the status display lists the servers known to
LifeKeeper and their current state, followed by information about each communication path.
The following sample is from the communication status section of a short status display:
MACHINE  NETWORK  ADDRESSES/DEVICE               STATE  PRIO
tristan  TCP      100.10.100.100/100.10.100.200  ALIVE  1
tristan  TTY      /dev/ttyS0                     ALIVE  --
For more information, see the communication status information sections of the Detailed Status
Display and Short Status Display topics.
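The column layout of the sample above lends itself to simple scripted checks. The following is a minimal sketch, not part of LifeKeeper, that reads only the communication status section on standard input and prints each path whose STATE column is anything other than ALIVE. The function name down_paths is hypothetical, and the sketch assumes the five whitespace-separated columns shown in the sample.

```shell
#!/bin/sh
# down_paths: hypothetical helper, not a LifeKeeper command.
# Reads the communication status section on stdin, laid out as
#   MACHINE NETWORK ADDRESSES/DEVICE STATE PRIO
# and prints machine, network, address/device and state for every
# path whose STATE is not ALIVE.
down_paths() {
    awk '
        $1 == "MACHINE" { next }        # skip the header row
        NF >= 5 && $4 != "ALIVE" {      # data row with a non-ALIVE state
            printf "%s %s %s %s\n", $1, $2, $3, $4
        }
    '
}
```

Feeding the two sample rows through down_paths prints nothing, because both paths are ALIVE. Input containing other sections of the status display would need to be trimmed first, since this sketch does not recognize section boundaries.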
LifeKeeper Alarming and Recovery
LifeKeeper error detection and notification are based on the event alarming mechanism, sendevent.
The key concept of the sendevent mechanism is that independent applications can register to
receive alarms for critical components. Neither the component that initiates an alarm nor the
receiving application(s) needs to be modified to know of the other's existence. Application-specific
errors can trigger LifeKeeper recovery mechanisms via the sendevent facility.
This section discusses topics related to alarming, including alarm classes, alarm processing, and
alarm directory layout, and then provides a processing scenario that demonstrates the alarming
concepts.
Alarm Classes
The /opt/LifeKeeper/events directory lists a set of alarm classes. These classes correspond to
particular sub-components of the system that produce events (for example, filesys). For each alarm
class, subdirectories contain the set of potential alarms (for example, badmount and diskfull). You
can register an application to receive these alarms by placing shell scripts or programs in the
appropriate directories.
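Registration as described above amounts to copying an executable into the directory for a given class and alarm. The following is a minimal sketch under that assumption. The function name register_handler and the handler script notify_ops.sh are hypothetical, and the sketch deliberately refuses to create class or alarm directories that do not already exist.

```shell
#!/bin/sh
# register_handler EVENTS_ROOT CLASS EVENT SCRIPT
# Hypothetical helper: copies SCRIPT into EVENTS_ROOT/CLASS/EVENT/
# and marks it executable so it can be run when the alarm occurs.
register_handler() {
    events_root=$1 class=$2 event=$3 script=$4
    target_dir="$events_root/$class/$event"
    # Only register for alarms that already exist in the directory tree.
    if [ ! -d "$target_dir" ]; then
        echo "no such alarm: $class/$event" >&2
        return 1
    fi
    cp "$script" "$target_dir/" || return 1
    chmod 755 "$target_dir/$(basename "$script")"
}
```

For example, register_handler /opt/LifeKeeper/events filesys diskfull ./notify_ops.sh would place the hypothetical notify_ops.sh script in the diskfull directory of the filesys class.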
LifeKeeper uses a basic alarming notification facility. With this alarming functionality, all applications
registered for an event have their handling programs executed asynchronously by sendevent when
the appropriate alarm occurs. With LifeKeeper present, the sendevent process first determines if the
LifeKeeper resource objects can handle the class and event. If LifeKeeper finds a class/event match,
it executes the appropriate recovery scenario.
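The asynchronous execution of registered handlers can be illustrated with a short sketch. This is not the sendevent implementation. It only shows the general pattern of starting every executable found in an alarm's directory in the background. The function name dispatch_alarm is hypothetical, passing the class and event names as arguments to each handler is an assumption, and the LifeKeeper class/event match described above is omitted.

```shell
#!/bin/sh
# dispatch_alarm EVENTS_ROOT CLASS EVENT
# Hypothetical illustration of asynchronous handler execution.
dispatch_alarm() {
    events_root=$1 class=$2 event=$3
    for handler in "$events_root/$class/$event"/*; do
        # Skip an unexpanded glob (empty directory) and non-executables.
        [ -f "$handler" ] && [ -x "$handler" ] || continue
        # Assumption: handlers receive the class and event as arguments.
        "$handler" "$class" "$event" &   # start in background, do not block
    done
    wait   # demo only: return after all background handlers finish
}
```

The handlers run concurrently with one another. The final wait is there only so that the demonstration returns once they have all completed.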