Managing Serviceguard 11th Edition, Version A.11.16, Second Printing June 2004

Understanding Serviceguard Software Components
Chapter 3
How the Cluster Manager Works
The cluster manager is used to initialize a cluster, to monitor the
health of the cluster, to recognize node failure if it should occur, and to
regulate the re-formation of the cluster when a node joins or leaves the
cluster. The cluster manager operates as a daemon process that runs on
each node. During cluster startup and re-formation activities, one node is
selected to act as the cluster coordinator. Although all nodes perform
some cluster management functions, the cluster coordinator is the
central point for inter-node communication.
Configuration of the Cluster
The system administrator sets up cluster configuration parameters and
performs the initial cluster startup; thereafter, in normal operation, the
cluster regulates itself without manual intervention. Configuration
parameters for the cluster include the cluster name and nodes,
networking parameters for the cluster heartbeat, cluster lock
information, and timing parameters (discussed in detail in the
“Planning” chapter). Cluster parameters are entered using Serviceguard
Manager or by editing the cluster ASCII configuration file (details
are given in Chapter 5). The parameters you enter are used to build a
binary configuration file, which is propagated to all nodes in the cluster.
This binary cluster configuration file must be the same on all the nodes
in the cluster.
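The requirement that every node hold an identical copy of the binary
configuration file can be illustrated with a short sketch. The Python
below is not part of Serviceguard and does not read any Serviceguard
file format; the function names, the node names, and the byte strings
standing in for each node's copy are all assumptions made for
illustration. It computes a SHA-256 digest of each copy and reports any
node whose copy differs from the most common one.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def config_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of one node's configuration bytes."""
    return hashlib.sha256(data).hexdigest()


def find_mismatched_nodes(copies: Dict[str, bytes]) -> List[str]:
    """Return the nodes whose copy differs from the most common copy.

    `copies` maps a (hypothetical) node name to the raw bytes of that
    node's configuration file. An empty list means all copies agree.
    """
    digests = {node: config_digest(data) for node, data in copies.items()}
    if not digests:
        return []
    # Treat the digest held by the largest number of nodes as the reference.
    reference, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(node for node, d in digests.items() if d != reference)


if __name__ == "__main__":
    # Hypothetical data: node3 holds a stale copy.
    copies = {"node1": b"cfg-v2", "node2": b"cfg-v2", "node3": b"cfg-v1"}
    print(find_mismatched_nodes(copies))
```

Comparing digests rather than whole files keeps the check cheap when the
copies must be gathered from several nodes; an empty result means the
copies agree.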
Heartbeat Messages
Central to the operation of the cluster manager is the sending and
receiving of heartbeat messages among the nodes in the cluster. Each
node in the cluster exchanges heartbeat messages with the cluster
coordinator over each monitored TCP/IP network or RS232 serial line
configured as a heartbeat device. (LAN monitoring is further discussed
later in the section “Monitoring LAN Interfaces and Detecting Failure.”)
If a cluster node does not receive heartbeat messages from all other
cluster nodes within the prescribed time, a cluster re-formation is
initiated. At the end of the re-formation, if a new set of nodes forms a
cluster, that information is passed to the package coordinator
(described further below, under “How the Package Manager Works”).
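The timeout rule described above can be sketched in a few lines of
Python. This is a simplified illustration under stated assumptions, not
the Serviceguard implementation: the class name, method names, node
names, and timeout value are hypothetical, and timestamps are passed in
explicitly so that the behavior is deterministic. The monitor records
when each peer was last heard from and reports that a re-formation is
needed once any peer has been silent longer than the prescribed time.

```python
from typing import Dict, Iterable, List


class HeartbeatMonitor:
    """Track the last heartbeat seen from each peer node (illustrative)."""

    def __init__(self, peers: Iterable[str], timeout: float, now: float = 0.0) -> None:
        self.timeout = timeout
        # Assume every peer was last heard from at start-up time `now`.
        self.last_seen: Dict[str, float] = {peer: now for peer in peers}

    def record_heartbeat(self, peer: str, now: float) -> None:
        """Note that a heartbeat from `peer` arrived at time `now`."""
        if peer in self.last_seen:
            self.last_seen[peer] = now

    def overdue_peers(self, now: float) -> List[str]:
        """Return peers silent for longer than the timeout, sorted by name."""
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout
        )

    def needs_reformation(self, now: float) -> bool:
        """True if at least one peer has missed the prescribed time."""
        return bool(self.overdue_peers(now))


if __name__ == "__main__":
    monitor = HeartbeatMonitor(["node2", "node3"], timeout=2.0, now=0.0)
    monitor.record_heartbeat("node2", now=3.0)
    # node3 has been silent since time 0.0, so it is overdue at time 3.5.
    print(monitor.overdue_peers(now=3.5))
```

In a running system the timestamps would come from a monotonic clock
rather than being supplied by hand; they are explicit here only to keep
the sketch easy to follow.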