Managing Serviceguard Fifteenth Edition, reprinted May 2008

Understanding Serviceguard Software Components
How the Cluster Manager Works
Chapter 3
a single lock disk. Thus, the only recommended usage of the dual cluster
lock is when the single cluster lock cannot be isolated at the time of a
failure from exactly one half of the cluster nodes.
If one of the dual lock disks fails, Serviceguard will detect this when it
carries out periodic checking, and it will write a message to the syslog
file. After the loss of one of the lock disks, the failure of a cluster node
could cause the cluster to go down if the remaining node(s) cannot access
the surviving cluster lock disk.
Use of the Quorum Server as the Cluster Lock
A quorum server can be used in clusters of any size. The quorum server
process runs on a machine outside of the cluster for which it is providing
quorum services. The quorum server listens for connection requests from
the Serviceguard nodes on a known port. The server maintains a special
area in memory for each cluster, and when a node obtains the cluster
lock, this area is marked so that other nodes will recognize the lock as
“taken.”
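The per-cluster lock area described above can be pictured with a small
Python sketch. This is an illustrative model only, not the quorum server's
actual implementation: the class name QuorumLockSketch, its methods, and
the cluster and node names are all hypothetical, and networking, timeouts,
and lock release are left out.

```python
from typing import Dict, Optional


class QuorumLockSketch:
    """Toy in-memory model of one lock area per cluster (hypothetical)."""

    def __init__(self) -> None:
        # Maps a cluster name to the node that currently holds its lock.
        self._lock_area: Dict[str, str] = {}

    def request_lock(self, cluster: str, node: str) -> bool:
        """Grant the lock to the first requester and refuse later ones."""
        holder = self._lock_area.get(cluster)
        if holder is None:
            # Mark the area so that other nodes see the lock as "taken".
            self._lock_area[cluster] = node
            return True
        # A repeated request from the holder succeeds; anyone else fails.
        return holder == node

    def holder(self, cluster: str) -> Optional[str]:
        """Return the node holding the lock, or None if it is free."""
        return self._lock_area.get(cluster)


server = QuorumLockSketch()
first = server.request_lock("clusterA", "node2")   # lock is free
second = server.request_lock("clusterA", "node1")  # already taken
other = server.request_lock("clusterB", "node7")   # separate lock area
```

Because each cluster has its own entry, one sketch instance can model a
quorum server that serves several clusters at once without their locks
interfering.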
If communications are lost between two equal-sized groups of nodes, the
group that obtains the lock from the Quorum Server will take over the
cluster and the other nodes will perform a system reset. Without a
cluster lock, a failure of either group of nodes will cause the other group,
and therefore the cluster, to halt. Note also that if the quorum server is
not available when its arbitration services are needed, the cluster will
halt.
The operation of the quorum server is shown in Figure 3-3. When there
is a loss of communication between node 1 and node 2, the quorum server
chooses one node (in this example, node 2) to continue running in the
cluster. The other node halts.
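The arbitration rule for an even split can also be written out as a short
Python sketch. It covers only the case described in this section, two
equal-sized groups that have lost contact with each other, and the function
name arbitrate_even_split, the group labels, and the outcome strings are
hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional


def arbitrate_even_split(
    groups: Dict[str, List[str]], lock_winner: Optional[str]
) -> Dict[str, str]:
    """Return the outcome for each of two equal-sized groups of nodes.

    lock_winner names the group that obtained the lock, or is None when
    the quorum server could not be reached (an assumed encoding).
    """
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")
    if len({len(members) for members in groups.values()}) != 1:
        raise ValueError("this sketch handles only equal-sized groups")
    if lock_winner is None:
        # No arbitration is available, so neither group can continue.
        return {name: "halt" for name in groups}
    if lock_winner not in groups:
        raise ValueError("lock_winner must name one of the groups")
    # The winning group takes over the cluster; the other group resets.
    return {
        name: "continue" if name == lock_winner else "reset"
        for name in groups
    }


# The two-node scenario from the text: node 2 is granted the lock.
split = {"A": ["node1"], "B": ["node2"]}
outcome = arbitrate_even_split(split, lock_winner="B")
no_server = arbitrate_even_split(split, lock_winner=None)
```

In this example outcome maps group A to "reset" and group B to "continue",
while no_server maps both groups to "halt", matching the note that the
cluster halts if the quorum server is unavailable when arbitration is
needed.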