Arbitration For Data Integrity in Serviceguard Clusters, July 2007

Cluster Membership Concepts

What is arbitration? Why is it necessary? When and how is it carried
out? To answer these questions, it is necessary to explain a number of
clustering concepts that are central to the processes of cluster formation
and re-formation. These concepts are membership, quorum, split-brain,
and tie-breaking.

Membership

A cluster is a networked collection of nodes. The key to success in
controlling the location of applications in the cluster and ensuring there
is no inappropriate duplication is maintaining a well-defined cluster
node list. When the cluster starts up, all the nodes communicate and
build this membership list, a copy of which is in the memory of every
node. The list is validated continuously as the cluster runs; this is done
by means of heartbeat messages that are transmitted among all the
nodes. As nodes enter and leave the cluster, the list is changed in
memory. Changes in membership can result from an operator’s issuing a
command to run or halt a node, or from system events that cause a node
to halt, reboot, or crash. Some of these events are routine, and some may
be unexpected. There are frequent cases in cluster operation when
cluster membership is changing and when the cluster software must
determine which node in the cluster should run an application.
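
The heartbeat-validated membership list described above can be pictured
with a minimal sketch. This is an illustration only, not Serviceguard's
implementation: the class name MembershipList, its methods, and the
2-second timeout are all assumptions made for the example.

```python
class MembershipList:
    """One node's in-memory view of cluster membership (hypothetical sketch)."""

    def __init__(self, nodes, timeout):
        self.timeout = timeout  # seconds of silence tolerated (assumed value)
        # Time at which a heartbeat was last received from each member.
        self.last_seen = {node: 0.0 for node in nodes}

    def record_heartbeat(self, node, now):
        # A heartbeat from a known member refreshes its timestamp.
        if node in self.last_seen:
            self.last_seen[node] = now

    def join(self, node, now):
        # A node entering the cluster is added to the list in memory.
        self.last_seen[node] = now

    def leave(self, node):
        # A node leaving the cluster is dropped from the list in memory.
        self.last_seen.pop(node, None)

    def members(self):
        return sorted(self.last_seen)

    def silent_nodes(self, now):
        # Members whose last heartbeat is older than the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


view = MembershipList(["node1", "node2", "node3"], timeout=2.0)
for name in ("node1", "node2", "node3"):
    view.record_heartbeat(name, now=1.0)
# node3 stops sending heartbeats; the other two keep going.
view.record_heartbeat("node1", now=3.5)
view.record_heartbeat("node2", now=3.5)
print(view.silent_nodes(now=3.5))  # ['node3']
```

The sketch can only report that node3 has gone silent. It says nothing
about why, which is the difficulty the rest of this section turns to.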

How does the cluster software tell where an application should run? In a
running cluster, when one system cannot communicate with the others
for a significant amount of time, there can be several possible reasons:

1. The node has crashed.
2. The node is experiencing a kernel hang, and processing has stopped.
3. The cluster is partitioned because of a network problem. Either all
   the network cards connecting the node to the rest of the cluster have
   failed, or all the cables connecting the cards to the network have
   failed, or there has been a failure of the network itself.

It is often impossible for the cluster manager software to distinguish (1)
from (2) and (3), and therein lies a problem, because in case (1), it is safe
to restart the application on another node in the cluster, but in (2) and
(3), it is not safe.
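
The ambiguity can be stated compactly in code. The sketch below models
the worst case, in which all three causes look identical to the
surviving nodes. The names Cause, observed_by_peers, and
safe_to_restart_elsewhere are hypothetical and exist only for this
illustration.

```python
from enum import Enum


class Cause(Enum):
    CRASH = 1              # case (1): the node has crashed
    KERNEL_HANG = 2        # case (2): kernel hang, processing stopped
    NETWORK_PARTITION = 3  # case (3): the cluster is partitioned


def observed_by_peers(cause):
    # Assumption for this sketch: the other nodes see the same
    # symptom whatever the underlying cause is.
    return "no heartbeat"


def safe_to_restart_elsewhere(cause):
    # Per the text: restarting the application on another node is
    # safe only when the silent node has really crashed.
    return cause is Cause.CRASH


observations = {observed_by_peers(c) for c in Cause}
verdicts = {safe_to_restart_elsewhere(c) for c in Cause}

# One observation maps to more than one correct answer, so the
# observation alone cannot decide whether a restart is safe.
ambiguous = len(observations) == 1 and len(verdicts) > 1
print(ambiguous)  # True
```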