Arbitration For Data Integrity in Serviceguard Clusters, July 2007

Cluster Membership Concepts

When the cluster is part of a disaster tolerant solution that has nodes located in more than one data center, loss of communication can easily happen unless redundant networking is implemented with different routing for the redundant links.

In all the above cases, the loss of heartbeat communication with other nodes in the cluster causes the re-formation protocol to be carried out. This means that nodes attempt to communicate with one another to rebuild the membership list. In case (1) above, the running nodes choose a coordinator and re-form the cluster with one fewer node. But in case (3), there are two sets of running nodes, and the nodes in each set attempt to communicate with the other nodes in the same set to rebuild the membership list. The result is that the two sets of nodes build different lists for membership in the new cluster. Now, if both sets of nodes were allowed to re-form the cluster, there would be two instances of the same cluster running in two locations. In this situation, the same application could start up in two different places and modify the same data without coordination. This is one way data corruption can occur.
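
The partition scenario can be illustrated with a minimal sketch. It is not Serviceguard's re-formation protocol; it only models which nodes can still exchange heartbeats. The node names, the `links` set, and both function names are hypothetical and chosen for illustration.

```python
def reachable_members(node, nodes, links):
    """Return the set of nodes that `node` can still reach, itself included."""
    seen = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for peer in nodes:
            # A link is modeled as an unordered pair of node names.
            if peer not in seen and frozenset((current, peer)) in links:
                seen.add(peer)
                frontier.append(peer)
    return seen


def proposed_memberships(nodes, links):
    """Collect the distinct membership lists the running nodes would build."""
    lists = {frozenset(reachable_members(n, nodes, links)) for n in nodes}
    return sorted(sorted(members) for members in lists)


nodes = ["node1", "node2", "node3", "node4"]

# Hypothetical two-site layout: the links inside each data center survive,
# but every link between the two data centers has been lost.
links = {frozenset(("node1", "node2")), frozenset(("node3", "node4"))}

views = proposed_memberships(nodes, links)
print(views)  # [['node1', 'node2'], ['node3', 'node4']]
```

Each group ends up with its own membership list that excludes the other group. If both lists were accepted, two instances of the same cluster would exist, which is the situation arbitration is meant to prevent.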

How does Serviceguard handle cases like the above partitioning of the cluster? The process is called arbitration. In the Serviceguard user’s manual, the process is known as tie-breaking, because it is a means to decide on a definitive cluster membership when different competing groups of cluster nodes are independently trying to re-form a cluster.

At cluster startup time, nodes join the cluster, and a tally of the cluster membership is created and maintained in memory on all cluster nodes. Occasionally, changes in membership occur. For example, when the administrator halts a node, the node leaves the cluster, and the cluster membership data in memory is changed accordingly.

When a node crashes, the other nodes become aware of this because no cluster heartbeat is received from that node after the expected interval. Thus, the transmission and receipt of heartbeat messages are essential for keeping the membership data continuously up-to-date.

Why is this membership data important? In Serviceguard, a basic package, containing an application and its data, can only be allowed to run on one node at a time. Therefore, the cluster needs to know which nodes are running in order to tell whether it is appropriate to start a package, and where the package should be started. A package should not be started if it is already running; it should be started on an alternate node if the primary node is down; and so forth.
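
The relationship between heartbeats, membership data, and package placement can be sketched as follows. This is a simplified illustration under stated assumptions, not Serviceguard's implementation: the `Membership` class, the `timeout` value, and the `choose_node` function are all hypothetical, and the sketch deliberately ignores the partition case described earlier, in which a silent node may still be running.

```python
from dataclasses import dataclass, field


@dataclass
class Membership:
    # Hypothetical setting: seconds of silence before a node is treated as down.
    timeout: float
    last_heartbeat: dict = field(default_factory=dict)

    def record_heartbeat(self, node, now):
        """Note the time at which a heartbeat from `node` was received."""
        self.last_heartbeat[node] = now

    def halt(self, node):
        """Administrative halt: the node leaves the membership immediately."""
        self.last_heartbeat.pop(node, None)

    def running_nodes(self, now):
        """Nodes whose most recent heartbeat is within the timeout."""
        return {n for n, t in self.last_heartbeat.items()
                if now - t <= self.timeout}


def choose_node(package_nodes, running_on, members):
    """Pick a node for a package, or None if it must not be started.

    package_nodes: candidate nodes in order of preference, primary first.
    running_on: node currently running the package, or None.
    members: set of nodes currently considered running.
    """
    if running_on in members:
        return None  # already running on a live node; do not start a second copy
    for node in package_nodes:
        if node in members:
            return node  # first preferred node that is up
    return None  # no eligible node is running


m = Membership(timeout=2.0)
m.record_heartbeat("node1", now=0.0)
m.record_heartbeat("node2", now=0.0)
m.record_heartbeat("node2", now=3.0)  # node1 has gone silent

members = m.running_nodes(now=3.5)
target = choose_node(["node1", "node2"], running_on="node1", members=members)
print(members, target)  # {'node2'} node2
```

In this example the primary node stops sending heartbeats, drops out of the membership set, and the package is placed on the alternate node. Had the package been running on a node that is still a member, `choose_node` would return `None`.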