Arbitration For Data Integrity in Serviceguard Clusters, July 2007
Arbitration in Disaster-Tolerant Clusters
with shared data, there is no one disk that is actually connected to both
data centers that could act as a lock disk. Arbitration in this case can be
obtained by using arbitrator nodes or a quorum server.
Arbitrator Nodes
For example, a metropolitan cluster with three nodes in Data Center A
and three nodes in Data Center B could be partitioned such that two
equal-sized groups remain up and running, each trying to re-form. To address
this problem, the supported configurations include one or two arbitrator
nodes located in a third data center. These nodes are configured into the
cluster so that, when combined with either half of an evenly split cluster,
they give that half a majority of the nodes. In other words, if the
metropolitan cluster should lose one data center, the surviving data
center would still remain connected to the arbitrator nodes, so the
surviving group would be larger than 50% of the previously running
nodes in the cluster. It could therefore obtain the quorum and re-form
the cluster.
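
The majority arithmetic described above can be sketched in a few lines of
Python. This is only an illustration of the "larger than 50%" rule stated in
the text; the function name can_reform and the node counts are hypothetical
and are not part of Serviceguard.

```python
def can_reform(surviving: int, previously_running: int) -> bool:
    """Return True if the surviving group is a strict majority (> 50%)
    of the nodes that were running before the failure."""
    return 2 * surviving > previously_running


# Hypothetical layout: 3 nodes in Data Center A, 3 in Data Center B.
nodes_a, nodes_b = 3, 3

# Without arbitrator nodes, losing one data center leaves 3 of 6 nodes,
# which is exactly 50% and therefore not a majority.
without_arbitrator = can_reform(nodes_a, nodes_a + nodes_b)

# With one arbitrator node in Data Center C, the surviving data center
# plus the arbitrator is 4 of 7 nodes, which is a majority.
arbitrators = 1
with_arbitrator = can_reform(nodes_a + arbitrators,
                             nodes_a + nodes_b + arbitrators)

print(without_arbitrator, with_arbitrator)
```

The same check holds with two arbitrator nodes (5 of 8), which is why either
one or two such nodes is enough to break the tie.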
Note that in a metropolitan cluster, arbitration is provided simply by the
existence of the node(s) in the third data center, combined with the
requirement that the configuration have an equal number of nodes in
Data Center A and Data Center B. The arbitrator nodes located in Data
Center C may do useful work, but they are not attached to the storage
devices used by the main nodes in the cluster. They are fully configured
as cluster nodes, but their main job is to provide arbitration.
Quorum Server
With the advent of the quorum server, another MetroCluster
configuration is now possible. A quorum server process, located in a third
data center, can be used for arbitration. The third data center is needed,
as it was in the case of arbitrator nodes, to provide the appropriate
degree of disaster tolerance. That is, the quorum server can arbitrate cluster
re-formation if either of the other two sites is entirely destroyed.
One advantage of the quorum server is that additional cluster nodes do
not have to be configured for arbitration.
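
As a rough illustration of how a quorum server replaces arbitrator nodes, the
sketch below models it as a tie-breaker that is consulted only when the
surviving group holds exactly half of the previously running nodes. This is
a simplified model under stated assumptions, not the actual quorum server
protocol: the names Quorum, classify, and may_reform are hypothetical, and
the quorum server's decision is reduced to a single boolean.

```python
from enum import Enum


class Quorum(Enum):
    MAJORITY = "majority"  # more than 50%: can re-form without arbitration
    TIE = "tie"            # exactly 50%: needs an external tie-breaker
    MINORITY = "minority"  # less than 50%: cannot re-form


def classify(surviving: int, previously_running: int) -> Quorum:
    """Classify a surviving group against the previously running node count."""
    if 2 * surviving > previously_running:
        return Quorum.MAJORITY
    if 2 * surviving == previously_running:
        return Quorum.TIE
    return Quorum.MINORITY


def may_reform(surviving: int, previously_running: int,
               quorum_server_grants: bool) -> bool:
    """Assumption: the quorum server is consulted only to break an exact tie."""
    status = classify(surviving, previously_running)
    if status is Quorum.MAJORITY:
        return True
    if status is Quorum.TIE:
        return quorum_server_grants
    return False


# Hypothetical 3 + 3 cluster with no arbitrator nodes: one site is destroyed,
# and the surviving 3 of 6 nodes can still reach the quorum server in the
# third data center.
print(may_reform(3, 6, quorum_server_grants=True))
```

In this model no extra cluster nodes are counted: the six-node cluster stays
six nodes, and the third data center contributes only the tie-breaking
decision.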