Managing Serviceguard Eighteenth Edition, September 2010

NOTE:
For most clusters that use an LVM cluster
lock or lock LUN, a minimum
MEMBER_TIMEOUT of 14 seconds is
appropriate.
For most clusters that use a
MEMBER_TIMEOUT value lower than 14
seconds, a quorum server is more
appropriate than a lock disk or lock LUN.
The cluster will fail if the time it takes to
acquire the disk lock exceeds 0.2 times the
MEMBER_TIMEOUT. This means that if you
use a disk-based quorum device (lock disk
or lock LUN), you must be certain that the
nodes in the cluster, the connection to the
disk, and the disk itself can respond quickly
enough to perform 10 disk writes within 0.2
times the MEMBER_TIMEOUT.
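As a rough illustration of the arithmetic in the note above, the following Python sketch computes the time allowed for acquiring the disk lock and the average time available per disk write. It is not part of Serviceguard; the function and constant names are hypothetical. Splitting the budget evenly across the 10 writes is a simplifying assumption, and the 14-second threshold reflects the "most clusters" guidance rather than a hard rule.

```python
# Hypothetical helper for reasoning about the disk-lock timing budget.
# The constants come from the note above; none of these names are
# Serviceguard commands or parameters (other than MEMBER_TIMEOUT itself).

LOCK_BUDGET_FRACTION = 0.2       # lock must be acquired within 0.2 x MEMBER_TIMEOUT
LOCK_DISK_WRITES = 10            # disk writes that must complete in that window
DISK_LOCK_GUIDELINE_S = 14.0     # suggested minimum MEMBER_TIMEOUT for disk-based locks


def lock_budget_seconds(member_timeout_s: float) -> float:
    """Return the time allowed for acquiring the disk lock, in seconds."""
    return LOCK_BUDGET_FRACTION * member_timeout_s


def per_write_budget_seconds(member_timeout_s: float) -> float:
    """Return the average time available per disk write.

    Assumes the budget is divided evenly across all writes, which is a
    simplification made for this sketch.
    """
    return lock_budget_seconds(member_timeout_s) / LOCK_DISK_WRITES


def suggested_quorum_device(member_timeout_s: float) -> str:
    """Apply the 'most clusters' guidance from the note; not a hard rule."""
    if member_timeout_s < DISK_LOCK_GUIDELINE_S:
        return "quorum server"
    return "lock disk or lock LUN"


if __name__ == "__main__":
    for timeout_s in (3.0, 14.0):
        print(
            f"MEMBER_TIMEOUT={timeout_s:.0f}s: "
            f"lock budget {lock_budget_seconds(timeout_s):.2f}s, "
            f"about {per_write_budget_seconds(timeout_s) * 1000:.0f} ms per write, "
            f"suggested: {suggested_quorum_device(timeout_s)}"
        )
```

With a MEMBER_TIMEOUT of 14 seconds, the lock must be acquired within 2.8 seconds, or roughly 280 milliseconds per write on average. At 3 seconds the window shrinks to 0.6 seconds, which helps explain why a quorum server is usually the better choice at low values.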
With the lowest supported MEMBER_TIMEOUT value
of 3 seconds, a failover time of 4 to 5 seconds
can be achieved.
NOTE: The failover estimates provided here
apply to the Serviceguard component of failover;
that is, the package is expected to be up and
running on the adoptive node in this time, but
the application that the package runs may take
more time to start.
Keep the following guidelines in mind when
deciding how to set the MEMBER_TIMEOUT value.
Guidelines: You need to decide whether it's more
important for your installation to have fewer (but
slower) cluster re-formations, or faster (but
possibly more frequent) re-formations:
To ensure the fastest cluster re-formations,
use the minimum value applicable to your
cluster. But keep in mind that this setting
will lead to a cluster re-formation, and to the
node being removed from the cluster and
rebooted, if a system hang or network load
160 Planning and Documenting an HA Cluster