Managing Serviceguard 13th Edition, February 2007

Planning and Documenting an HA Cluster
Cluster Configuration Planning
Chapter 4 161
The maximum recommended value is 30,000,000
microseconds (30 seconds).
Remember that a cluster reformation may result in a
system halt (TOC) on one of the cluster nodes. For
further discussion, see “What Happens when a Node
Times Out” on page 129.
There are more complex cases that require you to make
a trade-off between fewer failovers and faster failovers.
For example, a network event such as a broadcast
storm may cause kernel interrupts to be turned off on
some or all nodes while the packets are being
processed, preventing the nodes from sending and
processing heartbeat messages. This in turn could
prevent the kernel’s safety timer from being reset,
causing the node to halt (TOC). (See “Cluster Daemon:
cmcld” on page 60 for more information about the
safety timer.)
AUTO_START_TIMEOUT
The amount of time a node waits before it stops trying
to join a cluster during automatic cluster startup. In
the ASCII cluster configuration file, this parameter is
AUTO_START_TIMEOUT. All nodes wait this amount of
time for other nodes to begin startup before the cluster
completes the operation. The time should be selected
based on the slowest boot time in the cluster. Enter a
value equal to the boot time of the slowest booting node
minus the boot time of the fastest booting node plus
600 seconds (ten minutes).
Default is 600,000,000 microseconds in the ASCII file
(600 seconds in Serviceguard Manager).
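The sizing rule above (slowest boot time minus fastest boot time, plus
600 seconds) can be sketched as a small calculation. This is only an
illustration: the function name and the sample boot times are
hypothetical, and you would need to measure the boot times on your own
nodes. The result is expressed in microseconds, the unit used in the
ASCII file.

```python
MICROSECONDS_PER_SECOND = 1_000_000
STARTUP_MARGIN_SECONDS = 600  # the ten-minute margin given in the text


def auto_start_timeout_us(boot_times_seconds):
    """Return a suggested AUTO_START_TIMEOUT in microseconds.

    boot_times_seconds: measured boot time of each node, in whole
    seconds (hypothetical input; measure these on your own cluster).
    """
    if not boot_times_seconds:
        raise ValueError("need at least one boot time")
    # Difference between the slowest and the fastest booting node.
    spread = max(boot_times_seconds) - min(boot_times_seconds)
    return (spread + STARTUP_MARGIN_SECONDS) * MICROSECONDS_PER_SECOND


# Example with made-up boot times of 180, 240 and 420 seconds.
timeout = auto_start_timeout_us([180, 240, 420])
print(f"AUTO_START_TIMEOUT {timeout}")
```

With a single node, or with nodes that all boot in the same time, the
spread is zero and the result equals the 600,000,000 microsecond
default.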
NETWORK_POLLING_INTERVAL
The interval at which the networks configured for
Serviceguard are checked. In the ASCII cluster
configuration file, this parameter is
NETWORK_POLLING_INTERVAL.
Default is 2,000,000 microseconds in the ASCII file (2
seconds in Serviceguard Manager). Thus every 2
seconds, the network manager polls each network