Managing Serviceguard 11th Edition, Version A.11.16, Second Printing June 2004

Building an HA Cluster Configuration
Configuring the Cluster
Chapter 5
NOTE: Remember to tune HP-UX kernel parameters on each node to ensure
that they are set high enough for the largest number of packages that
will ever run concurrently on that node.
Modifying Cluster Timing Parameters
The cmquerycl command supplies default cluster timing parameters for
HEARTBEAT_INTERVAL and NODE_TIMEOUT. Changing these parameters
will directly impact the cluster’s reformation and failover times. It is
useful to modify these parameters if the cluster is reforming occasionally
due to heavy system load or heavy network traffic.
The default value of 2 seconds for NODE_TIMEOUT leads to a best-case
failover time of 30 seconds. If NODE_TIMEOUT is changed to 10 seconds,
which means that the cluster manager waits 5 times longer to time out a
node, the failover time increases by a factor of 5, to approximately
150 seconds.
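The arithmetic above can be sketched as a short calculation. This is only
an illustration of the figures quoted in this section: it assumes that
failover time scales linearly with NODE_TIMEOUT from the 2-second,
30-second baseline, and the function name estimate_failover_seconds is
hypothetical, not part of Serviceguard.

```python
# Baseline figures quoted in the text: a 2-second NODE_TIMEOUT gives a
# best-case failover time of about 30 seconds.
DEFAULT_NODE_TIMEOUT_S = 2.0
DEFAULT_BEST_CASE_FAILOVER_S = 30.0


def estimate_failover_seconds(node_timeout_s: float) -> float:
    """Rough best-case failover estimate.

    Assumption (for illustration only): failover time grows in direct
    proportion to NODE_TIMEOUT, as the 2 s -> 10 s example suggests.
    """
    if node_timeout_s <= 0:
        raise ValueError("NODE_TIMEOUT must be positive")
    scale = node_timeout_s / DEFAULT_NODE_TIMEOUT_S
    return DEFAULT_BEST_CASE_FAILOVER_S * scale


if __name__ == "__main__":
    for timeout in (2.0, 10.0):
        estimate = estimate_failover_seconds(timeout)
        print(f"NODE_TIMEOUT={timeout:g}s -> about {estimate:g}s failover")
```

Real failover times also depend on factors outside this estimate, such as
how long packages and applications take to shut down and restart.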
NODE_TIMEOUT must be at least 2*HEARTBEAT_INTERVAL. A good rule of
thumb is to have at least two or three heartbeats within one
NODE_TIMEOUT.
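The constraint and the rule of thumb above can be checked before a
configuration is applied. The following is a minimal sketch, not a
Serviceguard tool: the function name check_timing is hypothetical, and both
values are taken in seconds for readability (the cluster configuration file
may express them in different units, so check the template comments).

```python
from typing import Tuple


def check_timing(heartbeat_interval_s: float,
                 node_timeout_s: float) -> Tuple[bool, float]:
    """Return (ok, heartbeats_per_timeout).

    ok is True when NODE_TIMEOUT >= 2 * HEARTBEAT_INTERVAL, the minimum
    stated in the text. heartbeats_per_timeout is how many heartbeat
    intervals fit within one NODE_TIMEOUT.
    """
    if heartbeat_interval_s <= 0 or node_timeout_s <= 0:
        raise ValueError("both timing values must be positive")
    heartbeats_per_timeout = node_timeout_s / heartbeat_interval_s
    ok = node_timeout_s >= 2 * heartbeat_interval_s
    return ok, heartbeats_per_timeout


if __name__ == "__main__":
    # Illustrative values only, in seconds.
    for hb, nt in ((1.0, 2.0), (1.0, 1.5), (2.0, 10.0)):
        ok, ratio = check_timing(hb, nt)
        status = "ok" if ok else "too short"
        print(f"HEARTBEAT_INTERVAL={hb:g}s NODE_TIMEOUT={nt:g}s: "
              f"{ratio:g} heartbeats per timeout ({status})")
```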
Identifying Serial Heartbeat Connections
If you are using a serial (RS232) line as a heartbeat connection, use the
SERIAL_DEVICE_FILE parameter and enter the device file name that
corresponds to the serial port you are using on each node. Be sure that
the serial cable is securely attached during and after configuration.
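Because the device file must be entered for each node, it can help to
confirm that no node was skipped. The sketch below is an illustration
under stated assumptions, not a Serviceguard utility: it assumes that each
node's entries follow a NODE_NAME line and that "#" starts a comment, and
the node names, device path, and function name are hypothetical.

```python
from typing import Dict, List


def nodes_missing_serial_device(config_text: str) -> List[str]:
    """Return the names of nodes that have no SERIAL_DEVICE_FILE entry."""
    has_serial: Dict[str, bool] = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if fields[0] == "NODE_NAME" and len(fields) > 1:
            # Start of a new node section (assumed layout).
            current = fields[1]
            has_serial[current] = False
        elif (fields[0] == "SERIAL_DEVICE_FILE" and len(fields) > 1
              and current is not None):
            has_serial[current] = True
    return [name for name, found in has_serial.items() if not found]


# Hypothetical sample; real node names and device files will differ.
SAMPLE = """\
NODE_NAME node1
  SERIAL_DEVICE_FILE /dev/tty0p0
NODE_NAME node2
  # serial line not yet configured
"""

if __name__ == "__main__":
    print(nodes_missing_serial_device(SAMPLE))
```

With the sample above, only node2 is reported, since its section contains
a comment but no SERIAL_DEVICE_FILE entry.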
Optimization
Serviceguard Extension for Faster Failover (SGeFF) is a separately
purchased product. If it is installed, the cluster configuration file
includes the parameter that enables it.
SGeFF reduces the time it takes Serviceguard to process a failover. It
cannot, however, change the time it takes for packages and applications
to gracefully shut down and restart.
SGeFF has requirements for cluster configuration, as outlined in the
cluster configuration template file.
For more information, see the Serviceguard Extension for Faster
Failover Release Notes posted on http://www.docs.hp.com/hpux/ha.