Optimizing Failover Time in a Serviceguard Environment, June 2007
Requirements for SGeFF
Every SGeFF cluster must meet these configuration requirements:
• Serviceguard A.11.16 and SGeFF must be installed on every node in the cluster.
• The cluster must have no more than two nodes.*
• The cluster must have a quorum server cluster lock installed outside the cluster.
• More than one heartbeat network must be configured.
• An RS-232 link is not allowed for a heartbeat network.
• VERITAS CVM and VERITAS CFS cannot be used in an SGeFF cluster.
Note that in an SGeFF cluster, the lock acquisition time is set to 1 x NODE_TIMEOUT.
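The requirements above can be sketched as a pre-configuration checklist. This is a minimal illustration, not a Serviceguard tool: the ClusterPlan structure, its field names, and the string labels ("lan", "rs232", "quorum_server") are hypothetical, and the check assumes the required version is exactly the A.11.16 named in the list. It cannot confirm that the quorum server actually runs outside the cluster; that still has to be verified by hand.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned cluster (not a Serviceguard format)."""
    serviceguard_versions: Dict[str, str]  # node name -> installed Serviceguard version
    sgeff_installed: Dict[str, bool]       # node name -> whether SGeFF is installed
    heartbeat_links: List[str]             # one entry per heartbeat network: "lan" or "rs232"
    cluster_lock: str                      # e.g. "quorum_server", "lock_disk", "none"
    uses_veritas_cvm_or_cfs: bool


def check_sgeff_requirements(plan: ClusterPlan,
                             required_version: str = "A.11.16") -> List[str]:
    """Return a list of violated requirements; an empty list means none were found."""
    problems = []
    nodes = sorted(plan.serviceguard_versions)

    # No more than two nodes (a one-node cluster is allowed for initial setup).
    if len(nodes) > 2:
        problems.append("cluster has more than two nodes")

    # Serviceguard and SGeFF must be installed on every node.
    for node in nodes:
        if plan.serviceguard_versions[node] != required_version:
            problems.append(f"{node}: Serviceguard {required_version} is not installed")
        if not plan.sgeff_installed.get(node, False):
            problems.append(f"{node}: SGeFF is not installed")

    # The cluster lock must be a quorum server.
    if plan.cluster_lock != "quorum_server":
        problems.append("cluster lock is not a quorum server")

    # More than one heartbeat network, and none of them RS-232.
    if len(plan.heartbeat_links) < 2:
        problems.append("fewer than two heartbeat networks")
    if "rs232" in plan.heartbeat_links:
        problems.append("RS-232 heartbeat link is not allowed")

    # VERITAS CVM and CFS are excluded.
    if plan.uses_veritas_cvm_or_cfs:
        problems.append("VERITAS CVM/CFS cannot be used")

    return problems


# Example plan with hypothetical node names.
good = ClusterPlan(
    serviceguard_versions={"node1": "A.11.16", "node2": "A.11.16"},
    sgeff_installed={"node1": True, "node2": True},
    heartbeat_links=["lan", "lan"],
    cluster_lock="quorum_server",
    uses_veritas_cvm_or_cfs=False,
)
print(check_sgeff_requirements(good))
```

Returning a list of messages rather than a single pass/fail value lets every violated requirement be reported in one run.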
If Fibre Channel storage is used with SGeFF, it is recommended that one or two switches be
configured between nodes and storage. This reduces the time I/O might spend in the fabric, which
becomes critical if a node takes over an application from a failed node that was accessing a Fibre
Channel disk.
An SGeFF cluster cannot expand online to a non-SGeFF cluster of more than two nodes. SGeFF can
only be enabled or disabled when the cluster is halted.
If SGeFF is used with extended-distance clusters, use the “two data centers and third location”
configuration described in “Designing Disaster Tolerant High Availability Clusters”, available from
www.docs.hp.com/hpux/ha –> Metrocluster or Continentalcluster.
Environments suitable to SGeFF
Systems that have moderate and predictable loads are candidates for SGeFF. Suitable systems should
be free from frequent or large spikes in CPU, network, or I/O activity that would affect the normal
operation of Serviceguard.
SGeFF reduces only the Serviceguard component of failover time; application-dependent failover time
is unaffected. As noted in “How you can optimize failover time”, both the Serviceguard component
and the application-dependent component need to be reduced. If the application takes a long time
(say, 5 minutes) to start up and complete recovery, SGeFF does not reduce total failover time
significantly.
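A short calculation illustrates why. This sketch treats total failover time as the sum of the Serviceguard component and the application-dependent component. The Serviceguard figures (30 seconds without SGeFF, 5 seconds with it) are hypothetical placeholders chosen for illustration, not measured or documented values; only the 5-minute application example comes from the text.

```python
def relative_reduction(sg_before: float, sg_after: float, app_time: float) -> float:
    """Fraction by which total failover time shrinks when only the
    Serviceguard component changes; app_time stays the same."""
    total_before = sg_before + app_time
    total_after = sg_after + app_time
    if total_before <= 0:
        raise ValueError("total failover time must be positive")
    return (total_before - total_after) / total_before


# Hypothetical Serviceguard components, in seconds (assumed values).
SG_WITHOUT_SGEFF = 30.0
SG_WITH_SGEFF = 5.0

# Application that needs 5 minutes to start and recover.
slow_app = relative_reduction(SG_WITHOUT_SGEFF, SG_WITH_SGEFF, app_time=300.0)

# Application that recovers in 10 seconds.
fast_app = relative_reduction(SG_WITHOUT_SGEFF, SG_WITH_SGEFF, app_time=10.0)

print(f"slow application: total failover time reduced by {slow_app:.1%}")
print(f"fast application: total failover time reduced by {fast_app:.1%}")
```

Under these assumed numbers the same saving in the Serviceguard component is a small fraction of the total for the slow application and a large fraction for the fast one.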
SGeFF is well-suited for:
• Applications that have little or no startup time
• Applications that have very rapid recovery time
• Applications that have few or quickly recoverable resources (for example, packages with
fewer volume groups, fewer file systems, or VxFS for faster recovery)
Here are examples of environments well-suited for SGeFF:
• An SGeRAC environment where
– The Oracle RAC instances are already running on all of the nodes, so there is no application
startup time
– The Oracle RAC can be tuned for rapid recovery, often in 10–60 seconds
• An application with conventional database software tuned to minimize recovery time
• Any IP-based application with low recovery time
* SGeFF clusters must have two nodes. Configuring a one-node cluster is allowed and may be useful to begin configuration; however, no failover
is possible with only one node.