HP-UX HB v13.00 Ch-15 - Serviceguard
HP-UX Handbook – Rev 13.00 Page 13 (of 108)
Chapter 15 Serviceguard
October 29, 2013
Serviceguard uses TCP/IP network services for reliable inter-node communication, including the
transmission of heartbeat messages: periodic signals from each functioning node that are
central to the operation of the cluster. TCP/IP services are also used for other types of inter-node
communication. The network hardware should include redundant LAN interfaces on each node
to provide redundant cluster heartbeat paths and thus increase cluster availability.
Summary of the key features of a hardware cluster configuration:
Redundant networking connectivity (standby LAN NIC or APA for each business IP).
Redundant networking infrastructure (e.g. one redundant network using two switches and
one dedicated network for inter-node communication).
Redundant mass storage access (more than one SCSI or FC interface).
Redundant mass storage infrastructure for data protection
(e.g. using hardware features of disk arrays (RAID) or software solutions like
MirrorDisk/UX or Veritas Volume Manager).
Redundant power supply (using UPS, more than one power circuit).
Additionally, use the Event Monitoring Service (EMS), which lets you monitor and detect
failures that are not directly handled by Serviceguard.
Summary of the key features of a package configuration:
Each package has exclusive access to system features such as LVM volume groups,
VxVM disk groups, and one or more ‘relocatable’ IP addresses.
Each package starts a business application and monitors its services. Any monitored service
failure triggers a package halt and failover.
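The package behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Serviceguard code: the Package class, the handle_service_failure function, and the node and service names are all hypothetical, and a real package would activate volume groups and relocatable IP addresses where this sketch merely records the owning node.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Package:
    """Hypothetical model of a failover package (not a Serviceguard API)."""
    name: str
    candidate_nodes: List[str]      # nodes allowed to run the package, in order
    services: Dict[str, bool]       # monitored service name -> healthy?
    running_on: Optional[str] = None

    def start(self, node: str) -> None:
        # A real package would activate its exclusive resources here
        # (volume groups, relocatable IPs) before starting the application.
        self.services = {svc: True for svc in self.services}
        self.running_on = node

    def halt(self) -> None:
        # Release the package so that no node owns it.
        self.running_on = None


def handle_service_failure(pkg: Package, failed_service: str,
                           up_nodes: List[str]) -> Optional[str]:
    """Halt the package and restart it on the next available candidate node.

    Returns the new owner, or None if no other candidate node is up.
    """
    pkg.services[failed_service] = False
    failed_node = pkg.running_on
    pkg.halt()
    for node in pkg.candidate_nodes:
        if node != failed_node and node in up_nodes:
            pkg.start(node)
            return node
    return None
```

With a package that may run on node1 or node2 and is currently started on node1, a failure of its monitored service moves it to node2. If node2 is not up, the package stays halted.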
For more examples of HA Hardware and Software Configurations see Managing Serviceguard:
http://www.hp.com/go/hpux-serviceguard-docs.
Quorum Rules and Cluster Arbitration Device – Split-brain prevention
To ensure that packages always have owners, Serviceguard periodically transmits UDP heartbeat
messages among all nodes. If a node fails to transmit a heartbeat within the NODE_TIMEOUT (A.11.18
and earlier) or MEMBER_TIMEOUT (A.11.19 and later) window defined in the cluster
configuration, the cluster uses the quorum rules to re-form and find owners for the
packages that were running on the missing node. To prevent Serviceguard from
starting a failover package on a second node when all heartbeat traffic fails, a cluster arbitration
device is required in a 2-node cluster and recommended in clusters of 3 to 16 nodes. The arbitration
function can take the form of a cluster lock VG, a lock LUN, or a quorum server (on a system
outside of the cluster).
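The following sketch shows why a 2-node cluster needs an arbitration device. It assumes the commonly documented majority rule, which is not spelled out in the text above: a group holding more than half of the previous members re-forms, a group holding exactly half re-forms only if it wins the arbitration device, and a smaller group halts. The function name and return values are hypothetical and serve only as illustration.

```python
def reformation_outcome(previous_members: int, survivors: int,
                        wins_arbitration: bool = False) -> str:
    """Decide whether a group of surviving nodes may re-form the cluster.

    Assumed rule (illustrative only):
      - more than 50% of the previous members: re-form
      - exactly 50%: re-form only if this group wins the arbitration device
        (cluster lock VG, lock LUN, or quorum server)
      - less than 50%: halt
    """
    if previous_members <= 0 or not 0 <= survivors <= previous_members:
        raise ValueError("survivors must be between 0 and previous_members")
    if 2 * survivors > previous_members:
        return "reform"
    if 2 * survivors == previous_members and wins_arbitration:
        return "reform"
    return "halt"
```

In a 2-node cluster that loses all heartbeat traffic, each node sees exactly half of the previous membership, so reformation_outcome(2, 1, wins_arbitration=True) yields "reform" for the node that obtains the lock and reformation_outcome(2, 1) yields "halt" for the other. Without an arbitration device neither node could safely continue, which is how split-brain is avoided.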