Managing Serviceguard Eighteenth Edition, September 2010

Heartbeat Messages
Central to the operation of the cluster manager is the sending and receiving of heartbeat
messages among the nodes in the cluster. Each node in the cluster exchanges UDP
heartbeat messages with every other node over each monitored IP network configured
as a heartbeat device. (LAN monitoring is discussed later, in the section “Monitoring
LAN Interfaces and Detecting Failure: Link Level” (page 92).)
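As a rough illustration of the exchange pattern described above, the following Python sketch enumerates the one-way heartbeat flows in a small cluster. The node names, network names, and the heartbeat_flows function are hypothetical and are not part of Serviceguard; the sketch only counts the flows implied by "every node to every other node over each heartbeat network."

```python
from itertools import permutations


def heartbeat_flows(nodes, heartbeat_networks):
    """Return every (sender, receiver, network) heartbeat flow.

    Each node sends to every other node over each heartbeat network.
    """
    return [
        (sender, receiver, network)
        for network in heartbeat_networks
        for sender, receiver in permutations(nodes, 2)
    ]


# Hypothetical three-node cluster with two heartbeat networks.
flows = heartbeat_flows(["node1", "node2", "node3"], ["lan0", "lan1"])
print(len(flows))  # 3 senders x 2 peers each x 2 networks = 12
```

The number of flows grows as n x (n - 1) x k for n nodes and k heartbeat networks, which is one reason heartbeat traffic is kept small.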
If a cluster node does not receive heartbeat messages from all other cluster nodes within
the prescribed time, a cluster re-formation is initiated; see “What Happens when a
Node Times Out” (page 117). At the end of the re-formation, information about the
new cluster membership is passed to the package coordinator (described further in
this chapter, in “How the Package Manager Works” (page 67)). Failover packages that
were running on nodes that are no longer in the new cluster are transferred to their
adoptive nodes.
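The sequence above (missed heartbeats, re-formation, failover to adoptive nodes) can be pictured with a minimal Python sketch. This is not Serviceguard's implementation: the HeartbeatMonitor class, the failover_plan function, the data layout, and the 14-second timeout are all assumptions chosen for illustration.

```python
class HeartbeatMonitor:
    """Track when each peer node was last heard from (illustrative only)."""

    def __init__(self, peers, timeout_seconds, start_time=0.0):
        self.timeout_seconds = timeout_seconds
        # Treat every peer as heard from at start-up.
        self.last_heard = {peer: start_time for peer in peers}

    def record_heartbeat(self, peer, now):
        self.last_heard[peer] = now

    def timed_out_peers(self, now):
        """Return peers whose last heartbeat is older than the timeout."""
        return sorted(
            peer
            for peer, heard in self.last_heard.items()
            if now - heard > self.timeout_seconds
        )

    def needs_reformation(self, now):
        return bool(self.timed_out_peers(now))


def failover_plan(packages, surviving_nodes):
    """Map each package on a lost node to its first surviving adoptive node.

    A package with no surviving adoptive node maps to None.
    """
    plan = {}
    for name, info in packages.items():
        if info["running_on"] in surviving_nodes:
            continue  # Package stays where it is.
        candidates = [n for n in info["adoptive_nodes"] if n in surviving_nodes]
        plan[name] = candidates[0] if candidates else None
    return plan


# Hypothetical view from node1; the timeout value is arbitrary.
monitor = HeartbeatMonitor(["node2", "node3"], timeout_seconds=14.0)
monitor.record_heartbeat("node2", now=10.0)
monitor.record_heartbeat("node3", now=2.0)
lost = monitor.timed_out_peers(now=17.0)

packages = {
    "pkg1": {"running_on": "node3", "adoptive_nodes": ["node1", "node2"]},
    "pkg2": {"running_on": "node1", "adoptive_nodes": ["node2"]},
}
plan = failover_plan(packages, surviving_nodes={"node1", "node2"})
print(lost, plan)
```

In this example node3 has been silent for 15 seconds, so it is reported as timed out, and pkg1 is assigned to node1, the first adoptive node still in the cluster.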
If heartbeat and data are sent over the same LAN subnet, data congestion may cause
Serviceguard to miss heartbeats and initiate a cluster re-formation that would not
otherwise have been needed. For this reason, HP recommends that you dedicate a LAN
to the heartbeat and also configure heartbeat over the data network.
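The benefit of carrying heartbeat on more than one network can be sketched in Python as follows. The sketch assumes that a heartbeat arriving on any one network is enough to show that the peer is alive. The function names, network names, and timing values are hypothetical and do not come from Serviceguard.

```python
def last_heard_on_any_network(arrivals):
    """Return the most recent heartbeat time across all networks.

    arrivals maps a network name to the time of the latest heartbeat
    received on it, or None if nothing has been received there.
    """
    times = [t for t in arrivals.values() if t is not None]
    return max(times) if times else None


def peer_is_alive(arrivals, now, timeout_seconds):
    """A peer counts as alive if any network delivered a recent heartbeat."""
    latest = last_heard_on_any_network(arrivals)
    return latest is not None and now - latest <= timeout_seconds


# Congestion has delayed heartbeats on the data LAN, but the dedicated
# heartbeat LAN is still delivering them.
both = {"data_lan": 1.0, "heartbeat_lan": 19.5}
data_only = {"data_lan": 1.0}

print(peer_is_alive(both, now=20.0, timeout_seconds=14.0))
print(peer_is_alive(data_only, now=20.0, timeout_seconds=14.0))
```

With both networks configured the peer is still considered alive; with only the congested data LAN, the same delay would look like a node failure.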
NOTE: You can no longer run the heartbeat on a serial (RS232) line or an FDDI or
Token Ring network.
Each node sends its heartbeat message at a rate that Serviceguard calculates from
the value of the MEMBER_TIMEOUT parameter. You set this parameter in the cluster
configuration file, which you create as part of cluster configuration.