Managing Serviceguard Nineteenth Edition, Reprinted June 2011

Monitoring LAN Interfaces and Detecting Failure: Link Level
At regular intervals, determined by the NETWORK_POLLING_INTERVAL parameter (see “Cluster Configuration
Parameters” (page 105)), Serviceguard polls all the network interface cards specified in the cluster
configuration file. Network failures are detected within each single node in the following manner.
One interface on the node is assigned to be the poller. The poller will poll the other primary and
standby interfaces in the same bridged net on that node to see whether they are still healthy.
Normally, the poller is a standby interface; if there are no standby interfaces in a bridged net, the
primary interface is assigned the polling task. (Bridged nets are explained under “Redundant
Network Components” (page 27) in Chapter 2.)
The polling interface sends LAN packets to all other interfaces in the node that are on the same
bridged net and receives packets back from them.
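The poller selection described above can be sketched as follows. This is an illustrative model only, not Serviceguard code: the Interface class and the choose_poller and poll_targets functions are hypothetical names, and picking the first standby when several exist is an assumption, since the text does not say which standby is chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """Hypothetical model of one LAN interface on a node."""
    name: str
    role: str  # "primary" or "standby"


def choose_poller(bridged_net):
    """Pick the polling interface for one node's bridged net.

    A standby interface is preferred; if the bridged net has no
    standby, the primary interface takes the polling task.
    Assumption: the first matching interface in the list is used.
    """
    standbys = [i for i in bridged_net if i.role == "standby"]
    if standbys:
        return standbys[0]
    primaries = [i for i in bridged_net if i.role == "primary"]
    if not primaries:
        raise ValueError("bridged net has no interfaces to poll from")
    return primaries[0]


def poll_targets(bridged_net):
    """Return every other interface on the same bridged net and node."""
    poller = choose_poller(bridged_net)
    return [i for i in bridged_net if i != poller]
```

For a bridged net with primary lan0 and standby lan1 (example names), choose_poller returns lan1 and poll_targets returns lan0 only.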
Whenever a LAN driver reports an error, Serviceguard immediately declares that the card is bad
and performs a local switch, if applicable. For example, when the card fails to send, Serviceguard
will immediately receive an error notification and it will mark the card as down. See “Reporting
Link-Level and IP-Level Failures” (page 73).
Serviceguard Network Manager also looks at the numerical counts of packets sent and received
on an interface to determine if a card is having a problem. There are two ways Serviceguard can
handle the counts of packets sent and received. In the cluster configuration file, choose one of the
following values for the NETWORK_FAILURE_DETECTION parameter:
NOTE: For a full discussion, see the white paper Serviceguard Network Manager: Inbound
Failure Detection Enhancement at http://www.hp.com/go/hpux-serviceguard-docs.
• INOUT: When both the inbound and outbound counts stop incrementing for a certain amount
  of time, Serviceguard will declare the card as bad. (Serviceguard calculates the time depending
  on the type of LAN card.) Serviceguard will not declare the card as bad if only the inbound
  or only the outbound count stops incrementing. Both must stop. This is the default.
• INONLY_OR_INOUT: This option will also declare the card as bad if both inbound and
  outbound counts stop incrementing. However, it will also declare it as bad if only the inbound
  count stops.
  This option is not suitable for all environments. Before choosing it, be sure these conditions
  are met:
    • All bridged nets in the cluster should have more than two interfaces each.
    • Each primary interface should have at least one standby interface, and it should be
      connected to a standby switch.
    • The primary switch should be directly connected to its standby.
    • There should be no single point of failure anywhere on any bridged net.
NOTE: You can change the value of the NETWORK_FAILURE_DETECTION parameter while the
cluster is up and running.
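The difference between the two NETWORK_FAILURE_DETECTION values can be illustrated with a minimal sketch. It is not Serviceguard's implementation: the function names stalled and card_is_bad are hypothetical, and the observation window is passed in as a list of counter samples because the text says only that Serviceguard calculates the time from the LAN card type.

```python
def stalled(samples):
    """True if a packet counter did not increment over the sampled window.

    `samples` is a list of counter readings taken over the window,
    oldest first. Counters only grow, so equal first and last
    readings mean no packets were counted in between.
    """
    return len(samples) >= 2 and samples[-1] == samples[0]


def card_is_bad(mode, inbound_stalled, outbound_stalled):
    """Apply the rule for the given NETWORK_FAILURE_DETECTION value."""
    if mode == "INOUT":
        # Both counts must stop; one stalled count alone is not enough.
        return inbound_stalled and outbound_stalled
    if mode == "INONLY_OR_INOUT":
        # Bad if both stop, or if only the inbound count stops.
        # Either way the inbound count has stopped, so that is the test.
        return inbound_stalled
    raise ValueError(f"unknown NETWORK_FAILURE_DETECTION value: {mode}")
```

With the inbound count stalled and the outbound count still incrementing, card_is_bad returns False for INOUT and True for INONLY_OR_INOUT. A stalled outbound count alone does not mark the card as bad under either value.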
Local Switching
A local network switch involves the detection of a local network interface failure and a failover to
the local backup LAN card (also known as the standby LAN card). The backup LAN card must not
have any IP addresses configured.
In the case of a local network switch, TCP/IP connections are not lost for Ethernet, but IEEE 802.3
connections will be lost. For IPv4, Ethernet uses ARP, and HP-UX sends out an unsolicited
ARP to notify remote systems of the new mapping between MAC (link level) addresses and IP level
addresses. IEEE 802.3 does not have this rearp function, so those connections are lost.
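As a rough illustration of the local switch for IPv4 over Ethernet, the sketch below moves the IP addresses from a failed card to its standby and lists the address mappings that an unsolicited ARP would then announce. The LanCard class and local_switch function are hypothetical names for this example; the code models the bookkeeping only and sends no packets.

```python
from dataclasses import dataclass, field


@dataclass
class LanCard:
    """Hypothetical model of a LAN card and its configured IP addresses."""
    name: str
    mac: str
    ips: list = field(default_factory=list)


def local_switch(failed, standby):
    """Move IP addresses from the failed card to the standby card.

    Returns the (IP address, MAC address) pairs that remote systems
    would need to learn after the switch.
    """
    if standby.ips:
        # The backup LAN card must not have any IP addresses configured.
        raise ValueError("standby card already has IP addresses configured")
    standby.ips, failed.ips = failed.ips, []
    return [(ip, standby.mac) for ip in standby.ips]
```

Each returned pair maps an existing IP address to the standby card's MAC address, which is the information the unsolicited ARP conveys to remote systems.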