Managing Serviceguard 12th Edition, March 2006
Understanding Serviceguard Software Components
How the Network Manager Works
Chapter 3 103
Monitoring LAN Interfaces and Detecting Failure
At regular intervals, Serviceguard polls all the network interface cards
specified in the cluster configuration file. Network failures are detected
within each node in the following manner. One interface on the
node is assigned to be the poller. The poller polls the other primary
and standby interfaces in the same bridged net on that node to see
whether they are still healthy. Normally, the poller is a standby
interface; if there are no standby interfaces in a bridged net, the primary
interface is assigned the polling task. (Bridged nets are explained in
“Redundant Network Components” on page 38 in Chapter 2.)
The polling interface sends LAN packets to all other interfaces in the
node that are on the same bridged net and receives packets back from
them.
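The poller selection rule described above can be sketched in a few lines of
Python. This is only an illustration of the rule as stated in the text, not
Serviceguard's implementation: the Interface type, the function names, and the
interface names such as "lan0" are hypothetical, and picking the first standby
when a bridged net has several is an assumption the text does not address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """Hypothetical model of one LAN interface on a node."""
    name: str  # illustrative name, e.g. "lan0"
    role: str  # "primary" or "standby"


def choose_poller(interfaces):
    """Pick the poller for one bridged net on one node.

    A standby interface is preferred; if the bridged net has no standby,
    a primary interface takes over the polling task. Choosing the first
    match is an assumption made for this sketch.
    """
    standbys = [i for i in interfaces if i.role == "standby"]
    if standbys:
        return standbys[0]
    primaries = [i for i in interfaces if i.role == "primary"]
    return primaries[0] if primaries else None


def poll_targets(interfaces, poller):
    """Return the other interfaces on the same bridged net that the poller checks."""
    return [i for i in interfaces if i != poller]


# Example: one primary and one standby on the same bridged net.
net = [Interface("lan0", "primary"), Interface("lan1", "standby")]
poller = choose_poller(net)
targets = poll_targets(net, poller)
```

In this example the standby lan1 becomes the poller and lan0 is the only
interface it polls. With a single primary and no standby, that primary is
chosen and has no other interfaces to poll.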
Whenever a LAN driver reports an error, Serviceguard immediately
declares the card bad and performs a local switch, if applicable.
For example, when the card fails to send, Serviceguard immediately
receives an error notification and marks the card as down.
Serviceguard Network Manager also looks at the numerical counts of
packets sent and received on an interface to determine whether a card is
having a problem. Serviceguard can handle these counts in one of two
ways. In the cluster configuration file, choose one of these two values
for the NETWORK_FAILURE_DETECTION parameter:
• INOUT: When both the inbound and outbound counts stop
incrementing for a predetermined amount of time, Serviceguard will
declare the card as bad. Serviceguard will not declare the card as bad
if only the inbound or only the outbound count stops incrementing.
Both must stop. This is the default.
• INONLY_OR_INOUT: This option will also declare the card as bad if
both inbound and outbound counts stop incrementing. However, it
will also declare it as bad if only the inbound count stops.
This option is available starting with Serviceguard A.11.16. It is not
suitable for all environments. Before choosing it, be sure these
conditions are met:
— All bridged nets in the cluster should have more than two
interfaces each.