Serviceguard Network Manager: Inbound Failure Detection, March 2007

Configuration of network failure detection
The network failure detection method is configurable. Users can choose between the existing default
method, INOUT, and the new enhanced method, INONLY_OR_INOUT. The parameter can be changed
online, without bringing the cluster down. To change the method, set the value of the
NETWORK_FAILURE_DETECTION parameter in the cluster configuration file.
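As an illustration only, the relevant line of a cluster configuration file might look like the fragment below. The parameter name and its two values come from the text above; the layout (parameter name followed by whitespace and the value) and the comment line are assumptions about the file's usual style, not an excerpt from a real cluster.

```text
# Hypothetical excerpt from a cluster configuration file.
# Valid values per this paper: INOUT (default) or INONLY_OR_INOUT.
NETWORK_FAILURE_DETECTION    INONLY_OR_INOUT
```

Because the parameter is global, a single line like this would apply to every NIC configured in the cluster.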
NETWORK_FAILURE_DETECTION is a global parameter that affects the behavior of all NICs
configured in the cluster regardless of whether they are primary, standby, heartbeat, or data LANs.
With INONLY_OR_INOUT, Serviceguard Network Manager checks the inbound and
outbound traffic as before. If both inbound and outbound traffic stop incrementing, it marks the NIC
as down, just as it does with the default method. The difference is that when only the inbound
count stops incrementing, Serviceguard starts a process to determine whether inbound traffic has
actually failed and whether a local failover is applicable, using the algorithm described in the
following section, “How inbound failure detection works.”
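The difference between the two settings can be sketched as a small decision function. This is a minimal illustration of the rules stated above, not Serviceguard's implementation: the names CounterSample, classify, and the result strings are hypothetical, and the case where only outbound traffic stops is not described in this paper, so the sketch simply treats it as "UP".

```python
from dataclasses import dataclass

VALID_METHODS = ("INOUT", "INONLY_OR_INOUT")


@dataclass
class CounterSample:
    """Hypothetical snapshot of a NIC's traffic counters at one poll."""
    inbound: int
    outbound: int


def classify(prev: CounterSample, curr: CounterSample, method: str = "INOUT") -> str:
    """Classify a NIC from two consecutive counter samples.

    Returns "DOWN", "VERIFY_INBOUND", or "UP" (illustrative labels).
    """
    if method not in VALID_METHODS:
        raise ValueError(f"unknown detection method: {method}")

    # A counter counts as "incrementing" if it changed between polls.
    inbound_moving = curr.inbound != prev.inbound
    outbound_moving = curr.outbound != prev.outbound

    # Both settings: no inbound and no outbound traffic marks the NIC down.
    if not inbound_moving and not outbound_moving:
        return "DOWN"

    # Enhanced setting only: a stalled inbound counter triggers the
    # follow-up check described in the next section.
    if method == "INONLY_OR_INOUT" and not inbound_moving:
        return "VERIFY_INBOUND"

    # Assumption: every other combination leaves the NIC up.
    return "UP"
```

For example, a NIC whose outbound counter keeps rising while its inbound counter stays flat is left up under INOUT but is flagged for further checking under INONLY_OR_INOUT.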
The feature is supported on Ethernet, token ring, FDDI, and all other types of NICs that Serviceguard
supports for local failover; more specifically, on types of NICs that support the Data Link Provider
Interface (DLPI).
How inbound failure detection works
Handling a broken cascaded cable
Figure 2 diagrams a two-node cluster. Each node is configured with two NICs that carry both heartbeat
and data traffic. The primary NICs are connected to a primary switch, the standby NICs are connected
to a standby switch, and the two switches are cascaded with a crossover cable. These switches connect
to routers and then to clients.
Figure 2. Polling paths in a two-node cluster’s primary LAN data and heartbeat connection