Serviceguard Network Manager: Inbound Failure Detection, March 2007

Configuration of network failure detection

The network failure detection method is configurable. Users can choose between the existing default method, INOUT, and the new enhanced method, INONLY_OR_INOUT. Changes to this parameter can be made online, without bringing the cluster down. To change the setting, set the value of the NETWORK_FAILURE_DETECTION parameter in the cluster configuration file.
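
As an illustration of the change described above, the following Python sketch rewrites the NETWORK_FAILURE_DETECTION line in the text of a cluster configuration file. It assumes the file holds one "NAME VALUE" pair per line and uses "#" for comments; that layout, the helper name set_failure_detection, and the sample CLUSTER_NAME value are assumptions of this sketch, not details taken from this paper.

```python
import re

# The two methods named in this paper.
VALID_METHODS = ("INOUT", "INONLY_OR_INOUT")

# Matches an uncommented "NETWORK_FAILURE_DETECTION <value>" line.
# The "NAME VALUE" line layout is an assumption of this sketch.
_PARAM_LINE = re.compile(
    r"^[ \t]*NETWORK_FAILURE_DETECTION[ \t]+\S+[ \t]*$", re.MULTILINE
)


def set_failure_detection(config_text: str, method: str) -> str:
    """Return config_text with NETWORK_FAILURE_DETECTION set to method.

    Hypothetical helper: it only edits text and does not apply the
    configuration to a running cluster.
    """
    if method not in VALID_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    new_line = f"NETWORK_FAILURE_DETECTION {method}"
    if _PARAM_LINE.search(config_text):
        # Replace the existing setting in place.
        return _PARAM_LINE.sub(new_line, config_text)
    # Parameter not present: append it as a new line.
    sep = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return config_text + sep + new_line + "\n"


if __name__ == "__main__":
    sample = "CLUSTER_NAME example\nNETWORK_FAILURE_DETECTION INOUT\n"
    print(set_failure_detection(sample, "INONLY_OR_INOUT"))
```

The sketch only produces the edited text. The file would still have to be verified and applied through the normal Serviceguard configuration procedure before the new method takes effect.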

NETWORK_FAILURE_DETECTION is a global parameter that affects the behavior of all NICs configured in the cluster, regardless of whether they are primary, standby, heartbeat, or data LANs.

When INONLY_OR_INOUT is set, Serviceguard Network Manager checks the inbound and outbound traffic as before. If both inbound and outbound traffic stop incrementing, it marks the NIC as down, just as it does with the default method. The significance of the INONLY_OR_INOUT setting is that when only the inbound count stops incrementing, Serviceguard starts a process to determine whether inbound traffic has actually failed and whether a local failover is applicable, using the algorithm described in “How inbound failure detection works” in the following section.
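
The decision described in the paragraph above can be sketched in Python as follows. This is a minimal illustration, not Serviceguard code: the names Action and classify, and the representation of each poll as an (inbound, outbound) pair of packet counts, are assumptions made for the example, and timing details such as polling intervals are left out.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                      # no failure indicated by the counters
    MARK_DOWN = "mark_down"            # both directions stopped incrementing
    VERIFY_INBOUND = "verify_inbound"  # only inbound stopped; start the check


def classify(method, prev, curr):
    """Compare two successive (inbound, outbound) packet counts for one NIC.

    method is "INOUT" or "INONLY_OR_INOUT". Hypothetical helper that
    mirrors the prose above only.
    """
    inbound_stalled = curr[0] == prev[0]
    outbound_stalled = curr[1] == prev[1]

    # Both methods mark the NIC down when neither count advances.
    if inbound_stalled and outbound_stalled:
        return Action.MARK_DOWN

    # Only the enhanced method reacts to an inbound-only stall, and it
    # starts a verification step rather than marking the NIC down at once.
    if method == "INONLY_OR_INOUT" and inbound_stalled:
        return Action.VERIFY_INBOUND

    return Action.NONE


if __name__ == "__main__":
    # Inbound count stuck at 10 while outbound advances from 10 to 12.
    print(classify("INOUT", (10, 10), (10, 12)))
    print(classify("INONLY_OR_INOUT", (10, 10), (10, 12)))
```

With the same pair of samples, the default method reports no failure while the enhanced method starts the inbound verification, which is the behavioral difference the setting introduces.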

The feature is supported with Ethernet, token ring, FDDI, and all types of NICs that Serviceguard supports for local failover; more specifically, NIC types that support the Data Link Provider Interface (DLPI).

How inbound failure detection works

Handling a broken cascaded cable

Figure 2 shows a two-node cluster. Each node is configured with two NICs that carry both heartbeat and data traffic. The primary NICs are connected to a primary switch, the standby NICs are connected to a standby switch, and the two switches are cascaded with a crossover cable. These switches connect to routers and then to clients.

Figure 2. Polling paths in a two-node cluster’s primary LAN data and heartbeat connection