Troubleshooting guide


Chapter 9 Troubleshooting Active Network Management Fail-over in High Availability Applications

Advanced Technical Reference Guide 4.1 • June 2000 103

Problems detected by the VPN/FireWall module should also be reported using the Active Check Device Interface, for example, whether the fwd daemon is running on each module.

How to check the status of the modules using the cphaprob command

The cphaprob command can be used to register or un-register devices, to report problems, and to print the list of devices currently registered together with the state of each device. Devices are referred to by name. This name also appears in the logs, so names should be meaningful and not too long (up to 16 characters).

The syntax of this command can be found in the Check Point 2000 Administration Guide on page 575.

This interface allows three states to be reported via the Active Check Device (see Interface Active Check Device on page 102): ok (= ACTIVE), init (= INIT), and problem (= DEAD). The interface does not allow blocking at READY or STANDBY. Blocking at these states seems meaningless, although the LB (Load Balancing) configuration device does block at READY.
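The register, un-register, report, and list operations described above can be illustrated with a small model. This is only a sketch in Python, not Check Point code and not the cphaprob syntax: the class name DeviceRegistry, its method names, and the choice of init as the default state at registration are all assumptions made for illustration. Only the three reportable states and the 16-character name limit come from the text.

```python
# Hypothetical model of the Active Check Device interface (not Check Point code).
# The state names and the 16-character limit come from the text above.
REPORTABLE_STATES = {"ok": "ACTIVE", "init": "INIT", "problem": "DEAD"}
MAX_NAME_LEN = 16


class DeviceRegistry:
    """Toy registry mirroring the register / un-register / report / list operations."""

    def __init__(self):
        self._devices = {}

    @staticmethod
    def _translate(state):
        # Only ok, init and problem can be reported; READY and STANDBY cannot.
        if state not in REPORTABLE_STATES:
            raise ValueError(f"state {state!r} cannot be reported")
        return REPORTABLE_STATES[state]

    def register(self, name, state="init"):
        # Assumption: a newly registered device starts in the init state.
        if not name or len(name) > MAX_NAME_LEN:
            raise ValueError("device name must be 1 to 16 characters long")
        self._devices[name] = self._translate(state)

    def unregister(self, name):
        del self._devices[name]

    def report(self, name, state):
        if name not in self._devices:
            raise KeyError(name)
        self._devices[name] = self._translate(state)

    def list_devices(self):
        # Returns (name, state) pairs sorted by device name.
        return sorted(self._devices.items())
```

In this model, reporting problem for a registered device named fwd changes its listed state to DEAD, while an attempt to report standby is rejected.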

Each machine constantly reports (in the FWHAP_MY_STATE message) the number of interfaces that it has

determined to be up (it distinguishes between "inbound" and "outbound" communication). If one machine has

fewer "UP" interfaces than another machine in the cluster, a problem is reported by this machine's interface

active check mechanism. This means that if an interface is disconnected on all machines, no problem is

detected. It should take about 2 seconds to discover an interface problem (it is preferable to lose a few packets

than to fail over unnecessarily).
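The comparison rule above can be sketched in a few lines of Python. This is an illustrative model only: the function name, the data layout, and the decision to compare the inbound and outbound counts separately are assumptions, since the text says only that the two directions are distinguished.

```python
def machines_reporting_problem(up_counts):
    """Return the machines whose interface check would report a problem.

    up_counts maps a machine name to (inbound_up, outbound_up), the number
    of interfaces that machine has determined to be up in each direction.
    Assumption: a machine reports a problem when some other machine in the
    cluster has more UP interfaces in either direction.
    """
    if not up_counts:
        return []
    best_in = max(inbound for inbound, _ in up_counts.values())
    best_out = max(outbound for _, outbound in up_counts.values())
    return sorted(
        name
        for name, (inbound, outbound) in up_counts.items()
        if inbound < best_in or outbound < best_out
    )
```

Because the check is relative, a cluster in which every machine has lost the same interface yields an empty result, which matches the statement that no problem is detected in that case.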

The interface problem detection mechanism should be able to detect "Uni-directional" problems, for example a

problem on an interface that can send but not receive packets.

VPN Fail-Over

By leveraging VPN-1 state table synchronization, which includes key exchange information, Check Point’s

High Availability maintains IKE based VPN connections in the event of a fail-over.

VPN solutions without IKE fail-over drop all connections in the event of a failure, forcing users to re-authenticate and re-establish connections. IKE fail-over delivers a seamless transition that is critical for many VPN deployments.

Troubleshooting Fail-Over

The High Availability cluster contains one primary module and one or more secondary modules. When the

primary module fails, one of the secondary modules becomes Active.

The following tests can be used to check if the failover capability is working properly, and to isolate problems if

it is not. Both HA modes are tested: Primary-up mode, and Active-up mode.

In primary-up mode the machine with the smallest ID should, if it can, be ACTIVE. This means that if the

primary machine goes down (and fails over to the secondary machine) and then comes back up, the primary

machine will again filter connections (even though the secondary machine is still functioning properly).

In active-up mode the machine that is currently active remains active (even when another machine in the

cluster with a smaller number is OK) until this (active) machine goes down, at which point the stand-by

machine with the smallest number should take over.
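The difference between the two modes can be summarized in a short Python sketch. The function name and arguments are hypothetical; healthy_ids stands for the IDs of the machines that are currently able to be ACTIVE, and current_active for the ID of the machine that was active before the decision.

```python
def choose_active(mode, healthy_ids, current_active=None):
    """Pick the machine that should be ACTIVE under the given HA mode.

    Illustrative model only; it is not the product's election code.
    """
    if not healthy_ids:
        return None  # no machine is able to take over
    if mode == "primary-up":
        # The healthy machine with the smallest ID is always preferred.
        return min(healthy_ids)
    if mode == "active-up":
        # The current active machine keeps its role for as long as it is healthy.
        if current_active in healthy_ids:
            return current_active
        return min(healthy_ids)
    raise ValueError(f"unknown HA mode: {mode!r}")
```

For example, with machines 1 and 2 both healthy and machine 2 currently active, the sketch returns 1 in primary-up mode and 2 in active-up mode. When testing fail-over, this is the behavior to look for after the primary machine comes back up.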

Note: See also “Debugging High Availability”, page 106.