Troubleshooting guide

Chapter 9 Troubleshooting Active Network Management Fail-over in High Availability Applications
Advanced Technical Reference Guide 4.1 June 2000 101
Troubleshooting Fail-over
Fail-over in High Availability Applications
Note: The section on Fail-over in High Availability Applications applies to version 4.1 SP1.
Introduction
As enterprises have become more dependent on the Internet for their core applications, uninterrupted
connectivity has become more crucial to their success. Beginning with VPN-1/FireWall-1 Version 4.1,
encrypted connections are supported in High Availability configurations and can survive the failure of a
VPN-1/FireWall-1 gateway.
VPN-1/FireWall-1 High Availability solutions consist of the following key elements:
1. A mechanism for detection of a gateway failure and redirection of the traffic around the failed gateway to a
backup gateway.
2. State synchronization between two gateways, so that the backup gateway is able to continue connections
that were originally handled by the failed gateway.
A central requirement of a High Availability (HA) firewall solution is that the network has no single point of
failure. The primary objective of such a solution is to provide a secure and available network 100% of the
time. When a failure occurs, the redundant components (the backup) ensure a continuous, normal flow of
network traffic.
High-Availability Failure Detection - How it works
Internal communication between High Availability (HA) cluster machines is performed over a special protocol
(FWHAP). This protocol works over UDP, but the VPN-1/FireWall-1 4.1 SP1 (Check Point 2000)
implementation restricts its use to communication between machines on the same physical network. Although
UDP is used, the packets are never processed by the machine's IP/UDP modules; the HA module processes them
before they enter the machine. This allows the packets to be non-standard (for example, having the same IP
address as both source and destination). This is required because the protocol must allow communication
between cluster machines on any interface, including interfaces on which the cluster machines have the same
IP and physical address.
The protocol uses port 8116 (both as source port and destination port), and is NOT encrypted.
Packets are sent either to a specific machine or as broadcasts.
The ether header of the packets is not standard. The source ether address (bytes 6 to 11 of the packet) is not
the ether address of the interface but a special ether address created by the High Availability (HA) module.
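The properties described above (UDP, port 8116 as both source and destination, special source ether address in
bytes 6 to 11) are enough to pick likely FWHAP packets out of a raw capture. The following is a minimal sketch
only, not Check Point code. It assumes untagged Ethernet II framing carrying IPv4, which the text does not
state, and the function name parse_fwhap_candidate is hypothetical. It does not decode the FWHAP payload,
whose layout is defined in fwha.h and is not reproduced here.

```python
import socket
import struct

FWHAP_PORT = 8116          # from the text: used as source and destination port
ETHERTYPE_IPV4 = 0x0800    # assumption: untagged Ethernet II carrying IPv4
IPPROTO_UDP = 17


def parse_fwhap_candidate(frame: bytes):
    """Return (source_ether, source_ip, dest_ip) for a frame that looks like
    FWHAP traffic, or None if the frame does not match."""
    if len(frame) < 14 + 20 + 8:  # ether header + minimal IP header + UDP header
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None

    ip = frame[14:]
    version, header_len = ip[0] >> 4, (ip[0] & 0x0F) * 4
    if version != 4 or header_len < 20 or len(ip) < header_len + 8:
        return None
    if ip[9] != IPPROTO_UDP:
        return None

    src_port, dst_port = struct.unpack("!HH", ip[header_len:header_len + 4])
    if src_port != FWHAP_PORT or dst_port != FWHAP_PORT:
        return None

    # Bytes 6-11 of the frame: the special source address set by the HA module.
    source_ether = ":".join(f"{b:02x}" for b in frame[6:12])
    return source_ether, socket.inet_ntoa(ip[12:16]), socket.inet_ntoa(ip[16:20])
```

Note that the sketch deliberately does not reject packets whose source and destination IP addresses are equal,
since the text says such non-standard packets are legitimate for this protocol.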
When the High Availability (HA) module is started, the cluster machines inform each other of their interface
configuration over the FWHAP protocol. If conflicts are discovered in the configuration, an error message is
reported (to the console on Solaris and to the Event Viewer on NT), but no action is taken to correct the
misconfiguration.
HA Cluster machine states
Every machine in the cluster periodically reports its own state and tracks the states of the other machines.
Each machine reports its state by sending a broadcast FWHAP_MY_STATE packet (see the fwha.h file) every 0.5
seconds.
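The bookkeeping implied by periodic state reports can be sketched as a small table that records the last
reported state of each machine and the time the report arrived. This is an illustration only, not the HA
module's implementation. The 0.5 second interval comes from the text. The class name PeerStateTable, the
threshold of three missed reports, and the state names used in the example are all hypothetical; the real
states and timeouts are defined by the product.

```python
import time

REPORT_INTERVAL = 0.5            # seconds, from the text
MISSED_REPORTS_BEFORE_STALE = 3  # hypothetical threshold, not from the source


class PeerStateTable:
    """Track the last reported state of each cluster machine."""

    def __init__(self, now=time.monotonic):
        self._now = now    # injectable clock, which makes the table testable
        self._peers = {}   # machine id -> (state, time of last report)

    def record_report(self, machine_id, state):
        """Store the state carried by a received state report."""
        self._peers[machine_id] = (state, self._now())

    def state_of(self, machine_id):
        """Return the last reported state, or None if never heard from."""
        entry = self._peers.get(machine_id)
        return entry[0] if entry else None

    def stale_peers(self):
        """Return machines whose last report is older than the threshold."""
        limit = REPORT_INTERVAL * MISSED_REPORTS_BEFORE_STALE
        now = self._now()
        return sorted(m for m, (_, seen) in self._peers.items()
                      if now - seen > limit)
```

A caller would invoke record_report for each received report and poll stale_peers to find machines that have
stopped reporting. What to do about a silent machine is outside the scope of this sketch.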