Installation guide

The root cause of fences is always a node losing token, meaning that it lost communication with
the rest of the cluster and stopped returning heartbeat.
Any situation that results in a system not returning heartbeat within the specified token interval
could lead to a fence. By default the token interval is 10 seconds. It can be specified by adding
the desired value (in milliseconds) to the token parameter of the totem tag in the cluster.conf
file (for example, setting totem token="30000" for 30 seconds).
Ensure that the network is sound and working as expected.
Ensure that the interfaces the cluster uses for inter-node communication are not using any
bonding mode other than 0, 1, or 2. (Bonding modes 0 and 2 are supported as of Red Hat
Enterprise Linux 6.4.)
Take measures to determine if the system is "freezing" or kernel panicking. Set up the kdump
utility and see if you get a core during one of these fences.
Make sure some situation is not arising that you are wrongly attributing to a fence, for example the
quorum disk ejecting a node due to a storage failure or a third party product like Oracle RAC
rebooting a node due to some outside condition. The messages logs are often very helpful in
determining such problems. Whenever fences or node reboots occur it should be standard
practice to inspect the messages logs of all nodes in the cluster from the time the reboot/fence
occurred.
Thoroughly inspect the system for hardware faults that may lead to the system not responding to
heartbeat when expected.
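As an illustration of the token interval setting described above, the following is a minimal sketch
of a cluster.conf fragment that raises the interval to 30 seconds. The cluster name and the
config_version value are placeholders chosen for this example, and the rest of the configuration
is elided.

```xml
<cluster config_version="2" name="cluster1">
  <!-- token is given in milliseconds: 30000 ms = 30 seconds -->
  <totem token="30000"/>
  <!-- remaining cluster configuration elided -->
</cluster>
```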
9.13. Debug Logging for Distributed Lock Manager (DLM) Needs to be Enabled
There are two debug options for the Distributed Lock Manager (DLM) that you can enable, if
necessary: DLM kernel debugging, and POSIX lock debugging.
To enable DLM debugging, edit the /etc/cluster/cluster.conf file to add configuration
options to the dlm tag. The log_debug option enables DLM kernel debugging messages, and the
plock_debug option enables POSIX lock debugging messages.
The following example section of a /etc/cluster/cluster.conf file shows the dlm tag that
enables both DLM debug options:
<cluster config_version="42" name="cluster1">
...
<dlm log_debug="1" plock_debug="1"/>
...
</cluster>
After editing the /etc/cluster/cluster.conf file, run the cman_tool version -r command
to propagate the configuration to the rest of the cluster nodes.
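Each option can also be set on its own. The following sketch enables only POSIX lock debugging
and leaves DLM kernel debugging off. The config_version value is shown incremented from the
earlier example on the assumption that the version number is raised with each edit before the
configuration is propagated; the specific numbers and the cluster name are illustrative.

```xml
<cluster config_version="43" name="cluster1">
  <!-- only plock_debug is set, so only POSIX lock debugging messages are enabled -->
  <dlm plock_debug="1"/>
  <!-- remaining cluster configuration elided -->
</cluster>
```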
Red Hat Enterprise Linux 6 Cluster Administration