Veritas Storage Foundation 5.1 SP1 Cluster File System Administrator's Guide (5900-1738, April 2011)

Table 7-1 Fencing startup issues on SFCFS cluster (client cluster) nodes (continued)
Description and resolution | Issue
Consider the following situations to understand preexisting split-brain in server-based
fencing:
- There are three CP servers acting as coordination points. One of the three CP servers
  then becomes inaccessible. While that server is inaccessible, one client node also
  leaves the cluster. When the inaccessible CP server restarts, it has a stale
  registration from the node that left the SFCFS cluster. In this case, no new nodes can
  join the cluster. Each node that attempts to join the cluster gets a list of
  registrations from the CP server. One CP server includes an extra registration (of the
  node that left earlier). This makes the joiner node conclude that a preexisting
  split-brain exists between the joiner node and the node that is represented by the
  stale registration.
- All the client nodes crash simultaneously, so the fencing keys are not cleared from
  the CP servers. Consequently, when the nodes restart, the vxfen configuration fails
  and reports a preexisting split-brain.
These situations are similar to a preexisting split-brain with coordinator disks, where
the administrator solves the problem by running the vxfenclearpre command. Server-based
fencing requires a similar solution that uses the cpsadm command.
Run the cpsadm command to clear a registration on a CP server:
# cpsadm -s cp_server -a unreg_node -c cluster_name -n nodeid
where cp_server is the virtual IP address or virtual hostname on which the CP server is
listening, cluster_name is the VCS name for the SFCFS cluster, and nodeid specifies the
node ID of the SFCFS cluster node. Ensure that fencing is not already running on a node
before you clear its registration on the CP server.
After all stale registrations are removed, the joiner node can join the cluster.
Issue: Preexisting split-brain
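The cleanup described for this issue can be sketched as a small shell helper. This is a minimal, hypothetical sketch and not part of the product: the function names find_stale_nodes and print_unreg_cmds and the sample values (cps1.example.com, clus1, and the node IDs) are assumptions for illustration. The registered node IDs are supplied by hand, because this section does not show how to list them. The helper only prints cpsadm commands, built from the -s, -a unreg_node, -c, and -n options shown in the table, so that an administrator can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helpers; names and sample values are illustrative assumptions.

# find_stale_nodes "<registered ids>" "<live member ids>"
# Prints each registered node ID that is not a current cluster member.
find_stale_nodes() {
    registered=$1
    live=$2
    for id in $registered; do
        stale=yes
        for member in $live; do
            if [ "$id" = "$member" ]; then
                stale=no
                break
            fi
        done
        if [ "$stale" = yes ]; then
            echo "$id"
        fi
    done
}

# print_unreg_cmds <cp_server> <cluster_name> "<stale ids>"
# Dry run: prints one cpsadm command per stale node ID; runs nothing.
print_unreg_cmds() {
    cp_server=$1
    cluster_name=$2
    for id in $3; do
        echo "cpsadm -s $cp_server -a unreg_node -c $cluster_name -n $id"
    done
}

# Example: the CP server holds registrations for nodes 0, 1, and 2,
# but node 2 has left the cluster.
stale_ids=$(find_stale_nodes "0 1 2" "0 1")
print_unreg_cmds cps1.example.com clus1 "$stale_ids"
```

In the second situation (all client nodes crashed), the list of live members is empty, so every registered node ID is reported as stale. Before running any printed command, confirm that fencing is not running on the node whose registration is being cleared.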
Checking the connectivity of CP server
You can test the connectivity of the CP server using the cpsadm command.
Before you run the cpsadm command on the SFCFS cluster (client cluster)
nodes, you must set the environment variables CPS_USERNAME and
CPS_DOMAINTYPE.
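A script that calls cpsadm can guard against a missing environment with a small precheck. The following is a minimal sketch under stated assumptions: the function name check_cps_env is hypothetical, and the check only verifies that both variables are set and non-empty, not that their values are valid for a given CP server.

```shell
#!/bin/sh
# Hypothetical precheck; the function name is an illustrative assumption.
# Returns 0 if CPS_USERNAME and CPS_DOMAINTYPE are both set and non-empty.
# Otherwise names the missing variables on stderr and returns 1.
check_cps_env() {
    missing=""
    [ -n "${CPS_USERNAME:-}" ] || missing="$missing CPS_USERNAME"
    [ -n "${CPS_DOMAINTYPE:-}" ] || missing="$missing CPS_DOMAINTYPE"
    if [ -n "$missing" ]; then
        echo "cpsadm environment not set:$missing" >&2
        return 1
    fi
    return 0
}

if check_cps_env; then
    echo "CPS_USERNAME and CPS_DOMAINTYPE are set"
fi
```

Calling the precheck at the top of a wrapper script lets the script stop with a clear message instead of failing later inside cpsadm.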