Installation guide

9.5. Cluster Services Hang
When the cluster services attempt to fence a node, the cluster services stop until the fence operation
has successfully completed. Therefore, if your cluster-controlled storage or services hang and the
cluster nodes show different views of cluster membership or if your cluster hangs when you try to
fence a node and you need to reboot nodes to recover, check for the following conditions:
The cluster may have attempted to fence a node and the fence operation may have failed.
Look through the /var/log/messages file on all nodes and see if there are any failed fence
messages. If so, then reboot the nodes in the cluster and configure fencing correctly.
Verify that a network partition did not occur, as described in Section 9.8, “Each Node in a Two-
Node Cluster Reports Second Node Down”, and verify that communication between nodes is still
possible and that the network is up.
If nodes leave the cluster, the remaining nodes may be inquorate. The cluster needs to be quorate
to operate. If nodes are removed such that the cluster is no longer quorate, then services and
storage will hang. Either adjust the expected votes or return the required number of nodes to the
cluster.
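
The quorum condition in the last item can be checked with simple arithmetic. The sketch below assumes a plain majority rule, in which the cluster is quorate when the votes present reach half of the expected votes (rounded down) plus one. It ignores special cases such as two-node mode or a quorum disk, and the function names are hypothetical helpers, not cluster commands.

```shell
# Minimum votes needed for quorum under a simple majority rule
# (assumption: quorum = floor(expected_votes / 2) + 1).
# $1 = expected votes
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit status 0) when the votes present reach the threshold.
# $1 = votes currently present, $2 = expected votes
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Example: a five-node cluster with one vote per node.
quorum_threshold 5                   # prints 3
is_quorate 3 5 && echo "quorate"     # three nodes remain
is_quorate 2 5 || echo "inquorate"   # only two nodes remain
```

Under the same assumption, lowering the expected votes from 5 to 3 lowers the threshold to 2, which is why adjusting the expected votes can make the remaining nodes quorate again.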
Note
You can fence a node manually with the fence_node command or with Conga. For
information, see the fence_node man page and Section 4.3.2, “Causing a Node to Leave or
Join a Cluster”.
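
Before fencing a node manually, it can help to confirm from the logs that an earlier fence attempt failed. The sketch below is a hypothetical helper that lists log lines mentioning both "fence" and "fail". The exact wording of fence messages is an assumption here, and the sample log lines are made up for the demonstration, so adjust the pattern to what your nodes actually log.

```shell
# List log lines that look like failed fence operations.
# Assumption: such lines contain both "fence" and "fail" (any case).
# $1 = log file to scan (defaults to /var/log/messages)
failed_fence_messages() {
    grep -i 'fence' "${1:-/var/log/messages}" | grep -i 'fail'
}

# Self-contained demonstration on made-up log lines (not real daemon output).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
node1 demo: starting fence of node2
node1 demo: fence of node2 failed
node1 demo: network link up
EOF
failed_fence_messages "$sample_log"
rm -f "$sample_log"
```

Run the check on every node, since each node keeps its own /var/log/messages file.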
9.6. Cluster Service Will Not Start
If a cluster-controlled service will not start, check for the following conditions.
There may be a syntax error in the service configuration in the cluster.conf file. You can use
the rg_test command to validate the syntax in your configuration. If there are any configuration
or syntax faults, rg_test will inform you what the problem is.
$ rg_test test /etc/cluster/cluster.conf start service servicename
For more information on the rg_test command, see Section C.5, “Debugging and Testing
Services and Resource Ordering”.
If the configuration is valid, then increase the resource group manager's logging and then read
the messages logs to determine what is causing the service start to fail. You can increase the log
level by adding the loglevel="7" parameter to the rm tag in the cluster.conf file. You will
then get increased verbosity in your messages logs with regard to starting, stopping, and
migrating clustered services.
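
To confirm that the logging change is present in the file, the loglevel value can be read back from the rm tag. The sketch below is a hypothetical helper that uses sed, not a real XML parser, so it assumes the rm tag and its loglevel attribute sit on one line. The sample fragment is illustrative and omits the rest of cluster.conf.

```shell
# Print the loglevel value set on the <rm> tag of a cluster.conf file.
# Prints nothing when the attribute is absent.
# $1 = path to the configuration file (defaults to /etc/cluster/cluster.conf)
rm_loglevel() {
    sed -n 's/.*<rm[[:space:]][^>]*loglevel="\([^"]*\)".*/\1/p' \
        "${1:-/etc/cluster/cluster.conf}"
}

# Demonstration on an illustrative fragment, not a complete configuration.
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
<rm loglevel="7">
  <service name="servicename"/>
</rm>
EOF
rm_loglevel "$sample_conf"    # prints 7
rm -f "$sample_conf"
```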
9.7. Cluster-Controlled Services Fails to Migrate
If a cluster-controlled service fails to migrate to another node but the service will start on some
specific node, check for the following conditions.
Red Hat Enterprise Linux 6 Cluster Administration