Managing Serviceguard Fifteenth Edition, reprinted May 2008

Troubleshooting Your Cluster
Solving Problems
Chapter 8
Troubleshooting Quorum Server
Authorization File Problems
The following kind of message in a Serviceguard node’s syslog file or in
the output of cmviewcl -v may indicate an authorization problem:
Access denied to quorum server 192.6.7.4
A likely cause is that the quorum server authorization file has not been
updated. Verify that the node is included in the file, then run
/usr/lbin/qs -update to make the quorum server re-read the file.
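The membership check described above can be sketched in Python. This is
only an illustration: the file format assumed here (one host name per line,
'#' for comments, a lone '+' meaning all hosts) and the function name
node_is_authorized are assumptions, not taken from this section, so check
them against your quorum server documentation before relying on the result.

```python
def node_is_authorized(authfile_text: str, node_name: str) -> bool:
    """Return True if node_name appears in the authorization file text.

    Assumed format (not confirmed by this section): one host name per
    line, '#' starts a comment, and a lone '+' authorizes every host.
    """
    wanted = node_name.strip().lower()
    for raw_line in authfile_text.splitlines():
        # Drop any trailing comment, then surrounding whitespace.
        entry = raw_line.split("#", 1)[0].strip().lower()
        if not entry:
            continue
        if entry == "+" or entry == wanted:
            return True
    return False


# Hypothetical file contents, used only for illustration.
sample = """# hypothetical authorization file contents
node1.example.com
node2.example.com
"""
print(node_is_authorized(sample, "node2.example.com"))
print(node_is_authorized(sample, "node3.example.com"))
```

If the node is missing, add it to the file and then have the quorum server
re-read the file as described above.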
Timeout Problems
The following kinds of messages in a Serviceguard node’s syslog file may
indicate timeout problems:
Unable to set client version at quorum server
192.6.7.2:reply timed out
Probe of quorum server 192.6.7.2 timed out
These messages could indicate an intermittent network problem, or the
default quorum server timeout may be too short. You can set the
QS_TIMEOUT_EXTENSION parameter to increase the timeout, or you can
increase the heartbeat or node timeout value.
The following kind of message in a Serviceguard node’s syslog file
indicates that the node did not receive a reply to its lock request in time.
This could be because of a delay in communication between the node and
the quorum server, or between the quorum server and other nodes in the
cluster:
Attempt to get lock /sg/cluser1 unsuccessful. Reason:
request_timedout
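When a syslog file is large, it can help to sort quorum server messages by
the kind of problem they suggest. The following Python sketch does so with
patterns copied from the sample messages in this section. The category
names and the helper functions classify and summarize are hypothetical, and
the exact message text may differ between Serviceguard versions.

```python
import re

# Patterns taken from the sample messages shown in this section.
# The category labels are illustrative names, not Serviceguard terms.
PATTERNS = [
    ("authorization", re.compile(r"Access denied to quorum server")),
    ("timeout", re.compile(r"Unable to set client version at quorum server")),
    ("timeout", re.compile(r"Probe of quorum server \S+ timed out")),
    ("lock_timeout", re.compile(r"Attempt to get lock \S+ unsuccessful")),
]


def classify(line):
    """Return the category of the first matching pattern, or None."""
    for category, pattern in PATTERNS:
        if pattern.search(line):
            return category
    return None


def summarize(lines):
    """Count matching lines per category; other lines are ignored."""
    counts = {}
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] = counts.get(category, 0) + 1
    return counts


# Sample input built from the messages quoted in this section.
sample_log = [
    "Access denied to quorum server 192.6.7.4",
    "Probe of quorum server 192.6.7.2 timed out",
    "Attempt to get lock /sg/cluser1 unsuccessful. Reason:",
    "some unrelated message",
]
print(summarize(sample_log))
```

To scan a real file, pass an open file object to summarize in place of
sample_log.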
Messages
The coordinator node in Serviceguard sometimes sends a request to the
quorum server to set the lock state. (This is different from a request to
obtain the lock in tie-breaking.) If the quorum server’s connection to one
of the cluster nodes has not completed, the set request may fail with a
two-line message like the following in the quorum server’s log file: