Managing Serviceguard A.11.20, March 2013

CAUTION: This force import procedure should only be used when you are certain the disk is not
currently being accessed by another node. If you force import a disk that is already being accessed
on another node, data corruption can result.
Package Movement Errors
These errors are similar to the system administration errors, but they are caused specifically by
errors in the control script for legacy packages. The best way to prevent these errors is to test your
package control script before putting your high availability application online.
Adding a set -x statement as the second line of a legacy package control script causes
additional details to be written to the package log file, which can help you find where the
script is failing.
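As a rough illustration of what set -x adds, the sketch below builds a small stand-in script and captures its trace. It is not a real Serviceguard control script: the variable PKG_STATE, the temporary file, and the use of mktemp are assumptions made for the demonstration. A real control script sends this trace to the package log file rather than to a shell variable.

```shell
# Build a tiny stand-in script with "set -x" as its second line.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
set -x
PKG_STATE=starting
echo "package state: $PKG_STATE"
EOF

# Run it and capture both normal output and the trace (written to stderr).
trace=$(sh "$demo" 2>&1)
printf '%s\n' "$trace"
rm -f "$demo"
```

Each traced command appears on a line beginning with "+", ahead of that command's own output, so the last "+" line before a failure shows which statement was running.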
Node and Network Failures
These failures cause Serviceguard to transfer control of a package to another node. This is the
normal action of Serviceguard, but you must be able to recognize when a transfer has taken
place and decide whether to leave the cluster in its current condition or restore it to its
original condition.
Possible node failures can be caused by the following conditions:
• HPMC. This is a High Priority Machine Check, a system panic caused by a hardware error.
• TOC
• Panics
• Hangs
• Power failures
In the event of a TOC, a system dump is performed on the failed node and numerous messages
are also displayed on the console.
You can use the following commands to check the status of your network and subnets:
• netstat -in - to display LAN status and check to see if the package IP is stacked on the NIC.
• lanscan - to see if the LAN is on the primary interface or has switched to the standby interface.
• arp -a - to check the ARP tables.
• lanadmin - to display, test, and reset the NICs.
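The non-interactive checks above can be collected into one helper, sketched here. The function names run_check and network_checks are hypothetical and not part of Serviceguard or HP-UX. lanadmin is left out because it is normally used interactively. Each command is run only if it is found on the system, so the sketch does not fail where lanscan is unavailable.

```shell
# Run one check: print a heading, then the command output if the command exists.
# $1 = description, remaining arguments = the command line to run.
run_check() {
    desc=$1
    shift
    echo "=== $desc: $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "skipped: $1 not found on this system"
    fi
}

# Run the non-interactive network checks in the order listed above.
network_checks() {
    run_check "LAN status and stacked package IPs" netstat -in
    run_check "primary or standby interface in use" lanscan
    run_check "ARP tables" arp -a
}
```

Running network_checks on a cluster node and saving its output to a file gives a snapshot that can be compared with the log files from the time of the failure.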
Because every cluster is unique, there are no cookbook solutions to all possible problems. But if
you apply these checks and commands and work your way through the log files, you should be able
to identify and solve most problems.
Troubleshooting the Quorum Server
NOTE: See the HP Serviceguard Quorum Server Version A.04.00 Release Notes for information
about configuring the Quorum Server. Do not proceed without reading the Release Notes for your
version.
Authorization File Problems
The following kind of message in a Serviceguard node’s syslog file or in the output of cmviewcl
-v may indicate an authorization problem:
Access denied to quorum server 192.6.7.4
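A quick way to look for this message is to search the node's syslog file for it, as in the sketch below. The function name denied_quorum_servers is hypothetical. The syslog path is passed as an argument; on HP-UX it is typically /var/adm/syslog/syslog.log, but treat that path as an assumption and confirm it for your system.

```shell
# List each quorum server address that appears in an "Access denied" message.
# $1 = path to the syslog file to search.
denied_quorum_servers() {
    sed -n 's/.*Access denied to quorum server \([0-9.][0-9.]*\).*/\1/p' "$1" | sort -u
}
```

For example, denied_quorum_servers /var/adm/syslog/syslog.log prints one line per distinct address, and prints nothing if the message does not occur in the file.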