Serviceguard NFS Toolkit A.11.31.02, A.11.11.06, and A.11.23.05 Administrator's Guide

NOTE: The file name of the NFS_FLM_SCRIPT script must be limited to 13 characters

or fewer.

NOTE: The nfs.mon script uses rpcinfo calls to check the status of various processes.

If the rpcbind process is not running, the rpcinfo calls time out after 75 seconds.

Because 10 rpcinfo calls are attempted before failover, it takes approximately 12

minutes to detect the failure. This problem has been fixed in release versions 11.11.04

and 11.23.03.
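
The detection time quoted in this note follows from the two figures it gives: 10 rpcinfo attempts, each running to a 75-second timeout. The following is a minimal sketch of that arithmetic only; the function name detection_time is hypothetical and is not part of the toolkit.

```shell
# Hypothetical helper (not part of the toolkit): worst-case time to detect
# a failure when every rpcinfo call runs to its full timeout.
detection_time() {
    attempts=$1        # number of rpcinfo calls attempted before failover
    timeout_secs=$2    # seconds before a single rpcinfo call times out
    echo $(( attempts * timeout_secs ))
}

total=$(detection_time 10 75)
echo "worst case: ${total} seconds (about $(( total / 60 )) minutes)"
# prints: worst case: 750 seconds (about 12 minutes)
```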

The default NFS control script, hanfs.sh, does not invoke the monitor script. You do

not have to run the NFS monitor script to use Serviceguard NFS. If the NFS package

configuration file specifies AUTO_RUN YES and LOCAL_LAN_FAILOVER YES (the

defaults), the package switches to the next adoptive node or to a standby network

interface in the event of a node or network failure. However, if one of the NFS services

goes down while the node and network remain up, you need the NFS monitor script

to detect the problem and to switch the package to an adoptive node.

Whenever the monitor script detects an event, it logs the event. Each NFS package has

its own log file. This log file is named according to the NFS control script, nfs.cntl,

by adding a .log extension. For example, if your control script is called

/etc/cmcluster/nfs/nfs1.cntl, the log file is

called /etc/cmcluster/nfs/nfs1.cntl.log.
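
As an illustration of the naming rule, the sketch below derives the log file name from a control script path by appending .log. The helper name log_file_for is hypothetical; the path is the example used in the text.

```shell
# Hypothetical helper: build the log file name for an NFS package by
# appending ".log" to the path of its control script.
log_file_for() {
    echo "$1.log"
}

log_file_for /etc/cmcluster/nfs/nfs1.cntl
# prints: /etc/cmcluster/nfs/nfs1.cntl.log
```

To follow events as the monitor script logs them, you could pass the resulting path to tail -f.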

TIP: You can specify the number of retry attempts for all these processes in the

nfs.mon file.

On the Client Side

The client should NFS-mount a file system using the package name in the mount

command. The package name is associated with the package’s relocatable IP address.

On client systems, be sure to use a hard mount and set the proper retry values for the

mount. Alternatively, set the proper timeout for the automounter. The timeout should be

greater than the total end-to-end recovery time for the Serviceguard NFS package—that

is, running fsck, mounting file systems, and exporting file systems on the new node.

(With journalled file systems, this time should be between one and two minutes.) Setting

the timeout to a value greater than the recovery time allows clients to reconnect to the

file system after it returns to the cluster on the new node.
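
The sketch below shows what a client-side hard mount by package name might look like. It only prints the command rather than running it. The package name, exported directory, and mount point are hypothetical, and the command form (mount -F nfs) assumes an HP-UX client. Only the hard option comes from the text; choose any retry-related options from the mount_nfs manpage for your release.

```shell
# Hypothetical values for illustration; substitute your own.
PKG_NAME=nfs1              # package name, resolves to the relocatable IP address
EXPORT_DIR=/hanfs/data     # file system exported by the package
MOUNT_POINT=/mnt/data      # local mount point on the client
MOUNT_OPTS=hard            # hard mount; append retry-related options as needed

# Assemble the mount command line from its parts (assumes HP-UX syntax).
build_mount_cmd() {
    echo "mount -F nfs -o $4 $1:$2 $3"
}

# Print the command so it can be reviewed before it is run as root.
build_mount_cmd "$PKG_NAME" "$EXPORT_DIR" "$MOUNT_POINT" "$MOUNT_OPTS"
# prints: mount -F nfs -o hard nfs1:/hanfs/data /mnt/data
```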

NOTE: AutoFS mounts may fail when mounting file systems exported by an HA-NFS

package soon after that package has been restarted. To avoid these mount failures,

AutoFS clients should wait at least 60 seconds after an HA-NFS package has started

before mounting file systems exported from that package.
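
One way to apply the 60-second guideline is to work out how much longer a client should wait before it mounts. This is a minimal sketch, assuming the client already knows how many seconds have passed since the package started; the helper name remaining_wait is hypothetical.

```shell
# Hypothetical helper: seconds still to wait before mounting, given the
# number of seconds elapsed since the HA-NFS package started.
remaining_wait() {
    elapsed=$1
    min_wait=60    # minimum delay recommended in the note above
    if [ "$elapsed" -ge "$min_wait" ]; then
        echo 0
    else
        echo $(( min_wait - elapsed ))
    fi
}

remaining_wait 15    # prints: 45
remaining_wait 90    # prints: 0
```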
