• So that Serviceguard can ensure that all I/O from a node on which a package has failed is flushed
before the package starts on an adoptive node, all the switches and routers between the NFS server
and client must support a worst-case timeout, after which packets and frames are dropped.
This timeout is known as the Maximum Bridge Transit Delay (MBTD). Switches and routers that do
not support MBTD must not be used in a Serviceguard configuration, because they can deliver
packets after arbitrarily long delays, which could lead to data corruption.
• Networking among the Serviceguard nodes must be configured in such a way that a single failure
in the network does not cause a package failure.
Setting up the NFS server
See the “NFS Services Administrator’s Guide” for instructions on configuring the NFS server and
shares.
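For example, a share entry on the NFS server might look like the following sketch. The export path
and client hostnames are placeholders; substitute the directory and access options appropriate for
your environment, as described in the guide referenced above:

    # /etc/dfs/dfstab entry on the NFS server (illustrative)
    share -F nfs -o rw=client-1:client-2 /export/nfsdata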
Configuring the NFS package and cluster parameters
Configuring the NFS package parameters
• In the modular package configuration file, the new parameter fs_server specifies the name of the
NFS server. The value of this parameter can be either the hostname of the NFS server or its IP
address (both IPv4 and IPv6 addresses are supported).
The NFS server can be configured on a different subnet or in a different domain than the
Serviceguard cluster.
• fs_type specifies the filesystem type. Set this to “nfs” to use this feature (see the example after
this list).
• fs_mount_opt specifies the mount option. This must include “-o llock” in addition to any other options
you specify. “-o llock” specifies local locking for the NFS filesystem.
• fs_fsck_opt should not be used. If any option is found in fs_fsck_opt for an NFS-imported filesystem,
a warning will be logged and the value will be ignored.
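Putting these together, the filesystem-related portion of a modular package configuration file might
look like the following sketch. The server name, export path, and local mount point are placeholders,
and the fs_name and fs_directory entries are shown only for context; consult the package
configuration template for their exact usage:

    fs_name        /export/nfsdata   # directory exported by the NFS server (placeholder)
    fs_server      nfs-server        # hostname or IP address (IPv4 or IPv6) of the NFS server
    fs_directory   /nfsdata          # local mount point on the cluster node (placeholder)
    fs_type        "nfs"             # identifies this as an NFS-imported filesystem
    fs_mount_opt   "-o llock"        # local locking is required for NFS-imported filesystems
    # fs_fsck_opt is left unset; any value is ignored for NFS-imported filesystems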
Configuring the cluster parameter CONFIGURED_IO_TIMEOUT_EXTENSION
In a Serviceguard cluster in which NFS-imported filesystems are used, an unlikely but possible
scenario exists in which data corruption could occur. The scenario is as follows:
1. A Serviceguard package using an NFS filesystem (“NFSPkg”) is running on cluster node “client-1”.
2. Node “client-1” issues an NFS write request immediately before NFSPkg moves to another cluster
node.
3. NFSPkg is started on the adoptive node “client-2”.
4. Adoptive node “client-2” begins sending NFS write requests to the same file and offset as the write
request previously sent by “client-1” just before the package was moved.
5. If the original NFS write request from “client-1” arrives at the NFS server after the new write
requests from “client-2”, the server overwrites the data written by “client-2”, resulting in data
corruption.
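To guard against this scenario, the cluster parameter CONFIGURED_IO_TIMEOUT_EXTENSION is set in
the cluster configuration file. The following line is only a sketch; the value shown is a placeholder,
and the correct value must be calculated for your network as described in the Serviceguard
documentation:

    # Cluster configuration file (placeholder value; calculate the actual value
    # for your network per the Serviceguard documentation)
    CONFIGURED_IO_TIMEOUT_EXTENSION    5000000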