3.1.2 MxFS-Linux Administration Guide
Chapter 2: Configure Export Groups
Copyright © 1999-2006 PolyServe, Inc. All rights reserved.
How the High-Availability Monitor Works
When configuring the high-availability monitor associated with an
Export Group, you should be aware of the actions taken by the monitor.
During each probe cycle, the monitor performs a series of checks:
1. The monitor first checks basic NFS Server health by issuing a NULL
RPC call to the NFS Server on the local node. If this call fails, the
Export Group is considered to be DOWN.
2. The monitor next checks the general health of the MxFS
high-availability service by checking for critical MxFS processes. If
any of these processes are not running, the Export Group is
considered to be DOWN. (The processes may not be running because
the node is still initializing or shutting down.)
3. The monitor then checks that each exported path in the Export Group
is available and is mounted on the PSFS filesystem. The monitor
performs these checks in timed background threads. If any exported
path cannot be verified within one-half of the probe’s timeout value
(or five seconds, whichever is greater), the Export Group is
considered to be DOWN. The verification operations are simple and
should reasonably complete within this period.
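
The three checks above can be sketched as follows. This is only an
illustrative outline, not the monitor's actual implementation: the names
probe_cycle, verify_with_deadline, and PathStatus are hypothetical, the
three callables stand in for the real NULL RPC call, process check, and
path verification, and the "UP" result is assumed as the counterpart of
DOWN. A path reported as missing produces only a warning, as described
later in this section.

```python
import threading
from enum import Enum


class PathStatus(Enum):
    OK = "ok"            # path is available and mounted as expected
    MISSING = "missing"  # path does not exist (warning only)
    FAILED = "failed"    # any other verification failure


def verify_with_deadline(verify, path, deadline):
    """Run verify(path) in a background thread; return None on timeout."""
    box = []

    def worker():
        try:
            box.append(verify(path))
        except Exception:
            box.append(PathStatus.FAILED)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(deadline)          # wait at most `deadline` seconds
    return box[0] if box else None


def probe_cycle(null_rpc_ok, processes_ok, verify, paths, probe_timeout=15.0):
    """Return (state, warnings) for one probe cycle of an Export Group."""
    warnings = []
    # Check 1: NULL RPC call to the local NFS server (stand-in callable).
    if not null_rpc_ok():
        return "DOWN", warnings
    # Check 2: critical MxFS processes are running (stand-in callable).
    if not processes_ok():
        return "DOWN", warnings
    # Check 3: each path gets half the probe timeout, but at least 5 seconds.
    deadline = max(probe_timeout / 2.0, 5.0)
    for path in paths:
        status = verify_with_deadline(verify, path, deadline)
        if status is PathStatus.MISSING:
            warnings.append(f"exported path does not exist: {path}")
        elif status is not PathStatus.OK:   # FAILED or timed out (None)
            return "DOWN", warnings
    return "UP", warnings
```

With the default probe timeout of 15 seconds, each path in this sketch
is allowed 7.5 seconds before the Export Group is reported as DOWN.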
Because the time required for the third check grows linearly with the
number of exported paths in the Export Group, it is important to set the
probe timeout value higher than the default of 15 seconds if the Export
Group contains more than a few exported paths. If the high-availability
monitor cannot obtain reasonable responsiveness from the system for
exported paths, neither can NFS clients. The third check therefore
verifies the practical availability of an exported path as well as its
literal availability.
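
As a rough aid for choosing a probe timeout, the linear relationship can
be turned into a simple estimate. The sketch below is a heuristic, not a
formula from the product: the function names, the per-path verification
time you supply, and the headroom factor of 2 are all assumptions made
for illustration. Only the 15-second default comes from this guide.

```python
DEFAULT_PROBE_TIMEOUT = 15.0  # seconds; the default probe timeout


def third_check_estimate(num_paths, seconds_per_path):
    """Estimated duration of the third check, assuming linear growth."""
    return num_paths * seconds_per_path


def suggested_probe_timeout(num_paths, seconds_per_path, headroom=2.0):
    """Never go below the default; otherwise allow `headroom` times the estimate."""
    estimate = third_check_estimate(num_paths, seconds_per_path) * headroom
    return max(DEFAULT_PROBE_TIMEOUT, estimate)


# A few paths fit comfortably within the default timeout.
print(suggested_probe_timeout(4, 0.5))    # 15.0
# Many paths call for a larger value.
print(suggested_probe_timeout(40, 0.5))   # 40.0
```

The per-path time should be measured on the node under realistic load,
since the estimate is only as good as that measurement.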
In the third check, the monitor issues only a warning if a path does not
exist (instead of considering the Export Group to be DOWN, as it does in
the situations described above). This approach avoids problems that can
arise if an exported path is deleted from a shared filesystem. Because the
purpose of the monitor is to determine whether the local node is healthy
enough to host the NFS service, rather than considering the entire Export
Group to be DOWN on a cluster-wide basis, the monitor will warn that it