Forcibly Unmounting NFS Filesystems

Preventative Steps
important note on the “soft” NFS mount option
Use of the “soft” option on NFS filesystems mounted for read/write access can be
dangerous if your applications are not designed to gracefully handle receiving a
timeout error for operations such as read() or write(). With certain applications,
allowing an NFS write() call to return an I/O error can lead to data corruption: if the
client application fails to check the return status of its write() calls, it may mistakenly
assume that its data was successfully written to the server when in fact the write()
call timed out. For this reason, the “hard” mount option (the default) is recommended
whenever write() operations will be performed on the mounted NFS filesystem.
If your NFS environment is one where server systems are frequently unavailable or
non-responsive and you know that all of your client-side applications are designed to
properly handle receiving an I/O error in response to a read() or a write() call, the
“soft” mount option may be a viable means of alleviating some of the frustration and
downtime associated with hung filesystems.
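For illustration only, the following mount commands sketch how an HP-UX client might request each behavior. The server name “serverA”, export path, and mount point are placeholders, and the timeo and retrans values are arbitrary examples rather than recommendations:

    # "hard" mount (the default): NFS operations retry indefinitely if the
    # server stops responding; "intr" allows blocked operations to be interrupted
    mount -F nfs -o hard,intr serverA:/export/data /data

    # "soft" mount: NFS operations return an I/O error once the retransmission
    # limit is exhausted (illustrative timeo/retrans values shown)
    mount -F nfs -o soft,timeo=600,retrans=5 serverA:/export/data /data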
use ServiceGuard to set up a highly available NFS server environment
One of the recurring themes throughout this paper is the idea that applications
accessing an NFS filesystem that is mounted with the “hard” option (default) will hang
indefinitely if the server becomes unavailable during their I/O attempt. In the example
on pages 7 through 10, a procedure was outlined showing how a secondary NFS server
can “masquerade” as the real NFS server. This temporary server receives the
retransmitted NFS requests from the clients and returns an ESTALE error because it is not
managing the same filesystems exported by the original server.
While this ESTALE error does allow the hanging client applications to unblock and
continue (or in some cases exit), a more desirable outcome would be if the secondary
NFS server could somehow export the same filesystems as the original server and thereby
take over responsibility for these NFS filesystems while the original server is unavailable.
If this were to occur, the client applications would operate normally despite the fact that
their NFS filesystems had migrated between servers. They might experience a temporary
interruption while the filesystems migrated, but they would then seamlessly resume their
operations with the new server.
This scenario is available today on HP-UX 11.0 and 11i using HP’s MC/ServiceGuard
product and its Highly Available NFS Server component.
MC/ServiceGuard allows you to create high availability clusters of HP 9000 servers. A
high availability computer system allows application services to continue in spite of a
hardware or software failure. Highly available systems protect users from software
failures as well as from failure of a system processing unit (SPU), disk, or local area
network (LAN) component. In the event that one component fails, the redundant
component takes over. Application services (individual HP-UX processes) are grouped
together in packages; in the event of a single service, node, network, or other resource
failure, MC/ServiceGuard can automatically transfer control of the package to another
node within the cluster, allowing services to remain available with minimal interruption.
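As a rough sketch of how such a transfer is administered (the package name “nfs_pkg” and node name “nodeB” below are hypothetical), the standard MC/ServiceGuard commands can be used to view cluster status and move a package to an adoptive node:

    # display the status of the cluster, its nodes, and its packages
    cmviewcl -v

    # halt the hypothetical NFS server package on its current node, then
    # run it on the adoptive node "nodeB"
    cmhaltpkg nfs_pkg
    cmrunpkg -n nodeB nfs_pkg

In practice, MC/ServiceGuard performs this package transfer automatically when it detects a failure; the manual commands above simply show the same movement performed by an administrator.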