Forcibly Unmounting NFS Filesystems

Preventative Steps
important note on the “soft” NFS mount option
Use of the “soft” option on NFS filesystems mounted for read/write access can be
dangerous if your applications are not designed to gracefully handle receiving a
timeout error for operations such as read() or write(). With certain applications,
allowing an NFS write() call to return an I/O error can lead to data corruption: if the
client application fails to check the return status of its write() calls, it may mistakenly
assume that its data was successfully written to the server when in fact the write()
call timed out. For this reason, the “hard” mount option (the default) is recommended
whenever write() operations will be performed on the mounted NFS filesystem.
If your NFS environment is one where server systems are frequently unavailable or
non-responsive and you know that all of your client-side applications are designed to
properly handle receiving an I/O error in response to a read() or a write() call, the
“soft” mount option may be a viable means of alleviating some of the frustration and
downtime associated with hung filesystems.
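For illustration only, the following mount commands sketch how an HP-UX client might request each behavior. The server name “serverA”, export path, and mount point are placeholders, and the timeo and retrans values are arbitrary examples rather than recommendations:

    # "hard" mount (the default): NFS operations retry indefinitely if the
    # server stops responding; "intr" allows blocked operations to be interrupted
    mount -F nfs -o hard,intr serverA:/export/data /data

    # "soft" mount: NFS operations return an I/O error once the retransmission
    # limit is exhausted (illustrative timeo/retrans values shown)
    mount -F nfs -o soft,timeo=600,retrans=5 serverA:/export/data /data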
use ServiceGuard to set up a highly available NFS server environment
One of the recurring themes throughout this paper is the idea that applications
accessing an NFS filesystem that is mounted with the “hard” option (default) will hang
indefinitely if the server becomes unavailable during their I/O attempt. In the example
on pages 7 through 10, a procedure was outlined showing how a secondary NFS server
can “masquerade” as the real NFS server. This temporary server receives the
retransmitted NFS requests from the clients and returns an ESTALE error because it is not
managing the same filesystems exported by the original server.
While this ESTALE error does allow the hanging client applications to unblock and
continue (or in some cases exit), a more desirable outcome would be if the secondary
NFS server could somehow export the same filesystems as the original server and thereby
take over responsibility for these NFS filesystems while the original server is unavailable.
If this were to occur, the client applications would operate normally despite the fact that
their NFS filesystems had migrated between servers. They might experience a temporary
interruption while the filesystems migrated, but they would then seamlessly resume their
operations with the new server.
This scenario is available today on HP-UX 11.0 and 11i using HP’s MC/ServiceGuard
product and its Highly Available NFS Server component.
MC/ServiceGuard allows you to create high availability clusters of HP 9000 servers. A
high availability computer system allows application services to continue in spite of a
hardware or software failure. Highly available systems protect users from software
failures as well as from failure of a system processing unit (SPU), disk, or local area
network (LAN) component. In the event that one component fails, the redundant
component takes over. Application services (individual HP-UX processes) are grouped
together in packages; in the event of a single service, node, network, or other resource
failure, MC/ServiceGuard can automatically transfer control of the package to another
node within the cluster, allowing services to remain available with minimal interruption.
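As a rough sketch of how such a transfer is administered (the package name “nfs_pkg” and node name “nodeB” below are hypothetical), the standard MC/ServiceGuard commands can be used to view cluster status and move a package to an adoptive node:

    # display the status of the cluster, its nodes, and its packages
    cmviewcl -v

    # halt the hypothetical NFS server package on its current node, then
    # run it on the adoptive node "nodeB"
    cmhaltpkg nfs_pkg
    cmrunpkg -n nodeB nfs_pkg

In practice, MC/ServiceGuard performs this package transfer automatically when it detects a failure; the manual commands above simply show the same movement performed by an administrator.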