VERITAS Storage Foundation 4.1 Release Notes

Software Issues
Note It is the failure condition in the third type of configuration that triggers the problem. Such
failures are rare and are not seen during normal operation of a healthy SAN. This is not a
time-out: no DMP activity occurs when the 10 minutes elapse, and DMP will wait as long as it
takes for the I/O to be returned by the lower layer. DMP checks the elapsed time of the I/O only
after it has been returned. If the elapsed time is greater than dmp_failed_io_threshold
seconds (600 by default), the error will be returned to VxVM without retries.
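For example, an I/O that is returned by the lower layer 700 seconds after it was issued exceeds the
600-second default, so its error is passed to VxVM without retries, even if the I/O could have
succeeded on another path.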
If the delay in returning the I/O is caused by a problem in the I/O path to the device rather than by
the device itself, DMP will incorrectly return the error to the VxVM layer instead of retrying the I/O
on another path. If the volume is mirrored, VxVM will satisfy the I/O from the other plex and
detach the plex that failed, which prevents the volume from hanging.
If the volume is not mirrored, the error will be passed to the File System or application layer. This
can result in the File System marking inodes for deletion when they are still valid. If raw volumes
are in use, the application might believe that the data on the disk is corrupted when it is actually
clean.
To prevent this possibility in situations where mirrored volumes are not used, the threshold should
be tuned to a sufficiently high value that is unlikely to be reached. For example, to change the value
of dmp_failed_io_threshold to 16 hours (57600 seconds), modify the value defined in
/kernel/drv/vxdmp.conf as shown here:
dmp_failed_io_threshold=57600
After changing the value, reboot the system.
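As a sanity check after editing the file, the new setting can be confirmed before the reboot; the
grep command below is only an illustration of such a check, using the path and value given above:
# grep dmp_failed_io_threshold /kernel/drv/vxdmp.conf
dmp_failed_io_threshold=57600
# reboot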
In situations in which mirrored volumes are in use, and an application time-out is being reached
while a valid plex still holds the data, the value of dmp_failed_io_threshold can be tuned
to a smaller value so that the I/O can succeed on the mirror without triggering an application
failure.
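As a sketch only, assuming an application that times out after five minutes, an entry such as the
following in /kernel/drv/vxdmp.conf (followed by a reboot, as above) would let DMP return the
error in time for VxVM to satisfy the I/O from the surviving plex; the 120-second figure is
illustrative and should be chosen to suit the application:
dmp_failed_io_threshold=120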
Cluster Functionality Issues
If a node leaves the cluster while a plex is being attached to a volume, the volume can remain in the
SYNC state indefinitely. To avoid this, after the plex attach completes, resynchronize the volume
manually with the following command:
# vxvol -f resync volume
[i20448]
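A minimal sketch of checking for and clearing this condition, assuming a disk group named mydg
and a volume named vol01 (both names are placeholders):
# vxprint -g mydg -vt vol01
# vxvol -g mydg -f resync vol01
The vxprint output shows whether the volume is still reported in the SYNC state before the manual
resynchronization is started.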
RAID-5 Volumes
VxVM does not currently support RAID-5 volumes in cluster-shareable disk groups.