Veritas Volume Manager 5.0 Administrator's Guide (September 2006)

370 Administering hot-relocation
How hot-relocation works
Hot-relocation allows a system to react automatically to I/O failures on redundant
(mirrored or RAID-5) VxVM objects, and to restore redundancy and access to those
objects. VxVM detects I/O failures on objects and relocates the affected subdisks to disks
designated as spare disks or to free space within the disk group. VxVM then reconstructs
the objects that existed before the failure and makes them redundant and accessible again.

When a partial disk failure occurs (that is, a failure affecting only some subdisks on a
disk), redundant data on the failed portion of the disk is relocated. Existing volumes on the
unaffected portions of the disk remain accessible.

Note: Hot-relocation is only performed for redundant (mirrored or RAID-5) subdisks on a
failed disk. Non-redundant subdisks on a failed disk are not relocated, but the system
administrator is notified of their failure.

Hot-relocation is enabled by default and takes effect without the intervention of the system
administrator when a failure occurs.
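
The rule in the note above can be pictured with a short Python sketch. It is an illustrative model only, not VxVM code: the Subdisk record, the plan_for_failed_disk helper, and the object names such as mydg01-01 are hypothetical, and the sketch assumes each subdisk simply carries a flag saying whether it belongs to a redundant (mirrored or RAID-5) volume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subdisk:
    """Hypothetical record for one subdisk on a disk (not a VxVM type)."""
    name: str
    failed: bool      # True if the failure affects this subdisk
    redundant: bool   # True if it belongs to a mirrored or RAID-5 volume


def plan_for_failed_disk(subdisks):
    """Split subdisks into (relocate, notify_only, unaffected) name lists."""
    relocate, notify_only, unaffected = [], [], []
    for sd in subdisks:
        if not sd.failed:
            # Partial failure: subdisks on the healthy part stay accessible.
            unaffected.append(sd.name)
        elif sd.redundant:
            # Redundant data can be rebuilt elsewhere, so it is relocated.
            relocate.append(sd.name)
        else:
            # Non-redundant data is not relocated; the failure is only reported.
            notify_only.append(sd.name)
    return relocate, notify_only, unaffected


# Example: a partial failure that affects two of three subdisks on one disk.
subdisks = [
    Subdisk("mydg01-01", failed=True, redundant=True),
    Subdisk("mydg01-02", failed=True, redundant=False),
    Subdisk("mydg01-03", failed=False, redundant=True),
]
relocate, notify_only, unaffected = plan_for_failed_disk(subdisks)
print(relocate, notify_only, unaffected)
```

In this example only mydg01-01 would be a candidate for relocation; mydg01-02 would be reported to the administrator, and mydg01-03 would be left in place.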
The hot-relocation daemon, vxrelocd, detects and reacts to VxVM events that signify
the following types of failures:

Disk failure             This is normally detected as a result of an I/O failure from a
                         VxVM object. VxVM attempts to correct the error. If the error
                         cannot be corrected, VxVM tries to access configuration
                         information in the private region of the disk. If it cannot
                         access the private region, it considers the disk failed.

Plex failure             This is normally detected as a result of an uncorrectable I/O
                         error in the plex (which affects subdisks within the plex). For
                         mirrored volumes, the plex is detached.

RAID-5 subdisk failure   This is normally detected as a result of an uncorrectable I/O
                         error. The subdisk is detached.

When vxrelocd detects such a failure, it performs the following steps:

1. vxrelocd informs the system administrator (and other nominated users) by
   electronic mail of the failure and which VxVM objects are affected.
   See “Partial disk failure mail messages” on page 372.
   See “Complete disk failure mail messages” on page 373.
   See “Modifying the behavior of hot-relocation” on page 384.

2. vxrelocd next determines if any subdisks can be relocated. vxrelocd looks for
   suitable space on disks that have been reserved as hot-relocation spares (marked
   spare) in the disk group where the failure occurred. It then relocates the subdisks to
   use this space.
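
The disk-failure check and the search for spare space described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the vxrelocd implementation: the function names, the Disk record, and the disk names are hypothetical; the sketch assumes that a disk whose private region is still readable is not treated as failed; and it takes the first matching spare, whereas the daemon's actual selection policy is not described in this section.

```python
from dataclasses import dataclass


def classify_io_failure(error_corrected, private_region_readable):
    """Model the disk-failure check as a three-way decision."""
    if error_corrected:
        return "corrected"
    if private_region_readable:
        # Assumption: the section does not spell out this branch; the sketch
        # treats the disk itself as still accessible.
        return "disk_accessible"
    # The private region cannot be read, so the disk is considered failed.
    return "disk_failed"


@dataclass(frozen=True)
class Disk:
    """Hypothetical record for a disk in a disk group (not a VxVM type)."""
    name: str
    disk_group: str
    spare: bool        # True if marked as a hot-relocation spare
    free_blocks: int


def find_spare_disk(disks, disk_group, needed_blocks):
    """Return the name of the first spare in disk_group with enough space, or None."""
    for disk in disks:
        if (disk.disk_group == disk_group and disk.spare
                and disk.free_blocks >= needed_blocks):
            return disk.name
    return None


disks = [
    Disk("mydg02", "mydg", spare=False, free_blocks=9000),       # not a spare
    Disk("otherdg01", "otherdg", spare=True, free_blocks=9000),  # wrong disk group
    Disk("mydg03", "mydg", spare=True, free_blocks=2000),
]
print(classify_io_failure(False, False))
print(find_spare_disk(disks, "mydg", 500))
```

The search is restricted to the disk group where the failure occurred, which is why the spare in otherdg is never considered in this sketch.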