VERITAS Volume Manager 3.1 Administrator's Guide

Chapter 8: Recovery
Failures and RAID-5 Volumes
NOTE You may need an additional license to use this feature.
Failures come in two varieties: system failures and disk failures. A
system failure means that the system has abruptly ceased to operate due
to an operating system panic or power failure. A disk failure means that
the data on some number of disks has become unavailable due to a
hardware failure (such as a head crash, electronics failure on disk, or disk
controller failure).
System Failures
RAID-5 volumes are designed to remain available, with a minimum of
disk space overhead, when disks fail. However, many forms of
RAID-5 can have data loss after a system failure. Data loss occurs
because a system failure causes the data and parity in the RAID-5
volume to become unsynchronized. Loss of sync occurs because the
status of writes that were outstanding at the time of the failure cannot
be determined.
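The loss of sync described above can be illustrated with a small sketch. It assumes the usual RAID-5 scheme in which the parity stripe unit is the byte-wise XOR of the data stripe units in the same stripe. The helper name xor_parity and the sample byte values are hypothetical and are not part of Volume Manager.

```python
from functools import reduce

def xor_parity(units):
    """Return the byte-wise XOR of equal-length stripe units (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))

# One stripe: three data stripe units and the parity computed from them.
data = [b"\x0f\x0f", b"\xf0\xf0", b"\xaa\xaa"]
parity = xor_parity(data)

# A write updates data unit 0, but the system fails before the
# matching parity update reaches disk.
data[0] = b"\xff\xff"

# After the restart, the parity on disk no longer matches the data.
in_sync = xor_parity(data) == parity
print(in_sync)  # False
```

After the restart nothing on disk records which of the two writes completed, which is why the state of the stripe cannot be determined by inspection alone.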
If a loss of sync occurs while a RAID-5 volume is being accessed, the
volume is described as having stale parity. The parity must then be
reconstructed by reading all the nonparity columns within each stripe,
recalculating the parity, and writing out the parity stripe unit in the
stripe. This must be done for every stripe in the volume, so it can take a
long time to complete.
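The reconstruction pass can be sketched as follows. This is a minimal in-memory model that assumes XOR parity. The stripe layout (a dict with "data" and "parity" keys) and the function names are illustrative assumptions, not the Volume Manager implementation.

```python
from functools import reduce

def xor_units(units):
    """Return the byte-wise XOR of equal-length stripe units (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))

def resync_parity(stripes):
    """Recompute and rewrite the parity of every stripe.

    Returns how many stripes actually held stale parity.
    """
    stale = 0
    for stripe in stripes:
        fresh = xor_units(stripe["data"])  # read all nonparity columns
        if fresh != stripe["parity"]:
            stale += 1
        stripe["parity"] = fresh           # write out the parity stripe unit
    return stale

volume = [
    {"data": [b"\x01", b"\x02", b"\x04"], "parity": b"\x07"},  # already in sync
    {"data": [b"\x10", b"\x20", b"\x40"], "parity": b"\x00"},  # stale parity
]
stale_count = resync_parity(volume)
print(stale_count)  # 1
```

The loop visits every stripe because, without a log, there is no record of which stripes had writes outstanding. That is why the cost grows with the size of the volume rather than with the amount of data in flight at the time of the failure.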
CAUTION While this resynchronization is going on, any failure of a disk within the
array causes the data in the volume to be lost. This applies only to
RAID-5 volumes without log plexes.
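A short sketch shows why a disk failure is so damaging while parity is stale. It assumes XOR parity, in which a lost stripe unit is rebuilt by XORing the surviving units with the parity unit. All names and byte values are hypothetical.

```python
from functools import reduce

def xor_units(units):
    """Return the byte-wise XOR of equal-length stripe units (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))

data = [b"\x0f", b"\xf0", b"\xaa"]
good_parity = xor_units(data)  # parity that matches the data
stale_parity = b"\x00"         # parity left behind by an interrupted write

# The disk holding data[1] fails; rebuild it from the survivors plus parity.
rebuilt_with_good = xor_units([data[0], data[2], good_parity])
rebuilt_with_stale = xor_units([data[0], data[2], stale_parity])

print(rebuilt_with_good == data[1])   # True
print(rebuilt_with_stale == data[1])  # False
```

With stale parity the rebuild still produces a value, but it is not the data that was on the failed disk, and nothing in the stripe signals the error.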
Besides the vulnerability to failure, the resynchronization process can
tax the system resources and slow down system operation.
RAID-5 logs reduce the damage that can be caused by system failures,
because they maintain a copy of the data being written at the time of the