Veritas Volume Manager 5.0.1 Troubleshooting Guide, HP-UX 11i v3, First Edition, November 2009

synchronization occurs because the status of writes that were outstanding at the

time of the failure cannot be determined.

If a loss of sync occurs while a RAID-5 volume is being accessed, the volume is

described as having stale parity. The parity must then be reconstructed by reading

all the non-parity columns within each stripe, recalculating the parity, and writing

out the parity stripe unit in the stripe. This must be done for every stripe in the

volume, so it can take a long time to complete.
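
The parity reconstruction described above can be illustrated with a minimal sketch. It assumes the usual RAID-5 scheme in which the parity stripe unit is the byte-wise XOR of the data stripe units in the same stripe. The names xor_blocks and rebuild_parity, and the in-memory list-of-stripes layout, are hypothetical and are not part of VxVM.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_parity(stripes):
    """Recompute the parity unit of every stripe (hypothetical helper).

    `stripes` is a list of stripes; each stripe is a list of the
    non-parity (data) stripe units, all of the same length.
    """
    # Every stripe must be visited, which is why a full
    # resynchronization scales with the size of the volume.
    return [xor_blocks(data_units) for data_units in stripes]


# Two stripes, three data columns each, two bytes per stripe unit.
stripes = [
    [b"\x01\x02", b"\x04\x08", b"\x10\x20"],
    [b"\xff\x00", b"\x0f\xf0", b"\x00\x00"],
]
parity = rebuild_parity(stripes)
```

Because the loop touches every stripe regardless of which ones were being written at the time of the failure, the cost is proportional to the volume length rather than to the amount of outstanding I/O.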

Warning: While the resynchronization of a RAID-5 volume without log plexes is

being performed, any failure of a disk within the volume causes its data to be lost.

Besides the vulnerability to failure, the resynchronization process can tax the

system resources and slow down system operation.

RAID-5 logs reduce the damage that can be caused by system failures, because

they maintain a copy of the data being written at the time of the failure. The

process of resynchronization consists of reading that data and parity from the

logs and writing it to the appropriate areas of the RAID-5 volume. This greatly

reduces the amount of time needed for a resynchronization of data and parity. It

also means that the volume never becomes truly stale. The data and parity for all

stripes in the volume are known at all times, so the failure of a single disk cannot

result in the loss of the data within the volume.
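
As a rough illustration of why log-based resynchronization is faster, the following toy model replays logged data and parity into a volume. It is a sketch under stated assumptions only: the dictionary-based volume, the (stripe, column, payload) log entry shape, and the function replay_log are all hypothetical and do not reflect the actual RAID-5 log format used by VxVM.

```python
def replay_log(volume, log_entries):
    """Write logged stripe units back to the volume (hypothetical model).

    volume:      dict mapping (stripe, column) -> bytes
    log_entries: iterable of (stripe, column, payload) tuples holding the
                 data and parity that were being written at failure time
    Returns the set of stripe numbers that had to be rewritten.
    """
    touched = set()
    for stripe, column, payload in log_entries:
        volume[(stripe, column)] = payload
        touched.add(stripe)
    return touched


# A volume with three stripes and three columns (two data, one parity).
volume = {(s, c): b"\x00" for s in range(3) for c in range(3)}

# Only stripe 1 was being written when the system failed.
log_entries = [
    (1, 0, b"\x0f"),  # new data for column 0
    (1, 2, b"\x0f"),  # matching parity: 0x0f XOR 0x00
]
touched = replay_log(volume, log_entries)
```

In this model only the stripes named in the log are rewritten, in contrast to a full parity resynchronization, which must read and rewrite every stripe in the volume.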

Disk failures

An uncorrectable I/O error occurs when a disk failure, a cabling fault, or another

problem makes the data on a disk unavailable. For a RAID-5 volume, this means

that a subdisk becomes unavailable. The subdisk cannot be used to hold data and

is considered stale and detached. If the underlying disk becomes available or is

replaced, the subdisk is still considered stale and is not used.

If an attempt is made to read data contained on a stale subdisk, the data is

reconstructed from data on all other stripe units in the stripe. This operation is

called a reconstructing-read. This is a more expensive operation than simply

reading the data and can result in degraded read performance. When a RAID-5

volume has stale subdisks, it is considered to be in degraded mode.
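
A reconstructing-read can be sketched as follows, assuming the usual RAID-5 XOR parity: the missing stripe unit equals the XOR of all the other stripe units in the stripe, data and parity alike. The function reconstructing_read and the list-based stripe layout are hypothetical and serve only as an illustration.

```python
from functools import reduce


def reconstructing_read(stripe_units, stale_index):
    """Rebuild the stripe unit at stale_index from the rest of the stripe.

    `stripe_units` holds every unit of one stripe (data and parity);
    the entry at `stale_index` is ignored because its subdisk is stale.
    """
    survivors = [u for i, u in enumerate(stripe_units) if i != stale_index]
    # Every surviving unit must be read, which is what makes this
    # more expensive than an ordinary read of a single stripe unit.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))


# One stripe: two data units and their parity (0x01^0x04, 0x02^0x08).
d0, d1 = b"\x01\x02", b"\x04\x08"
parity = b"\x05\x0a"

# The subdisk holding d1 is stale, so its contents are unavailable.
stripe = [d0, None, parity]
recovered = reconstructing_read(stripe, stale_index=1)
```

A normal read needs one stripe unit from one disk, whereas this read needs a unit from every surviving column, which is consistent with the degraded read performance described above.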

A RAID-5 volume in degraded mode can be recognized from the output of the

vxprint -ht command as shown in the following display:

V  NAME         RVG/VSET/CO  KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME  NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
