Choosing the Right Disk Technology in a High Availability Environment DRAFT Version 2.0, August 1996

Technical HPPA Newsletter # 205, August 1, 1994, "LVM Mirrored Disk Recovery".

Technical HPPA Newsletter # 217, May 23, 1995, "Recovery Cookbook".

DRAFT -- Revision 2.0, August 22, 1996

performance

backup strategy

total capacity requirements

power source redundancy

total distance

The need for on-line failed disk replacement versus scheduling downtime

Can up to one hour of downtime be scheduled to replace a failed disk? If yes,
then on-line failed disk replacement is not a requirement. Of course,
replacement depends on the availability of a spare disk mechanism and the
knowledge of how to do the replacement.

RAID disk arrays support true on-line replacement of failed disk mechanisms
only if they are configured in RAID level 1, 0/1, 3, or 5. The application can
continue to run because the master controller or storage processor limits
access to the failed mechanism. Also, the SCSI busses remain connected and
properly terminated due to the design of the disk array.
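
The two questions above (can an hour of downtime be scheduled, and does the
array's RAID level allow on-line replacement) can be sketched as a small
check. This is only an illustrative sketch, not part of any HP tool: the
function names, the constant, and the string encoding of RAID levels are
assumptions made for the example. The one-hour threshold and the list of
RAID levels come from the text.

```python
# Hypothetical helper names; only the RAID level list and the one-hour
# threshold are taken from the text.

# RAID levels in which a disk array supports on-line replacement of a
# failed mechanism, per the text.
ONLINE_REPLACEMENT_RAID_LEVELS = {"1", "0/1", "3", "5"}


def supports_online_replacement(raid_level: str) -> bool:
    """Return True if an array at this RAID level allows on-line replacement."""
    return raid_level in ONLINE_REPLACEMENT_RAID_LEVELS


def online_replacement_required(schedulable_downtime_hours: float) -> bool:
    """On-line replacement is not a requirement if up to one hour of
    downtime can be scheduled to replace a failed disk."""
    return schedulable_downtime_hours < 1.0


def array_config_is_adequate(raid_level: str,
                             schedulable_downtime_hours: float) -> bool:
    """Check an array configuration against the replacement requirement only.

    Spare availability and operator knowledge, which the text also
    mentions, are outside the scope of this sketch.
    """
    if not online_replacement_required(schedulable_downtime_hours):
        return True
    return supports_online_replacement(raid_level)
```

For example, under these assumptions a RAID 5 array passes the check even
when no downtime can be scheduled, while a RAID 0 array passes only when at
least one hour of downtime is available.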

With LVM-mirrored standalone disks or arrays, it is recommended that the
application be halted so that the simpler replacement procedure can be used.
Halting the application also ensures that no I/Os are occurring on the bus at
the time of the replacement. Inadvertent disconnection of the SCSI bus might
cause OS and/or data corruption if an I/O were attempted while the bus was
disconnected.

A good discussion of the correct procedure for replacing a failed disk
mechanism in an LVM-mirrored environment can be found in PA NEWS # 205. The
Australian Response Center has created a cookbook that can be accessed as
described in PA NEWS # 217.

The need for data redundancy

Data redundancy can be provided in several ways. One must decide first
whether one level of redundancy is sufficient. One-level data redundancy can
be provided with: