porarily disable the whole RAID-5 array. If replacing the bad disk solves the problem, i.e. the failure did not permanently damage data on other disks, then the RAID-5 array would recover normally. Similarly, if only the controller card was damaged, then replacing it would allow the RAID-5 array to recover normally. However, if more than one disk was damaged, especially if the file or directory structure information was damaged, the entire RAID-5 array would be damaged. The remaining failure mode would be for a disk to be delivering corrupted data. There is no protection for this inherent to RAID-5; however, a longitudinal parity check on the data, such as a checksum record count (CRC), could be built into event headers to flag the problem. Redundant copies of data that are very hard to recreate are still needed. RAID-5 does allow one to ignore backing up data that is only moderately hard to recreate.
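As a minimal sketch of the idea, a per-event checksum carried in the event header can flag silent disk corruption that RAID-5 itself would never notice. The header layout and the use of CRC-32 here are illustrative assumptions, not the format used in our experiment:

```python
import zlib

def make_event(payload: bytes) -> bytes:
    """Prepend a header holding the payload length and its CRC-32."""
    header = len(payload).to_bytes(4, "big") + zlib.crc32(payload).to_bytes(4, "big")
    return header + payload

def check_event(event: bytes) -> bool:
    """Return True if the payload still matches the CRC stored in the header."""
    length = int.from_bytes(event[:4], "big")
    stored_crc = int.from_bytes(event[4:8], "big")
    payload = event[8:8 + length]
    return zlib.crc32(payload) == stored_crc
```

Flipping even a single bit of the payload makes `check_event` return False, so corrupted events can be flagged and re-fetched from a redundant copy.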
II. Large Disks
In today’s marketplace, the cost per terabyte of disks with EIDE interfaces is about a third that of disks with SCSI (Small Computer System Interface), as illustrated in Fig. 1. The EIDE interface is limited to 2 drives on each bus and SCSI is limited to 7 (14 with wide SCSI). The only major drawback of EIDE disks is the limit on the length of the cable connecting the drives to the drive controller. This limit is nominally 18 inches; however, we have successfully used 24-inch cables [6]. Therefore, one is limited to 10 disks per box for an array (or perhaps 20 with a “double tower”). To get a large RAID array one needs to use large-capacity disk drives. There have been some problems with using large disks, primarily the maximum addressable size. We have addressed these problems in an earlier paper [7]. Because of these concerns, and because we wanted to put more drives into an array than could be supported by the motherboard, we opted to use PCI disk controller cards. We tested both Promise Technologies ULTRA 66 [8] and ULTRA 100 [9] disk controller cards, each of which supports four drives.
Using arrays of disk drives, as shown in Table I, the cost per terabyte is similar to that of Storage Technology tape silos. However, RAID-5 arrays offer much better granularity, since they are scalable down to a terabyte. For example, if you wanted to store 10 TB of data you would still have to pay about $1,000,000 for the tape silo but only $40,000 for a RAID-5 array. Thus, even small institutions can afford to deploy systems. Therefore, as seen in Fig. 1, “you can have your cake and eat it too”.
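The granularity argument above amounts to simple per-terabyte arithmetic; the following sketch just restates the quoted figures (a tape silo is a fixed $1,000,000 purchase, while a RAID-5 array scales with capacity):

```python
capacity_tb = 10
tape_silo_cost = 1_000_000  # dollars; the silo must be bought whole
raid5_cost = 40_000         # dollars; arrays scale down to ~1 TB

tape_per_tb = tape_silo_cost / capacity_tb  # $100,000 per TB stored
raid_per_tb = raid5_cost / capacity_tb      # $4,000 per TB stored
print(f"tape: ${tape_per_tb:,.0f}/TB, RAID-5: ${raid_per_tb:,.0f}/TB")
```

At 10 TB the effective cost per terabyte differs by a factor of 25, which is what puts such systems within reach of small institutions.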
III. RAID Arrays
There exist disk controllers that implement RAID-5 protocols right in the controller, for example 3ware’s Escalade 7850 [10], [11], which will handle up to eight EIDE drives. These controllers cost $600 and did not support disk drives larger than 137 Gigabytes [12]; so we focused our attention on software RAID-5 implementations [5], [13], which we tested extensively.
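Whether done in a controller or in software, the core of RAID-5 is block-level XOR parity: each stripe's parity block is the XOR of its data blocks, so any single lost block can be rebuilt from the survivors. A minimal sketch of that mechanism (not the Linux software-RAID driver itself, which also rotates parity across disks and handles striping):

```python
from functools import reduce

def parity_block(blocks):
    """XOR equal-sized blocks byte-by-byte to form the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, chunk) for chunk in zip(*blocks))

def rebuild_block(surviving_blocks, parity):
    """Reconstruct one lost block: XOR the parity with all surviving blocks."""
    return parity_block(surviving_blocks + [parity])

# One stripe across three data disks plus one parity disk.
d0, d1, d2 = b"\x01\x02", b"\x04\x08", b"\x10\x20"
p = parity_block([d0, d1, d2])
# If the disk holding d1 fails, its contents are recoverable:
assert rebuild_block([d0, d2], p) == d1
```

This is also why a second simultaneous disk failure is fatal: with two blocks missing from a stripe, the single parity equation no longer determines either one.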
[Figure: the “Data Storage Cake” — storage tiers (EMASS tape robot, EIDE disk arrays, fast SCSI disks, SDRAM memory, cache memory) compared by speed, ranging from 3 to 1500 Megabytes/s, and by media cost, ranging from $2 to $1500 per Gigabyte.]
Fig. 1
Historically, the speed and cost of data storage have increased as one moved from tape to disk to RAM. EIDE RAID-5 disk arrays add another layer to the data storage cake. One doesn’t have to worry as much about tape backup, except for data that is very hard to recreate. The chance of losing data is lower than with plain scratch disks. The cost of EIDE RAID-5 is close to that of tape robots, and the random access speed of disk is much faster.
A. Hardware
We have examined both Maxtor DiamondMax [14], [15], [16] and IBM DeskStar [17] hard disks. For RAID-5 the disk partitions must all be the same size. The only trouble we had was when Maxtor changed the capacity of the 80 GB disk from 81.9 GB to 80 GB. We had to repartition the 81.9 GB disks to 80 GB (plus a wasted partition of 1.9 GB). Fortunately this happened to a test array and not while trying to replace a failed disk in a working RAID-5 array. Disk manufacturers have recently decided to define one GB as 1000 MB, rather than 1024 MB. The drives we consider for use with a RAID-5 array are compared in Table I. In general, the internal I/O speed of a disk is proportional to its rotational speed and increases as a function of platter capacity.
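The decimal redefinition of the gigabyte matters when matching partition sizes across drives from different batches or vendors, because an "80 GB" drive holds noticeably fewer binary gigabytes than the label suggests. A quick illustration of the two conventions:

```python
# Decimal (manufacturer) vs. binary (operating-system) gigabytes.
GB_DECIMAL = 1000**3  # 1 GB as disk makers now define it
GB_BINARY = 1024**3   # 1 GB (GiB) as operating systems report it

drive_bytes = 80 * GB_DECIMAL           # an "80 GB" drive
as_reported = drive_bytes / GB_BINARY   # capacity the OS shows
print(f"80 GB decimal = {as_reported:.1f} GB binary")  # ~74.5
```

The roughly 7% gap is why partitioning every disk to a common, slightly undersized partition (as we did with the 81.9 GB drives) is safer than using each drive's full nominal capacity.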
When assembling an array we had to worry about the “spin-up” current draw on the 12V part of the power supply. With 8 disks in the array (plus the system disk) we would have exceeded the capacity of the power supply that came with our tower case, so we decided to add a second off-the-shelf power supply rather than buying a more expensive single supply. By using 2 power supplies we benefit from underloading the supplies. The benefits include both a longer lifetime and better cooling, since the heat generated is distributed over 2 supplies, each with its own cooling fans. We used the hardware shown in Table II for our array test. Many of the components we chose are generic; many