arXiv:hep-ex/0112003v2 5 Dec 2002
Redundant Arrays of IDE Drives
D. A. Sanders, Member, IEEE, L. M. Cremaldi, Member, IEEE, V. Eschenburg, C. N. Lawrence, C.
Riley, Member, IEEE, D. J. Summers, D. L. Petravick
Abstract: The next generation of high-energy physics
experiments is expected to gather prodigious amounts of
data. New methods must be developed to handle this data
and make analysis at universities possible. We examine
some techniques that use recent developments in commodity
hardware. We test redundant arrays of integrated drive
electronics (IDE) disk drives for use in offline high-energy
physics data analysis. IDE redundant array of inexpensive
disks (RAID) prices now equal the cost per terabyte
of million-dollar tape robots! The arrays can be scaled to
sizes affordable to institutions without robots and used when
fast random access at low cost is important. We also explore
three methods of moving data between sites: internet transfers,
hot pluggable IDE disks in FireWire cases, and writable
digital video disks (DVD-R).
Keywords: RAID, EIDE, FireWire.
I. Introduction
We report tests, using the Linux operating system,
of redundant arrays of integrated drive electronics
(IDE) disk drives for use in particle physics Monte Carlo
simulations and data analysis [1], [2]. Parts costs of total
systems using commodity IDE disks are now at the
$4000 per terabyte level. A revolution is in the making.
Disk storage prices have now decreased to the point where
they equal the cost per terabyte of 300 terabyte Storage
Technology tape silos. The disks, however, offer far better
granularity; even small institutions can afford to deploy
systems. The faster random access of disk versus tape is
another major advantage. Our tests include reports on
software redundant arrays of inexpensive disks Level 5
(RAID-5) systems running under Linux 2.4 using Promise
Ultra 100 disk controllers. RAID-5 protects data in case of
a catastrophic single disk failure by providing parity bits.
Journaling file systems are used to allow rapid recovery
from system crashes. We also report on using FireWire
(IEEE 1394) to PCI (Peripheral Component Interconnect)
interfaces. FireWire PCI cards allow sixty-three devices
(e.g., a combination of computers and disks) per card. The
maximum FireWire bus speed is currently limited to 400
megabits per second. FireWire is also hot pluggable.
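To give a feel for the 400 megabits per second bus limit, the sketch below converts it to bytes per second and estimates the best-case time to move a data volume over a single FireWire bus. The function name and the decimal definition of a terabyte are assumptions made for illustration. Protocol and disk overheads are ignored, so real transfers would take longer.

```python
def best_case_transfer_seconds(n_bytes, bus_megabits_per_s=400.0):
    """Lower bound on transfer time at the raw bus bit rate (no overheads)."""
    bytes_per_second = bus_megabits_per_s * 1e6 / 8.0
    return n_bytes / bytes_per_second


TERABYTE = 1e12  # decimal terabyte; an assumption for this illustration

seconds = best_case_transfer_seconds(TERABYTE)
print(f"{seconds / 3600:.1f} hours per terabyte at the raw bus rate")
```

At the raw rate this works out to roughly five and a half hours per terabyte.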
This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version
will be superseded. Manuscript submitted to IEEE Transactions On
Nuclear Science, November 25, 2001; revised March 19, 2002. This
work was supported in part by the U.S. Department of Energy under
Grant Nos. DE-FG05-91ER40622 and DE-AC02-76CH03000.
D. A. Sanders, L. M. Cremaldi, V. Eschenburg, C. N. Lawrence,
C. Riley, and D. J. Summers are with the University of Mississippi,
Department of Physics and Astronomy, University, MS 38677 USA
(telephone: 662-915-5438, e-mail: sanders@phy.olemiss.edu).
D. L. Petravick is with the Fermi National Accelerator Laboratory,
CD-Integrated Systems Development, MS 120, Batavia, IL 60510-0500
USA (telephone: 630-840-3935, e-mail: petravick@fnal.gov).
Publisher Item Identifier 10.1109/TNS.2002.801699
Our data analysis strategy is to encapsulate data and
CPU processing power together. Data is stored on many
PCs. Analysis of a particular part of a data set takes place
locally on, or close to, the PC where the data resides. The
network backbone is only used to put results together. If
the I/O overhead is moderate and analysis tasks need more
than one local CPU to plow through data, then each of
these disk arrays could be used as a local file server to a
few computers sharing a local ethernet switch. These commodity
8-port gigabit ethernet switches would be combined
with a single high end, fast backplane switch allowing the
connection of a thousand PCs. We have also successfully
tested using Network File System (NFS) software to connect
our disk arrays to computers that cannot run Linux
2.4.
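The scale of such a two-tier network can be estimated with a short calculation. The sketch below counts how many 8-port switches would be needed for a given number of PCs. It assumes that each small switch gives up one port as an uplink to the backplane switch; that figure is not stated in the text, and the function name is hypothetical.

```python
import math


def edge_switches_needed(n_hosts, ports_per_switch=8, uplinks_per_switch=1):
    """Number of small switches needed if some ports are reserved as uplinks.

    The single-uplink default is an assumption made for illustration.
    """
    host_ports = ports_per_switch - uplinks_per_switch
    if host_ports <= 0:
        raise ValueError("no ports left for hosts")
    return math.ceil(n_hosts / host_ports)


n_switches = edge_switches_needed(1000)
print(n_switches, "edge switches, so the backplane switch needs as many ports")
```

Under these assumptions a thousand PCs would need 143 small switches. Each disk array acting as a file server would also occupy a port.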
We examine three ways of moving data between sites: internet
transfers, hot pluggable IDE disks in FireWire cases,
and writable digital video disks (DVD-R). Writable 4.7 GB
DVD-R disks are now available for $5. They can be read
by $60 DVD-ROM drives and written by the $500 Pioneer
DVR-A03 drive [3].
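The DVD-R figures quoted above translate into a media cost per terabyte with simple arithmetic. The following sketch uses the 4.7 GB capacity and $5 price from the text. The function name and the choice of a decimal terabyte (1000 GB) are assumptions, and drive costs and handling time are left out.

```python
import math

DVD_CAPACITY_GB = 4.7  # capacity per disc, from the text
DVD_PRICE_USD = 5.0    # price per disc, from the text


def dvd_media_for(data_gb):
    """Return (number of discs, media cost in USD) needed to hold data_gb."""
    discs = math.ceil(data_gb / DVD_CAPACITY_GB)
    return discs, discs * DVD_PRICE_USD


discs, cost = dvd_media_for(1000.0)  # one decimal terabyte (assumption)
print(f"{discs} discs, ${cost:.0f} of media per terabyte")
```

For one terabyte this gives 213 discs and $1065 of media.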
RAID [4] stands for Redundant Array of Inexpensive
Disks. Many industry offerings meet all of the qualifications
except the inexpensive part, severely limiting the size
of an array for a given budget. This may change. The
different RAID levels can be defined as follows:
RAID-0: “Striped.” Disks are combined into one physical
device where reads and writes of data are done in parallel.
Access speed is fast but there is no redundancy.
RAID-1: “Mirrored.” Fully redundant, but the size is
limited to the smallest disk.
RAID-4: “Parity.” For N disks, 1 disk is used as a parity
disk and the remaining N - 1 disks are combined. Protects
against a single disk failure but access speed is slow since
you have to update the parity disk for each write.
RAID-5: “Striped-Parity.” As with RAID-4, the effective
size is that of N - 1 disks. However, since the parity
information is also distributed evenly among the N drives,
the bottleneck of having to update the parity disk for each
write is avoided. Protects against a single disk failure and
the access speed is fast.
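The parity idea behind RAID-4 and RAID-5 can be illustrated in a few lines. The sketch below computes a parity block as the byte-wise XOR of the data blocks in one stripe, then rebuilds a lost block from the survivors. It is a toy model and not the Linux software RAID implementation. The block contents, the four-disk array, the function names, and the particular parity rotation are assumptions chosen for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk_for_stripe(stripe, n_disks):
    """Disk holding parity for a stripe; one possible rotation (assumption)."""
    return (n_disks - 1 - stripe) % n_disks


# One stripe on a hypothetical 4-disk array: 3 data blocks plus 1 parity block.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[1]; rebuild it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# Parity moves from disk to disk, so no single disk absorbs every parity write.
print([parity_disk_for_stripe(s, 4) for s in range(5)])
```

Because XOR is its own inverse, any single missing block in a stripe can be recovered this way. Two simultaneous failures in the same stripe cannot.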
RAID-5, using enhanced integrated drive electronics
(EIDE) disks under Linux software, is now available [5].
Redundant disk arrays do provide protection in the most
likely single disk failure case, that in which a single disk
simply stops working. This removes a major obstacle to
building large arrays of EIDE disks. However, RAID-5
does not totally protect against other types of disk failures.
RAID-5 will offer limited protection in the case where a single
disk stops working but causes the whole EIDE bus to
fail (or the whole EIDE controller card to fail), but only
temporarily stops them from functioning. This would tem-
