We have performed a few simple speed tests. The first
was “hdparm -tT /dev/xxx”. This test simply reads a 64 MB
chunk of data and measures the speed. On a single drive we
saw read/write speeds of about 30 MB/s. On the whole array
we saw a drop to 28 MB/s. When we tried writing a text file
using a simple FORTRAN program (we wrote “All work and no
play make Jack a dull boy” 10^8 times), the speed was
22.34 MB/s.¹ While mounted via NFS over 100 Mb/s ethernet
the speed was 2.12 MB/s, limited by both the ethernet speed
and communication overhead. In the past [2], we have been
able to get a much higher fraction of the rated ethernet
bandwidth by using the lower level TCP/IP socket protocol
[33] in place of the higher level NFS protocol. TCP/IP
sockets are more cumbersome to program, but are much
faster.
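The file-writing test can be approximated in C. This is a minimal sketch, not the FORTRAN program used for the measurement above: the function name write_throughput and its parameters are hypothetical, timing uses the C11 timespec_get call (wall-clock time), and no fsync is issued, so short runs mostly measure the page cache rather than the disks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Write `count` copies of `line` to `path` and return the observed
 * throughput in MB/s (10^6 bytes per second), or -1.0 on error.
 * Returns 0.0 if the elapsed time is too short to measure. */
double write_throughput(const char *path, const char *line, long count)
{
    struct timespec t0, t1;
    size_t len;
    FILE *fp;

    assert(path != NULL && line != NULL);
    len = strlen(line);
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1.0;

    timespec_get(&t0, TIME_UTC);
    for (long i = 0; i < count; i++) {
        if (fwrite(line, 1, len, fp) != len) {
            fclose(fp);
            return -1.0;
        }
    }
    /* fclose() flushes the stdio buffer, so it is part of the timing. */
    if (fclose(fp) != 0)
        return -1.0;
    timespec_get(&t1, TIME_UTC);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    double bytes = (double)len * (double)count;
    return secs > 0.0 ? bytes / secs / 1e6 : 0.0;
}
```

Calling write_throughput with a path on the array, the quoted line plus a newline, and a large count gives a figure comparable in kind to the 22.34 MB/s above; the count should be large enough that the file exceeds available memory.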
We also tested what actually happens when a disk fails
by turning the power off to one disk in our RAID-5 array.
One could continue to read and write files, but in a
“degraded” mode, that is, without the parity safety net.
When a blank disk was added to replace the failed disk,
again one could continue to read and write files, in a mode
where the disk access speed was reduced while the system
rebuilt the missing disk as a background job. This speed
reduction in disk access was due to the fact that the parity
regeneration is a major disk access in its own right. For
more details, see reference [13].
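The cost of the rebuild follows from how RAID-5 parity works: the parity block is the byte-wise XOR of the data blocks in a stripe, so regenerating one lost block means reading every surviving block. The following is a minimal sketch for a single stripe, assuming one dedicated parity block (real arrays rotate parity across the disks). The function names are hypothetical and this is not the Linux software RAID driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compute the parity block as the byte-wise XOR of `ndisks` data blocks. */
void raid5_parity(const unsigned char *const *data, size_t ndisks,
                  size_t block_len, unsigned char *parity)
{
    memset(parity, 0, block_len);
    for (size_t d = 0; d < ndisks; d++)
        for (size_t i = 0; i < block_len; i++)
            parity[i] ^= data[d][i];
}

/* Rebuild the block of failed disk `missing` from the parity block and
 * the surviving data blocks. data[missing] is never read. */
void raid5_rebuild(const unsigned char *const *data, size_t ndisks,
                   size_t missing, const unsigned char *parity,
                   size_t block_len, unsigned char *out)
{
    assert(missing < ndisks);
    memcpy(out, parity, block_len);
    for (size_t d = 0; d < ndisks; d++) {
        if (d == missing)
            continue;               /* the failed disk cannot be read */
        for (size_t i = 0; i < block_len; i++)
            out[i] ^= data[d][i];
    }
}
```

Because raid5_rebuild touches every surviving disk for every stripe, the rebuild competes with ordinary reads and writes, which is consistent with the slowdown described above.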
The performance of Linux IDE software drivers is improving.
The latest standards [34] include support for command
overlap, READ/WRITE direct memory access QUEUED commands,
scatter/gather data transfers without intervention of the
CPU, and elevator seeks. Command overlap is a protocol that
allows devices that require extended command time to perform
a bus release so that commands may be executed by the other
device on the bus. Command queuing allows the host to issue
concurrent commands to the same device. Elevator seeks
minimize disk head movement by optimizing the order of I/O
commands.
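The idea behind elevator seeks can be illustrated with a generic one-way sweep: pending requests at or beyond the current head position are served in ascending order, then the sweep restarts from the lowest remaining address. This is only a sketch of the general technique; the text above does not describe the algorithm the Linux drivers actually use, and the name elevator_order is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int cmp_block(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* Reverse the half-open range v[lo, hi). */
static void reverse(unsigned long *v, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        unsigned long t = v[lo];
        v[lo++] = v[--hi];
        v[hi] = t;
    }
}

/* Reorder `n` pending block addresses in place: first those at or
 * beyond `head` in ascending order, then the lower ones, also ascending. */
void elevator_order(unsigned long *req, size_t n, unsigned long head)
{
    size_t split = 0;

    assert(req != NULL || n == 0);
    if (n == 0)
        return;
    qsort(req, n, sizeof *req, cmp_block);

    /* Find the first request at or beyond the head position. */
    while (split < n && req[split] < head)
        split++;

    /* Rotate left by `split` using three reversals. */
    reverse(req, 0, split);
    reverse(req, split, n);
    reverse(req, 0, n);
}
```

For pending blocks 90, 10, 70, 40 and a head at block 50, the resulting service order is 70, 90, 10, 40: one sweep upward and one return, instead of four long seeks in arrival order.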
We did encounter a few problems. We had to modify
“MAKEDEV” to allow for more than eight IDE devices,
that is, to allow for disks beyond “/dev/hdg”. For MAKEDEV
version 2.x one would have to actually modify the script;
however, for version 3.x we just had to modify the file
“/etc/makedev.d/ide”.
Another problem was the 2 GB file size limit. Older
operating system and compiler libraries used a 32 bit “long
integer” for addressing files; therefore, they could not
normally address files larger than 2 GB (2^31 bytes). There
are patches to the Linux 2.4 kernel and glibc but there are
still some problems with NFS and not all applications use
these patches.
¹ Since we originally submitted this paper we have tested a new
Asus motherboard (the A7M266 with the AMD 761 North Bridge
chip) and got significant increases in speed for the RAID-5 array.

We have found that the current underlying file systems
(ext2, ext3, reiserfs) do not have a 2 GB file size limit.
The limit for ext2/ext3 is far larger, on the order of
terabytes depending on the block size. The 2.4 kernel series
supports large files (64-bit offsets). Current versions of
GNU libc support large files. However, by default the
32-bit offset interface is used. To use 64-bit offsets, C/C++
code must be recompiled with the following as the first line:
#define _FILE_OFFSET_BITS 64
or the code must use the *64 functions (e.g., open becomes
open64) if they exist. This functionality is not included in
GNU FORTRAN (g77); however, it should be possible to write
a simple wrapper C program to replace the OPEN statement
(perhaps called open64). We have succeeded in writing files
larger than 2 GB using a simple C program with
“#define _FILE_OFFSET_BITS 64” as the first line. This
works over NFS version 3 but not version 2.
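A small check of large-file support might look like the following sketch. It is not the program used for the test above: the function name write_past_2gb is hypothetical, and it creates a sparse file by seeking just past 2^31 bytes and writing a single byte, so it verifies the offset interface without filling the disk.

```c
#define _FILE_OFFSET_BITS 64   /* must come before every #include */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek just past the 2 GB boundary in a new file, write one byte, and
 * return the resulting file size, or -1 on any failure. */
off_t write_past_2gb(const char *path)
{
    const off_t target = ((off_t)1 << 31) + 4096;  /* beyond 2^31 bytes */
    off_t end;
    int fd;

    /* With _FILE_OFFSET_BITS set to 64, off_t is 64 bits wide. */
    assert(sizeof(off_t) >= 8);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (lseek(fd, target, SEEK_SET) == (off_t)-1 ||
        write(fd, "x", 1) != 1) {
        close(fd);
        return -1;
    }
    end = lseek(fd, 0, SEEK_END);
    close(fd);
    return end;
}
```

On a system with working large-file support the function returns 2^31 + 4097. Without the #define on a 32-bit system, off_t would be 32 bits wide and the target offset could not be represented.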
While RAID-5 can recover from a hardware failure, there
is no protection against accidental deletion of files. To
address this problem we suggest a simple script to replace
the “rm” command. Rather than deleting files it would move
them to a “/raid/Trash” or, better yet, a “/raid/.Trash”
directory on the RAID-5 disk array (similar to the “Trash
can” in the Macintosh OS). The system administrator
could later purge them as space is needed using an
algorithm based on criteria such as file size, file age, and
user quota.
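The core of such an “rm” replacement is a rename into the trash directory. The sketch below shows that step in C rather than as a shell script; the function name move_to_trash, the timestamp suffix used to avoid name collisions, and the creation of the trash directory on demand are all assumptions made for illustration. Since rename() only works within one file system, the trash directory must live on the array itself, as suggested above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Move `path` into `trash_dir` instead of deleting it.
 * Returns 0 on success, -1 on failure. */
int move_to_trash(const char *path, const char *trash_dir)
{
    char dest[4096];
    const char *base;
    int n;

    assert(path != NULL && trash_dir != NULL);

    /* Create the trash directory if it does not exist yet. */
    if (mkdir(trash_dir, 0700) != 0 && errno != EEXIST)
        return -1;

    /* Keep only the last path component as the file name. */
    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;
    if (*base == '\0')
        return -1;                  /* path ends in '/' */

    /* Append the current time so that repeated deletions of the same
     * name do not overwrite each other (unless within one second). */
    n = snprintf(dest, sizeof dest, "%s/%s.%ld",
                 trash_dir, base, (long)time(NULL));
    if (n < 0 || (size_t)n >= sizeof dest)
        return -1;

    return rename(path, dest) == 0 ? 0 : -1;
}
```

A purge policy based on file size, age, and quota would be a separate job run by the administrator and is not sketched here.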
IV. FireWire
FireWire was developed by Apple and is an IEEE standard
(IEEE 1394) defining a high speed serial bus. This bus
is also named “i.Link” by Sony. It is referred to as IEEE
1394 or just 1394 in the Linux world [35]. It is a serial
bus similar in principle to the Universal Serial Bus (USB),
but runs at speeds of up to 400 Mb/s and is intended to
replace the SCSI bus. Unlike USB, it is not centered around
a PC (i.e., there may be none or multiple PCs on the same
bus). The FireWire bus allows up to sixty-three devices per
chain. Also, because it has a mode of transmission which
guarantees bandwidth, it is used for digital video cameras
and similar devices. In general it is hot swappable.
Two main chipset families are supported under Linux:
Texas Instruments PCILynx/PCILynx2 and OHCI compliant
chips (produced by various companies). FireWire drivers are
now included in RedHat and other distributions and are
supported in the 2.4.x kernel (with patches for the 2.2.x
kernel). However, not all drivers are included in a standard
installation, nor are they a default option when upgrading
the kernel. The driver for storage devices such as hard
disks (SBP-2) was not included in kernels until the 2.4.7
kernel. For these reasons, we are including the basic
instructions here.
We got FireWire working on a Linux box by following
these steps:
1. We used an inexpensive PCI FireWire controller, for a
cost of $25. It was an OHCI-1394 card with a VIA controller.
2. The kernel used was Linux 2.4.12 as released by Linus
Torvalds and Alan Cox’s -ac3 patch. Alan’s patches can
be downloaded at
http://www.bz2.us.kernel.org/pub/linux/kernel/people/alan/linux-2.4/.
The -ac series is basically