hot plug RAID memory technology for fault tolerance and scalability
figure 1: server outages during a one-year period due to memory failures [1]
[Figure: cumulative failures per 10,000 systems (logarithmic scale, 1 to 1,000,000) versus memory capacity (64 MB, 1 GB, 16 GB), plotted for parity and ECC. Annotations: "ECC for large memory systems is only about as good as parity checking is for smaller capacities"; "Nearly 50% system failures per year". Data labels: 120%, 75%, 4.6%, 48%, 3%, 0.3%.]
hot plug RAID memory
To help meet the availability and scalability demands of today’s eBusiness world, HP
developed a solution that lets customers use industry-standard memory technology
while increasing server fault tolerance, memory capacity, and server availability.
Hot Plug RAID Memory provides far greater protection than standard ECC-based
solutions and detects errors that would otherwise go undetected (table 1).
table 1: comparison of protection provided by parity checking, ECC, and Hot Plug RAID Memory
Error Condition      Parity       Standard ECC   RAID Memory
Single-bit           Detect       Correct        Correct
Double-bit           Unreliable   Detect         Correct
4-bit DRAM           Unreliable   Detect         Correct
8-bit DRAM           Unreliable   Unreliable     Correct
Greater than DRAM    Unreliable   Unreliable     Detect
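The parity column in the first two rows of table 1 can be illustrated with a minimal sketch. It assumes a single even-parity bit stored alongside one 8-bit word; the function names and the word width are hypothetical choices for illustration and are not taken from any particular memory controller. A single flipped bit changes the parity and is detected, while two flipped bits cancel out and pass the check unnoticed.

```python
def parity_bit(word: int) -> int:
    """Return the even-parity bit for a word: 1 if the count of 1 bits is odd."""
    return bin(word).count("1") % 2


def parity_check_passes(word: int, stored_parity: int) -> bool:
    """True when the recomputed parity matches the stored parity bit."""
    return parity_bit(word) == stored_parity


original = 0b10110010
stored = parity_bit(original)  # parity written along with the data

single_bit_error = original ^ 0b00000100  # one bit flipped
double_bit_error = original ^ 0b00010100  # two bits flipped

# A failed check means the error was detected (parity cannot correct it).
single_detected = not parity_check_passes(single_bit_error, stored)
double_detected = not parity_check_passes(double_bit_error, stored)
```

Under these assumptions `single_detected` is true and `double_detected` is false, which is why parity is listed as unreliable for anything beyond a single-bit error.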
For years, the computer industry has used redundant array of independent disks (RAID)
technology to provide fault tolerance and high availability for disk drive subsystems in
servers. The technology used in Hot Plug RAID Memory is conceptually similar to RAID
storage technology. However, in the context of the memory solution, RAID stands for
redundant array of industry-standard DIMMs.
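The text describes the memory technology only as conceptually similar to RAID storage, so the following is a sketch of the general RAID idea and not of HP's implementation. Data is spread across several units and one extra unit holds the XOR of the others, so the contents of any single failed unit can be rebuilt from the survivors. The layout of four data units plus one parity unit, the two-byte blocks, and all names are assumptions made for illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_parity(blocks: List[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild the single missing block (marked None) from the survivors and parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("exactly one block must be missing")
    survivors = [b for b in blocks if b is not None]
    # XOR of all surviving blocks and the parity block yields the lost block.
    recovered = xor_parity(survivors + [parity])
    result = list(blocks)
    result[missing[0]] = recovered
    return result


# Four hypothetical data units and their parity unit.
data = [b"\x10\x20", b"\x0f\xf0", b"\xaa\x55", b"\x01\x02"]
parity = xor_parity(data)

# Simulate losing the second unit, then rebuild it.
damaged = [data[0], None, data[2], data[3]]
restored = rebuild_missing(damaged, parity)
```

The same XOR relationship that rebuilds a failed disk in a storage array is what lets a redundant unit stand in for a failed one here; how the real product detects which unit has failed is outside the scope of this sketch.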
[1] Source: Timothy J. Dell, “A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory,” IBM Microelectronics Division – Rev. 11/19/97