Installation guide

IBM Eserver xSeries 366 Technical Introduction 13
by default.) Because all mirroring activities are handled by the hardware, memory
mirroring is operating system independent.
When memory mirroring is enabled, certain restrictions exist with respect to placement
and size of memory DIMMs and the placement and removal of memory cards. These
topics are discussed in “Memory mirroring” on page 13.
• Chipkill™ memory
Chipkill is integrated into the XA-64e chipset, so it does not require special Chipkill DIMMs
and is transparent to the operating system. Combined with Memory ProteXion and
Active Memory, Chipkill gives the x366 very high reliability in the memory
subsystem.
When a memory chip failure occurs, Memory ProteXion transparently handles the
rerouting of data around the failed component as described above. However, if a further
failure occurs, the Chipkill component in the memory controller reroutes data. The
memory controller provides memory protection similar in concept to disk array striping with
parity, writing the memory bits across multiple memory chips on the DIMM. The controller
can reconstruct the “missing” bit from the failed chip and continue working as usual. One
of these additional failures can be handled per memory port (a total of four Chipkill
recoveries).
• Hot-add and hot-swap memory
The x366 supports replacing failed DIMMs while the server is still running. This
hot-swap support works in conjunction with memory mirroring. The server also supports
adding memory while the server is running. Adding memory requires operating
system support.
These two features are mutually exclusive. Hot-add requires that memory mirroring be
disabled and hot-swap requires that memory mirroring be enabled. These features are
discussed in “Hot-swap memory” on page 14 and “Hot-add memory” on page 15.
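The striping-with-parity analogy used for Chipkill above can be illustrated with a short Python sketch. It is only an illustration of the concept: it assumes a single XOR parity value held on a hypothetical extra chip, in the manner of a disk array. It does not represent the error-correcting code that the XA-64e memory controller actually uses, which this text does not describe. The function names and the 4-bit slice width are invented for the example.

```python
from functools import reduce
from operator import xor


def parity(chunks):
    """XOR of all data slices; assumed to live on an extra, hypothetical parity chip."""
    return reduce(xor, chunks, 0)


def reconstruct(chunks, parity_value, failed_index):
    """Rebuild the slice held by the failed chip from the surviving slices and the parity."""
    survivors = [c for i, c in enumerate(chunks) if i != failed_index]
    return reduce(xor, survivors, parity_value)


# One 4-bit slice of a memory word per chip (illustrative values only).
data_chips = [0b1011, 0b0110, 0b1110, 0b0001]
p = parity(data_chips)

# Pretend chip 2 has failed; its contents are rebuilt from the other chips.
recovered = reconstruct(data_chips, p, failed_index=2)
assert recovered == data_chips[2]
```

Because XOR is its own inverse, combining the parity with every surviving slice leaves exactly the missing slice. A single parity value of this kind can recover from only one failed chip at a time.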
In addition, to maintain the highest levels of system availability, if a memory error is detected
during POST or memory configuration, the server can automatically disable the failing
memory bank and continue operating with reduced memory capacity. You can manually
re-enable the memory bank after the problem is corrected by using the Setup menu in the
BIOS.
Memory mirroring, Chipkill, and Memory ProteXion provide multiple levels of redundancy to
the memory subsystem. Combining Chipkill with Memory ProteXion allows up to two memory
chip failures per memory port on the x366, for a total of eight sustained failures.
1. The first failure detected by the Chipkill algorithm on each port does not generate a light
path diagnostics error, as Memory ProteXion recovers from the problem automatically.
2. Each memory port could then sustain a second chip failure without shutting down.
3. Provided that memory mirroring is enabled, the third chip failure on that port would
generate an alert and take the DIMM offline, but keep the system running out of the
redundant memory bank.
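The escalation described in the three steps above can be modeled as a small per-port state machine. The Python sketch below is a reading aid under stated assumptions, not firmware behavior: the class name, method name, and outcome strings are hypothetical, and the count of four memory ports is inferred from the totals quoted in the text (four Chipkill recoveries, eight sustained failures). The text does not say what happens on a third failure when mirroring is disabled, so the sketch reports that case as unspecified.

```python
from dataclasses import dataclass

MEMORY_PORTS = 4          # inferred from the text: eight failures at two per port
TOLERATED_PER_PORT = 2    # one handled by Memory ProteXion, one by Chipkill


@dataclass
class MemoryPort:
    """Hypothetical model of one memory port's chip-failure history."""
    mirroring_enabled: bool
    chip_failures: int = 0

    def record_chip_failure(self) -> str:
        """Register one more chip failure and return the outcome described in the text."""
        self.chip_failures += 1
        if self.chip_failures == 1:
            return "Memory ProteXion recovers; no light path diagnostics error"
        if self.chip_failures == 2:
            return "Chipkill recovers; port keeps running"
        if self.mirroring_enabled:
            return "alert generated; DIMM offline; running from redundant bank"
        return "unspecified in the text"


port = MemoryPort(mirroring_enabled=True)
outcomes = [port.record_chip_failure() for _ in range(3)]
total_sustained = MEMORY_PORTS * TOLERATED_PER_PORT
```

With these assumptions, `total_sustained` works out to the eight failures quoted above, and the third entry in `outcomes` corresponds to step 3 of the list.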
Memory mirroring
Memory mirroring is available on the x366 for increased fault tolerance. Memory mirroring is
operating system independent, since all mirroring activities are handled by the hardware.