Technical Product Specification
Intel® Server Boards S4600LH2/T2 TPS
Revision 2.0
3.2.2.5.4 Patrol Scrubbing for ECC Memory
Patrol scrubs are intended to ensure that data with a correctable error does not remain in DRAM long enough
for further corruption to turn it into an uncorrectable error.
3.2.2.5.5 Rank Sparing Mode
Rank Sparing Mode enhances the system’s RAS capability by “swapping out” failing ranks of DIMMs. Rank
Sparing is strictly channel and rank oriented. Each memory channel is a Sparing Domain.
For Rank Sparing to be available as a RAS option, there must be two or more single-rank or dual-rank DIMMs,
or at least one quad-rank DIMM, installed on each memory channel.
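
The population rule above can be sketched as a simple check. This is a minimal illustration, not BIOS code: the function names and the list-of-rank-counts representation are hypothetical, and the sketch assumes that a channel with no DIMMs does not qualify, which the text does not state explicitly.

```python
def channel_supports_sparing(dimm_ranks):
    """dimm_ranks: rank count (1, 2, or 4) of each DIMM on one channel."""
    quad = sum(1 for r in dimm_ranks if r == 4)
    single_or_dual = sum(1 for r in dimm_ranks if r in (1, 2))
    # Either at least one quad-rank DIMM, or two or more single/dual-rank DIMMs.
    return quad >= 1 or single_or_dual >= 2


def sparing_available(channels):
    """channels: one dimm_ranks list per memory channel (hypothetical layout)."""
    return all(channel_supports_sparing(c) for c in channels)


if __name__ == "__main__":
    print(sparing_available([[1, 2], [4]]))   # both channels qualify
    print(sparing_available([[1, 2], [2]]))   # second channel has one dual-rank DIMM
```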
Rank Sparing Mode is enabled/disabled in the Memory RAS and Performance Configuration screen in the
<F2> BIOS Setup Utility.
When Sparing Mode is operational, for each channel, the largest size memory rank is reserved as a “spare”
and is not used during normal operations. The impact on Effective Memory Size is to subtract the sum of the
reserved ranks from the total amount of installed memory.
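
As a worked example of the Effective Memory Size calculation, the sketch below subtracts the largest rank on each channel from the installed total. The function name and the per-channel list of rank sizes are illustrative assumptions, not part of the BIOS interface.

```python
def effective_memory_with_sparing(channels):
    """channels: for each memory channel, a list of rank sizes in GB."""
    total = sum(sum(ranks) for ranks in channels)
    # One spare per populated channel: the largest rank on that channel.
    reserved = sum(max(ranks) for ranks in channels if ranks)
    return total - reserved


if __name__ == "__main__":
    # Two channels, each with two dual-rank 8 GB DIMMs (4 GB per rank).
    channels = [[4, 4, 4, 4], [4, 4, 4, 4]]
    print(effective_memory_with_sparing(channels), "GB usable of 32 GB installed")
```

In this example one 4 GB rank is reserved on each of the two channels, so 8 GB of the installed 32 GB is held back as spares.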
Hardware registers count the number of Correctable ECC Errors for each rank of memory on each channel
during operations and compare the count against a Correctable Error Threshold. When the correctable error
count for a given rank hits the threshold value, that rank is deemed to be “failing”, and it triggers a Sparing Fail
Over (SFO) event for the channel in which that rank resides. The data in the failing rank is copied to the Spare
Rank for that channel, and the Spare Rank replaces the failing rank in the IMC’s address translation registers.
An SFO Event is logged to the BMC SEL. The failing rank is then disabled, and any further Correctable Errors
on that now non-redundant channel will be disregarded.
The correctable error that triggered the SFO may be logged to the BMC SEL, if it was the first one to occur in
the system. That first correctable error event will be the only one logged for the system. However, since each
channel is a Sparing Domain, the correctable error counting continues for other channels which are still in a
redundant state. There can be as many SFO Events as there are memory channels with DIMMs installed.
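
The counting and fail-over behavior described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions: the class names, the threshold value, the event labels, and the order in which the two entries are logged are all hypothetical, and the list `sel` only stands in for the BMC SEL. The real mechanism is implemented in hardware registers and firmware.

```python
class SparingDomain:
    """One memory channel acting as a Sparing Domain (simplified model)."""

    def __init__(self, name, rank_count, threshold):
        self.name = name
        self.threshold = threshold          # hypothetical Correctable Error Threshold
        self.counts = [0] * rank_count      # correctable error count per active rank
        self.redundant = True               # True while the spare rank is unused
        self.disabled_rank = None


class System:
    def __init__(self, domains):
        self.domains = {d.name: d for d in domains}
        self.sel = []                       # stand-in for the BMC SEL
        self.ce_event_logged = False        # only the first correctable error event is logged

    def correctable_error(self, channel, rank):
        d = self.domains[channel]
        if not d.redundant:
            return                          # disregarded on a non-redundant channel
        d.counts[rank] += 1
        if d.counts[rank] >= d.threshold:
            if not self.ce_event_logged:
                self.sel.append(("CE_THRESHOLD", channel, rank))
                self.ce_event_logged = True
            # Sparing Fail Over: the spare rank takes over and the failing rank is disabled.
            self.sel.append(("SFO", channel, rank))
            d.disabled_rank = rank
            d.redundant = False


if __name__ == "__main__":
    system = System([SparingDomain("A", 4, 3), SparingDomain("B", 4, 3)])
    for _ in range(5):
        system.correctable_error("A", 0)    # fail-over on the third error, rest ignored
    for _ in range(3):
        system.correctable_error("B", 1)    # channel B is still counting independently
    print(system.sel)
```

Because each channel is its own Sparing Domain, the model allows at most one SFO entry per populated channel, matching the limit stated above.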
3.2.2.5.6 Mirrored Channel Mode
Channel Mirroring Mode gives the best memory RAS capability by maintaining two copies of the data in main
memory. If there is an Uncorrectable ECC Error, the channel with the error is disabled and the system
continues with the “good” channel, but in a non-redundant configuration.
For Mirroring Mode to be available as a RAS option, the DIMM population must be identical between
each pair of memory channels that participate. Not all channel pairs need to have memory installed, but for
each pair, the configuration must match. If the configuration is not matched up properly, the memory operating
mode falls back to Independent Channel Mode.
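
The population check and fallback can be sketched as below. This is an illustration only: which channels form a pair is not defined in this section, so pairs are passed in by the caller, and the `(size_gb, ranks)` DIMM descriptor, the function name, and the mode strings are hypothetical. The sketch also assumes that a system with no populated pairs cannot mirror.

```python
def memory_mode(channel_pairs, mirroring_requested=True):
    """channel_pairs: list of (chan_a, chan_b); each channel is a list of
    (size_gb, ranks) tuples, one per populated slot, in slot order."""
    if not mirroring_requested:
        return "independent"
    # Pairs with no memory installed on either channel do not participate.
    populated = [(a, b) for a, b in channel_pairs if a or b]
    if populated and all(a == b for a, b in populated):
        return "mirrored"
    # Any mismatched pair forces the fallback to Independent Channel Mode.
    return "independent"


if __name__ == "__main__":
    matched = [([(8, 2)], [(8, 2)]), ([], [])]
    mismatched = [([(8, 2)], [(4, 1)])]
    print(memory_mode(matched))
    print(memory_mode(mismatched))
```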
Mirroring Mode is enabled/disabled in the Memory RAS and Performance Configuration screen in the <F2>
BIOS Setup Utility.
When Mirroring Mode is operational, each channel in a pair is “mirrored” by the other channel. The impact on
Effective Memory Size is to reduce by half the total amount of installed memory available for use.
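
As a quick arithmetic illustration, assuming all installed memory sits in mirrored channel pairs (the function name is hypothetical):

```python
def effective_memory_mirrored(channel_sizes_gb):
    """channel_sizes_gb: installed capacity in GB of each mirrored channel."""
    installed = sum(channel_sizes_gb)
    # Each channel in a pair holds a copy of the other, so half is usable.
    return installed / 2


if __name__ == "__main__":
    # Four channels with 16 GB each: 64 GB installed.
    print(effective_memory_mirrored([16, 16, 16, 16]), "GB usable")
```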
When Mirroring Mode is operational, the system treats Correctable Errors the same way as it would in
Independent Channel Mode. There is a correctable error threshold. Correctable error counts accumulate by
rank, and the first event is logged.
What Mirroring primarily protects against is the possibility of an Uncorrectable ECC Error occurring with critical
data “in process”. Without Mirroring, the system would be expected to “Blue Screen” and halt, possibly with