
Intel Xeon Processor C5500/C3500 Series
Datasheet, Volume 1 (February 2010)
Order Number: 323103-001

Interfaces

MC_SMI_SPARE_CNTRL to which the counters are compared. If any counter exceeds the threshold, the enabled interrupt will be generated, and status bits are set to indicate which counter met the threshold.
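The threshold comparison described above can be sketched as follows. This is a minimal illustration rather than the controller's actual logic: the function names, the list of counter values, and the use of a greater-or-equal comparison are assumptions (the text says both "exceeds" and "met" for the threshold condition).

```python
def overflow_status(counters, threshold):
    """Return a bitmask with bit i set when counters[i] has reached the threshold.

    Hypothetical helper: the real status bits are maintained by hardware.
    The >= comparison is an assumption, not taken from the datasheet.
    """
    status = 0
    for i, count in enumerate(counters):
        if count >= threshold:
            status |= 1 << i
    return status


def interrupt_pending(status, interrupt_enabled):
    """An interrupt is raised only if it is enabled and some counter overflowed."""
    return bool(interrupt_enabled) and status != 0


# Example: counters 1 and 3 have reached a threshold of 5.
status = overflow_status([0, 5, 2, 9], threshold=5)
print(bin(status), interrupt_pending(status, interrupt_enabled=True))
```

Software that later reads the status bits can map each set bit back to the counter that reached the threshold.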

2.1.7.3 Identifying the Cause of An Interrupt

Table 15 defines how to determine what caused the interrupt.

2.1.8 Single Device Data Correction (SDDC) Support

The Integrated Memory Controller employs a Single Device Data Correction (SDDC) algorithm that will recover from a x4/x8 component failure. In addition, the Integrated Memory Controller supports demand and patrol scrubbing.

A scrub corrects a correctable error in memory. A four-byte ECC is attached to each 32-byte "payload". An error is detected when the ECC calculated from the payload does not match the ECC read from memory. The error is corrected by modifying the ECC, the payload, or both, and then writing both the ECC and the payload back to memory. Only one demand or patrol scrub can be in progress at a time.
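The detect, correct, and write-back sequence above can be illustrated with a toy code. The sketch below uses a Hamming(7,4) single-bit-correcting code purely as a stand-in: it is not the SDDC code used by the Integrated Memory Controller, which protects 32-byte payloads with four bytes of ECC and tolerates a whole-device failure. All function names and the list used as "memory" are assumptions for illustration.

```python
def syndrome(codeword):
    """XOR of the 1-based positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if (codeword >> (pos - 1)) & 1:
            s ^= pos
    return s


def encode(nibble):
    """Toy Hamming(7,4): data bits at positions 3, 5, 6, 7; check bits at 1, 2, 4."""
    codeword = 0
    for i, pos in enumerate((3, 5, 6, 7)):
        if (nibble >> i) & 1:
            codeword |= 1 << (pos - 1)
    s = syndrome(codeword)
    for p in (1, 2, 4):
        if s & p:  # set the check bit that cancels this syndrome bit
            codeword |= 1 << (p - 1)
    return codeword


def scrub(memory, address):
    """Read a word, and if the check fails, correct it and write it back.

    Returns True when a correction was written. The flipped bit may be a
    check bit or a data bit, mirroring "the ECC, the payload, or both".
    Only single-bit errors are handled by this toy code.
    """
    word = memory[address]
    s = syndrome(word)
    if s == 0:
        return False
    memory[address] = word ^ (1 << (s - 1))
    return True


memory = [encode(0b1011)]
memory[0] ^= 1 << 4          # inject a single-bit error at position 5
print(scrub(memory, 0), memory[0] == encode(0b1011))
```

A second scrub of the same address finds nothing to correct, which is the state a patrol scrub aims to maintain.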

2.1.9 Patrol Scrub

Patrol scrubs are intended to ensure that data with a correctable error does not remain in DRAM long enough to stand a significant chance of being further corrupted into an uncorrectable error by particle-induced errors. The Integrated Memory Controller will issue Patrol Scrubs at a rate sufficient to write every line once a day. For a maximum capacity of 64 GB, this would be one scrub every 82 µs. The Sparing/Scrub (SS) engine sends scrubs to one channel at a time. The Patrol Scrub rate is configurable. The scrub engine will scrub all active channels, including the spare channel. The spare channel will be scrubbed, and errors will be signaled and logged if errors are enabled.
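The scrub interval needed to cover all of memory once a day follows from simple arithmetic. The sketch below assumes one scrub per 64-byte line and binary gigabytes; neither is stated in this section, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def scrub_interval_seconds(capacity_bytes, line_bytes=64,
                           period_seconds=SECONDS_PER_DAY):
    """Time between scrubs so that every line is visited once per period.

    line_bytes=64 is an assumption about the scrub granularity.
    """
    lines = capacity_bytes // line_bytes
    return period_seconds / lines


interval = scrub_interval_seconds(64 * 2**30)
print(f"{interval * 1e6:.1f} microseconds per scrub")
```

Under these assumptions, 64 GB works out to roughly 80 microseconds between scrubs. Halving the populated capacity doubles the interval, so a configurable rate can be matched to the installed memory.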

Table 15. Causes of SMI or NMI

Condition: MC_SMI_SPARE_DIMM_ERROR_STATUS.DIMM_ERROR_OVERFLOW_STATUS != 0
Cause: This register has one bit for each DIMM error counter that meets the threshold. This can happen at the same time as any of the other SMI events (sparing complete, redundancy lost in Mirror Mode). It is recommended that software address one cause at a time, so that the other cause remains when the second event is taken.
Recommended platform software response: Examine the associated MC_COR_ECC_CNT_X register. Determine the time since the counter was last cleared. If a spare channel exists, and the threshold has been exceeded faster than would be expected given the background rate of correctable errors, sparing should be initiated. The counter should be cleared to reset the overflow bit.

Condition: MC_RAS_STATUS.REDUNDANCY_LOSS = 1
Cause: One channel of a mirrored pair had an uncorrectable error and redundancy has been lost.
Recommended platform software response: Raise an indication that a reboot should be scheduled, and possibly replace the failed DIMM specified in the MC_SMI_SPARE_DIMM_ERROR_STATUS register.

Condition: MC_SSRSTATUS.CMPLT = 1
Cause: A sparing copy operation set up by software has completed.
Recommended platform software response: Advance to the next step in the sparing flow.
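The conditions in Table 15 can be sketched as a small decision routine for platform software. This is an illustration under stated assumptions, not Intel's reference flow: the RasSnapshot class, the function names, the cause strings, and the rate comparison used for the sparing decision are all hypothetical, and real software would read the fields from the memory controller registers rather than from a Python object.

```python
from dataclasses import dataclass


@dataclass
class RasSnapshot:
    """Field values already read from the registers named in Table 15."""
    dimm_error_overflow_status: int  # MC_SMI_SPARE_DIMM_ERROR_STATUS.DIMM_ERROR_OVERFLOW_STATUS
    redundancy_loss: int             # MC_RAS_STATUS.REDUNDANCY_LOSS
    sparing_complete: int            # MC_SSRSTATUS.CMPLT


def identify_causes(snap):
    """Return every Table 15 condition that currently holds, in table order."""
    causes = []
    if snap.dimm_error_overflow_status != 0:
        causes.append("dimm_error_overflow")
    if snap.redundancy_loss == 1:
        causes.append("redundancy_loss")
    if snap.sparing_complete == 1:
        causes.append("sparing_complete")
    return causes


def overflowed_counters(status):
    """Indices of the DIMM error counters whose overflow bit is set."""
    return [bit for bit in range(status.bit_length()) if (status >> bit) & 1]


def should_start_sparing(count, seconds_since_clear,
                         expected_errors_per_second, spare_available):
    """Sparing is suggested when errors arrive faster than the background rate.

    The simple rate comparison is an assumption; the table only says
    "faster than would be expected".
    """
    if not spare_available or seconds_since_clear <= 0:
        return False
    return count / seconds_since_clear > expected_errors_per_second


snap = RasSnapshot(dimm_error_overflow_status=0b0100,
                   redundancy_loss=0, sparing_complete=1)
print(identify_causes(snap), overflowed_counters(snap.dimm_error_overflow_status))
```

Because several causes can be pending at once, a handler built on this sketch could service only the first returned cause and leave the rest for the next interrupt, in line with the recommendation in the table.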