Specification Update

Intel® Xeon® Processor 7000 Series Specification Update, March 2010
If an I/O instruction (IN, INS, REP INS, OUT, OUTS, or REP OUTS) is being
executed, and if the data for this instruction becomes corrupted, the processor will
signal a Machine Check Exception (MCE). If the instruction is directed at a device
that is powered down, the processor may also receive an assertion of SMI#. Since
MCEs have higher priority, the processor will call the MCE handler, and the SMI#
assertion will remain pending. However, while attempting to execute the first
instruction of the MCE handler, the SMI# will be recognized and the processor will
attempt to execute the SMM handler. If the SMM handler is successfully completed,
it will attempt to restart the I/O instruction, but will not have the correct machine
state due to the call to the MCE handler. This can lead to failure of the restart and
shutdown of the processor.
If PWRGOOD is de-asserted during a RESET# assertion causing internal glitches,
the MCA registers may latch invalid information.
If RESET# is asserted, then de-asserted, and reasserted, before the processor has
cleared the MCA registers, then the information in the MCA registers may not be
reliable, regardless of the state or state transitions of PWRGOOD.
If MCERR# is asserted by one processor and observed by another processor, the
observing processor does not log the assertion of MCERR#. The Machine Check
Exception (MCE) handler called upon assertion of MCERR# will not have any way to
determine the cause of the MCE.
The Overflow Error bit (bit 62) in the IA32_MC0_STATUS register indicates, when
set, that a machine check error occurred while the results of a previous error were
still in the error reporting bank (i.e., the Valid bit was set when the new error
occurred). If an uncorrectable error is logged in the error-reporting bank and
another error occurs, the overflow bit will not be set.
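
The consequence for software that reads the bank can be sketched as follows. This is a minimal illustration, not a handler implementation: the bit positions follow the architectural IA32_MCi_STATUS layout (Valid = bit 63, Overflow = bit 62, Uncorrected = bit 61), while the helper name `may_have_lost_errors` and its conservative policy are assumptions made for this example.

```python
# Bit positions follow the architectural IA32_MCi_STATUS layout.
MCI_STATUS_VAL = 1 << 63   # Valid: the bank holds a logged error
MCI_STATUS_OVER = 1 << 62  # Overflow: an error arrived while Valid was set
MCI_STATUS_UC = 1 << 61    # Uncorrected error


def may_have_lost_errors(status: int) -> bool:
    """Hypothetical helper: conservatively decide whether errors may be unlogged."""
    if not status & MCI_STATUS_VAL:
        # Nothing is logged, so there is no earlier entry to overflow.
        return False
    if status & MCI_STATUS_OVER:
        return True
    # Per the erratum, Overflow is not set when another error follows a
    # logged uncorrectable error, so a clear Overflow bit proves nothing here.
    return bool(status & MCI_STATUS_UC)
```

With this policy, a valid uncorrectable entry is always treated as possibly incomplete, regardless of the Overflow bit.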
The MCA Error Code field of the IA32_MC0_STATUS register gets written by a
different mechanism than the rest of the register. For uncorrectable errors, the
other fields in the IA32_MC0_STATUS register are only updated by the first error.
Any further errors that are detected will update the MCA Error Code field without
updating the rest of the register, thereby leaving the IA32_MC0_STATUS register
with stale information.
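
One way to picture the resulting mismatch is to compare two successive reads of the register. The sketch below assumes only that the MCA Error Code field occupies bits 15:0 of IA32_MCi_STATUS (the architectural layout); the function name `code_only_update` and the idea of diffing two snapshots are illustrative assumptions, not a documented workaround.

```python
MCA_ERROR_CODE_MASK = 0xFFFF  # MCA Error Code field, bits 15:0


def code_only_update(prev: int, curr: int) -> bool:
    """Hypothetical check: did only the MCA Error Code field change?

    A True result matches the pattern described above, where a later error
    rewrites the error code while the other fields still describe the
    first error.
    """
    changed = prev ^ curr
    return changed != 0 and (changed & ~MCA_ERROR_CODE_MASK) == 0
```

For example, two snapshots that differ only in their low 16 bits return True, while identical snapshots or snapshots that also differ in a flag bit return False.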
When a speculative load operation hits the L2 cache and receives a correctable
error, the IA32_MC1_STATUS register may be updated with incorrect information.
The IA32_MC1_STATUS register should not be updated for speculative loads.
The processor should only log the address for L1 parity errors in the
IA32_MC1_STATUS register if a valid address is available. If a valid address is not
available, the Address Valid bit in the IA32_MC1_STATUS register should not be set.
In instances where an L1 parity error occurs and the address is not available
because the linear-to-physical address translation is not complete or an internal
resource conflict has occurred, the Address Valid bit is incorrectly set.
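
A cautious consumer of the logged address might therefore ignore it for L1 parity errors even when the Address Valid bit is set. The following is a minimal sketch under stated assumptions: the Address Valid bit is bit 58 of IA32_MCi_STATUS (architectural layout), and the caller already knows, by some means not shown here, whether the entry describes an L1 parity error. The function name and the `is_l1_parity_error` flag are hypothetical.

```python
from typing import Optional

MCI_STATUS_ADDRV = 1 << 58  # Address Valid bit of IA32_MCi_STATUS


def trusted_address(status: int, addr: int,
                    is_l1_parity_error: bool) -> Optional[int]:
    """Hypothetical helper: return the logged address only when it can be relied on."""
    if not status & MCI_STATUS_ADDRV:
        return None  # no address was logged
    if is_l1_parity_error:
        # Per the erratum, Address Valid may be set incorrectly for L1
        # parity errors, so the address is treated as unreliable.
        return None
    return addr
```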
The processor may hang when an instruction code fetch receives a hard failure
response from the Front Side Bus. This occurs because the bus control logic does
not return data to the core, leaving the processor empty. The IA32_MC0_STATUS MSR
does indicate that a hard fail response occurred.
The processor may hang when the following events occur and the machine check
exception is enabled (CR4.MCE=1). A processor that has its STPCLK# pin asserted will
internally enter the Stop Grant State and finally issue a Stop Grant Acknowledge
special cycle to the bus. If an uncorrectable error is generated during the Stop Grant
process, it is possible for the Stop Grant special cycle to be issued to the bus before the
processor vectors to the machine check handler. Once the chipset receives its last Stop
Grant special cycle, it is allowed to ignore any bus activity from the processors. As a
result, processor accesses to the machine check handler may not be acknowledged,
resulting in a processor hang.
Implication: The processor is unable to correctly report and/or recover from certain errors