Specifications
QSSC-S4R Technical Product Specification BIOS Error Handling
Address Parity errors are errors detected in the memory addressing hardware. Since these affect the addressing of
memory contents, they can potentially lead to the same sort of failures as ECC errors. They are logged as a distinct
type of error since they affect memory addressing rather than memory contents, but otherwise they are treated exactly
the same as Uncorrectable ECC Errors. Address Parity errors are logged to the BMC SEL, with Event Data to identify
the failing address by channel and DIMM to the extent that it is possible to do so.
RAS errors reported by the SMI error handler are Loss of Redundancy errors. These occur when Mirrored Mode is
active and a memory error is detected, which causes the memory controller to take one memory image out of service,
so the system memory is no longer protected against data loss by redundant memory operation.
Since processors in the Intel® Xeon® 7500 Series include two Integrated Memory Controllers, the Socket ID of the
processor can be used as the memory controller locator information. This same Socket ID is also used for the SMBIOS
Type 16 instance. One or two processor sockets may be identified on QSSC-S4R.
For a detailed description of memory sensors, see Sections 16.2.12.2.1 and 16.2.12.2.2. Memory errors and RAS are
discussed in the context of memory initialization, since the two interact in complicated ways.
21.2.3.4 Intel® QuickPath Interconnect Sensors
Intel® QuickPath Interconnect errors detected and reported via SMI indicate that the high-speed link between
processors in the Intel® Xeon® 7500 Processor Series, or between the processors and the Intel® 7500 Chipset, is not
operating properly. The values of Sensor Specific Offsets in the following table are currently Intel Classified
Information, but the severity of the error and the processor Socket ID indicate how serious the error was and which
processor socket was responsible.
The Intel® QuickPath Interconnect errors that may be detected and logged can be categorized into three classes:
• Correctable Errors ('QPI Corr Sensor'): These are errors that are detected by the hardware, and are
correctable or may be retried by either hardware or software (BIOS) without affecting the integrity of continued
operations. Note that the BIOS maintains an internal threshold of ten for these errors; that is, a QPI Correctable
SEL entry appears only after ten QPI correctable errors have been injected.
• Non-fatal/Recoverable Errors ('QPI Nfat Sensor'): These are errors that are not correctable, but may be
recovered by a restart or reinitialization of the components involved, allowing operations to resume without losing
system integrity.
• Fatal/Non-Recoverable Errors ('QPI Fatl Sensor'): These are errors that are neither correctable nor
recoverable. They compromise system integrity and preclude continued operations.
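The threshold behavior described for correctable errors can be sketched as a simple counter. This is a minimal illustration only, not the BIOS implementation: the class name `QpiCorrectableCounter`, the per-socket bookkeeping, and the reset of the count after a log entry are all assumptions, since the text states only that the threshold is ten.

```python
from dataclasses import dataclass, field

# From the text: a QPI Correctable SEL entry appears only after ten errors.
QPI_CORR_THRESHOLD = 10


@dataclass
class QpiCorrectableCounter:
    """Hypothetical counter; the real BIOS data structure is not documented here."""

    threshold: int = QPI_CORR_THRESHOLD
    # Assumption: errors are counted separately for each processor socket.
    counts: dict = field(default_factory=dict)

    def record(self, socket_id: int) -> bool:
        """Count one correctable error; return True when a SEL entry is due."""
        count = self.counts.get(socket_id, 0) + 1
        if count >= self.threshold:
            # Assumption: the count restarts once the event has been logged.
            self.counts[socket_id] = 0
            return True
        self.counts[socket_id] = count
        return False


if __name__ == "__main__":
    counter = QpiCorrectableCounter()
    for n in range(1, 11):
        if counter.record(socket_id=0):
            print(f"error {n}: log 'QPI Corr Sensor' event to the SEL")
```

Under these assumptions, the first nine correctable errors on a socket are absorbed silently and only the tenth produces a SEL entry.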
Table 162. Intel® QuickPath Interconnect Errors
Sensor Name | Sensor Number | Sensor Type | E/R Type | Sensor-specific Offset | ED1 | ED2 | ED3
--- | --- | --- | --- | --- | --- | --- | ---
Intel® QuickPath Interconnect Correctable Errors ('QPI Corr Sensor') | 06h | Critical Interrupt 13h | Sensor-specific Discrete 72h | Intel-reserved | 7:6=10b; 5:4=00b; 3:0=Offset value (Reserved) | Socket | Reserved
Intel® QuickPath Interconnect Non-fatal or Recoverable Errors ('QPI Nfat Sensor') | 07h | Critical Interrupt 13h | Sensor-specific Discrete 73h | Intel-reserved | 7:6=10b; 5:4=00b; 3:0=Offset value (Reserved) | Socket | Reserved
Intel® QuickPath Interconnect Fatal or Non-Recoverable Errors ('QPI Fatl Sensor') | 17h | Critical Interrupt 13h | Sensor-specific Discrete 74h | Intel-reserved | 7:6=10b; 5:4=00b; 3:0=Offset value (Reserved) | Socket | Reserved
Intel® QuickPath Interconnect Fatal or Non-Recoverable Errors ('QPI Fatl Sensor') (Note that this Sensor is just a logical extension of Sensor | 18h | Critical Interrupt 13h | Sensor-specific Discrete 74h | Intel-reserved | 7:6=10b; 5:4=00b; 3:0=Offset value (Reserved) | Socket | Reserved
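
As an illustration of how the fields in Table 162 fit together, the following sketch decodes the event data bytes of a logged QPI event. It is hedged throughout: the function `decode_qpi_event` and the `QpiEvent` type are hypothetical names, and treating the whole ED2 byte as the Socket ID is an assumption, since the table states only that ED2 carries the socket. The offset values themselves remain Intel-reserved, so the sketch returns the raw offset without interpreting it.

```python
from typing import NamedTuple

SENSOR_TYPE_CRITICAL_INTERRUPT = 0x13  # Sensor Type column of Table 162

# Sensor number -> (sensor name, event/reading type), as listed in Table 162.
QPI_SENSORS = {
    0x06: ("QPI Corr Sensor", 0x72),
    0x07: ("QPI Nfat Sensor", 0x73),
    0x17: ("QPI Fatl Sensor", 0x74),
    0x18: ("QPI Fatl Sensor", 0x74),
}


class QpiEvent(NamedTuple):
    sensor_number: int
    sensor_name: str
    event_reading_type: int
    offset: int  # ED1 bits 3:0; meaning is Intel-reserved
    socket: int  # ED2; assumption: the whole byte is the Socket ID


def decode_qpi_event(sensor_number: int, ed1: int, ed2: int) -> QpiEvent:
    """Decode ED1/ED2 for one of the QPI sensors in Table 162 (ED3 is reserved)."""
    if sensor_number not in QPI_SENSORS:
        raise ValueError(f"sensor {sensor_number:#04x} is not listed in Table 162")
    name, er_type = QPI_SENSORS[sensor_number]
    # Table 162 fixes ED1 bits 7:6 to 10b and bits 5:4 to 00b.
    if ((ed1 >> 6) & 0b11) != 0b10:
        raise ValueError("ED1 bits 7:6 must be 10b")
    if ((ed1 >> 4) & 0b11) != 0b00:
        raise ValueError("ED1 bits 5:4 must be 00b")
    return QpiEvent(sensor_number, name, er_type, ed1 & 0x0F, ed2)


if __name__ == "__main__":
    print(decode_qpi_event(0x07, 0x85, 0x02))
```

Because the two Fatal sensor numbers (17h and 18h) share one name and one event/reading type, the sketch keeps the sensor number in the result so that the two can still be told apart.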