Datasheet

Reliability, Availability, Serviceability (RAS)
Intel® Xeon® Processor C5500/C3500 Series
Datasheet, Volume 1 February 2010
388 Order Number: 323103-001
11.3.3.3 First and Next Error Log Registers
This section describes local error logging for Intel® QuickPath Interconnect and IIO core errors, and it
describes global error logging. The log registers are named *FERR and *NERR in the IIO Register
Specification. PCIe specifies its own error logging mechanism, which is not described here; see the
PCIe specification for details.
For error logging, the IIO categorizes detected errors as Fatal or Non-Fatal based on the error
severity: Fatal for severity 2, Non-Fatal for severity 0 and 1. Each category has two sets of error
log registers: FERR (first error register) and NERR (next error register). The FERR register stores the
information associated with the first detected error, and NERR stores the information associated with
subsequent errors.
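The severity-to-category mapping above can be sketched in a few lines of Python. This is an illustrative model only; the names `Category` and `categorize` are hypothetical and do not come from the IIO Register Specification.

```python
from enum import Enum


class Category(Enum):
    """Logging categories described in the text (names are hypothetical)."""
    NON_FATAL = "non-fatal"
    FATAL = "fatal"


def categorize(severity: int) -> Category:
    """Map an error severity (0, 1, or 2) to its logging category."""
    if severity == 2:
        return Category.FATAL
    if severity in (0, 1):
        return Category.NON_FATAL
    # The text defines only severities 0, 1, and 2.
    raise ValueError(f"unknown severity: {severity}")
```

Each category would then select its own FERR/NERR register set.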
Both FERR and NERR log the error status in the same format: a bit vector with one bit assigned to
each error that the IIO can detect. The first error event is indicated by setting the corresponding bit
in the FERR status register; each subsequent error is indicated by setting the corresponding bit in the
NERR register. In addition, the local FERR registers log the ECC syndrome, address, and header of the
erroneous cycle. FERR indicates only one error, while NERR can indicate multiple errors. Both the
first error and next errors trigger system events.
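As a rough illustration of the bit-vector format, the Python sketch below lists the bit positions set in a status value and checks the property that FERR indicates at most one error. The helper names are hypothetical and the bit positions used with them are arbitrary; the actual bit assignments are defined in the IIO Register Specification.

```python
from typing import List


def set_bits(status: int) -> List[int]:
    """Return the positions of all bits set in a status bit vector."""
    return [pos for pos in range(status.bit_length()) if (status >> pos) & 1]


def is_valid_ferr(status: int) -> bool:
    """FERR indicates only one error, so at most one bit may be set."""
    return status & (status - 1) == 0
```

A NERR value may decode to several positions, whereas a FERR value should decode to zero or one.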
Once the first error and the next error have been indicated and logged, the log registers for that error
remain valid until either 1) the first error bit is cleared in the associated error status register, or 2) a
powergood reset occurs. Software clears an error bit by writing 1 to the corresponding bit position in
the error status register.
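The write-1-to-clear behavior can be modeled as follows. This is a minimal sketch, not a description of the hardware implementation: the function name and the 32-bit register width are assumptions made for illustration.

```python
REGISTER_MASK = 0xFFFFFFFF  # assumption: a 32-bit error status register


def write_one_to_clear(status: int, written: int) -> int:
    """Return the status value after software writes `written` to the register.

    Every bit written as 1 is cleared; bits written as 0 keep their value.
    """
    return status & ~written & REGISTER_MASK
```

Writing 0 to a bit position leaves it unchanged, so software can clear a single error without disturbing other pending status bits.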
The hardware rules for updating the FERR and NERR registers and error logs are as follows:
1. The first error event is indicated by setting the corresponding bit in the FERR status register. A
subsequent error is indicated by setting the corresponding bit in the NERR status register.
2. If the same error occurs before the FERR status register bit is cleared, it is not logged in the NERR
status register.
3. If multiple error events that share the same error log registers occur simultaneously, the error
with the highest severity has priority over the others for FERR logging. The other errors are
indicated in the NERR register.
4. A fatal error has the highest priority, followed by recoverable errors, and then correctable errors.
5. Updates to the error status and error log registers appear atomic to the software.
6. Once the first error information is logged in the FERR log registers, further logging to the FERR
log registers is disabled until the corresponding FERR error status is cleared by software.
7. Error control registers are cleared by reset. The error status and log registers are cleared only by
the power-on reset. The contents of the error log registers are preserved across a reset as long as
PWRGOOD remains asserted.
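The update rules above can be modeled in software to make the ordering explicit. The Python sketch below covers rules 1 to 4 and 6 for a single FERR/NERR set; it does not model atomicity (rule 5) or reset behavior (rule 7). All names are hypothetical, and treating a larger severity number as more severe is an assumption based on the statement that severity 2 is fatal.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple

# (bit position, severity, log info); all values are illustrative.
Event = Tuple[int, int, Any]


@dataclass
class ErrorLogSet:
    """Hypothetical model of one FERR/NERR set sharing one group of log registers."""
    ferr: int = 0                   # first-error status, at most one bit set
    nerr: int = 0                   # next-error status, any number of bits set
    ferr_log: Optional[Any] = None  # stands in for syndrome, address, header

    def report(self, events: List[Event]) -> None:
        """Record events that were detected simultaneously."""
        # Rules 3 and 4: the most severe event is considered first.
        for bit, _severity, info in sorted(events, key=lambda e: e[1], reverse=True):
            mask = 1 << bit
            if self.ferr == 0:
                # Rule 1: the first error sets FERR and captures the log.
                self.ferr = mask
                self.ferr_log = info
            elif mask != self.ferr:
                # Rule 1: a later, different error sets its NERR bit.
                # Rule 6: the FERR log is left untouched.
                self.nerr |= mask
            # Rule 2: a repeat of the FERR error is not logged in NERR.

    def clear_ferr(self, written: int) -> None:
        """Software writes 1 to the FERR bit, which re-arms FERR logging."""
        if self.ferr & written:
            self.ferr = 0
            self.ferr_log = None  # modeling choice: the old log is no longer valid
```

In this model, reporting a severity 0 and a severity 1 event together places the severity 1 event in FERR and the other in NERR, and a repeat of the FERR error changes nothing until software clears the FERR bit.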
11.3.3.4 Error Logging Summary
The following flow chart summarizes the error logging flow for the IIO. The left half of the flow chart
depicts the local error logging flow, and the right half depicts the global error logging flow.
Local and global error logging are similar. For simultaneous events, the IIO serializes the
events, giving priority to the more severe error.