Specifications
168 IBM Power 770 and 780 Technical Overview and Introduction
The traditional means of handling these problems is through adapter internal-error reporting
and recovery techniques, in combination with operating system device-driver management
and diagnostics. In certain cases, an error in the adapter can cause transmission of bad data
on the PCI bus itself, resulting in a hardware-detected parity error and causing a global
machine check interrupt, eventually requiring a system reboot to continue.
Adapters that are enabled for PCI enhanced error handling respond to a special data packet,
generated by the affected PCI slot hardware, by calling system firmware. The firmware examines
the affected bus, allows the device driver to reset it, and continues without a system reboot.
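The recovery sequence described above can be sketched as a small simulation. This is an
illustrative model only, not a firmware or operating system interface: the names Slot,
SlotState, firmware_recover, and example_driver_reset are hypothetical, and the ISOLATED
state is an assumption used to represent a slot that has reported an error.

```python
from enum import Enum, auto


class SlotState(Enum):
    NORMAL = auto()
    ISOLATED = auto()  # assumed state: slot has reported an error


class Slot:
    """Hypothetical model of one PCI slot; not a real firmware interface."""

    def __init__(self, name):
        self.name = name
        self.state = SlotState.NORMAL
        self.log = []

    def detect_error(self):
        # The slot hardware flags the error instead of passing bad data on.
        self.state = SlotState.ISOLATED
        self.log.append("error detected")

    def read(self):
        # No data is returned from a slot that is waiting for recovery.
        if self.state is SlotState.ISOLATED:
            raise IOError(f"{self.name}: slot awaiting recovery")
        return "data"


def firmware_recover(slot, driver_reset):
    """Examine the bus, let the driver reset the adapter, then resume."""
    if slot.state is not SlotState.ISOLATED:
        return False
    slot.log.append("firmware examined bus")
    driver_reset(slot)  # driver-supplied reset callback
    slot.state = SlotState.NORMAL
    slot.log.append("resumed without reboot")
    return True


def example_driver_reset(slot):
    # Stand-in for a device driver's reset routine.
    slot.log.append("driver reset adapter")
```

In this sketch, the only path back to normal operation runs through firmware_recover, which
mirrors the order given in the text: examine the bus, reset through the device driver, continue.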
For Linux, enhanced error handling (EEH) support extends to the majority of frequently used
devices, although various third-party PCI devices might not provide native EEH support.
To detect and correct PCIe bus errors, POWER7 processor-based systems use CRC
detection and instruction retry correction. For PCI-X, they use ECC.
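As a minimal sketch of the detect-then-retry idea, the following example appends a CRC to
each frame and resends the frame when the check fails. It illustrates the general technique
only. The CRC-32 from the Python zlib module is a stand-in for whatever code the hardware
uses, and send_with_crc, check_crc, transfer_with_retry, and flaky_link are hypothetical names.

```python
import zlib


def send_with_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 so the receiver can detect corruption."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_crc(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the trailer."""
    payload, received = frame[:-4], int.from_bytes(frame[-4:], "big")
    return zlib.crc32(payload) == received


def transfer_with_retry(payload, link, max_retries=3):
    """Resend the frame until the CRC verifies or the attempts run out."""
    for attempt in range(1, max_retries + 1):
        frame = link(send_with_crc(payload))
        if check_crc(frame):
            return frame[:-4], attempt
    raise IOError("uncorrectable link error")


def flaky_link(failures):
    """Hypothetical link that flips one bit in the first `failures` frames."""
    remaining = [failures]

    def link(frame):
        if remaining[0] > 0:
            remaining[0] -= 1
            return bytes([frame[0] ^ 0x01]) + frame[1:]
        return frame

    return link
```

For example, transfer_with_retry(b"hello", flaky_link(2)) delivers the payload on the third
attempt, while a link that corrupts every attempt ends in an error that the caller must handle.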
Figure 4-7 shows the location and mechanisms used throughout the I/O subsystem for
PCI-enhanced error handling.
Figure 4-7 PCI-enhanced error handling
4.2.8 POWER7 I/O chip freeze behavior
The POWER7 I/O chip implements a “freeze behavior” for uncorrectable errors borne on the
GX+ bus and for internal POWER7 I/O chip errors detected by the POWER7 I/O chip. With
this freeze behavior, the chip refuses I/O requests to the attached I/O, but does not
checkstop the system. This allows systems with redundant I/O to continue operating without an
outage, instead of incurring the system checkstops seen with earlier chips, such as the
POWER5 I/O chip used on POWER6 processor-based systems.
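The effect of the freeze behavior on a system with redundant I/O can be sketched as follows.
This is a simplified model under stated assumptions: IOChip and redundant_request are
hypothetical names, and the failover policy (try each path in order) is chosen for
illustration rather than taken from the hardware design.

```python
class IOChip:
    """Hypothetical model of an I/O chip that freezes on uncorrectable errors."""

    def __init__(self, name):
        self.name = name
        self.frozen = False

    def report_uncorrectable_error(self):
        # Freeze: stop serving attached I/O, but leave the system running.
        self.frozen = True

    def request(self, op):
        if self.frozen:
            raise IOError(f"{self.name} is frozen")
        return f"{op} via {self.name}"


def redundant_request(chips, op):
    """Try each I/O path in order; fail only if every path is frozen."""
    for chip in chips:
        try:
            return chip.request(op)
        except IOError:
            continue
    raise IOError("no I/O path available")
```

In this model, a frozen chip affects only the requests routed through it. The request fails
outright only when no unfrozen path remains.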
[Figure 4-7 diagram labels: PCIe Adapter; PCI-X Adapter; Parity error; I/O drawer concurrent add; CRC with retry or ECC; PCI Bridge Enhanced Error Handling; PCI-X to PCI-X; POWER7; 12X Channel Hub; PCI-X Bridge; 12X Channel – PCIe Bridge; GX+ / GX++ bus adapter; 12x channel failover support; PCI Bus Enhanced Error Handling]