System information

Error Log Analysis
5.2 Machine Checks/Interrupts
The exceptions that result from hardware system errors are called machine
checks/interrupts. They occur when a system error is detected during the processing
of a data request. Four types of machine checks/interrupts are related to system
events:
Processor machine check (SCB 670)
System machine check (SCB 660)
Processor-detected correctable error (SCB 630)
System-detected nonfatal error (SCB 620)
NOTE: A fan failure is a fatal, noncorrectable error, but it is reported as nonfatal
so that the operating system can perform a shutdown.
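As an illustration only, the four vectors listed above can be held in a small lookup table, as an error-log analysis script might do. This is a minimal sketch: the names MACHINE_CHECK_VECTORS and describe_vector are hypothetical, and the vectors are kept as strings exactly as printed because the text does not state their radix.

```python
# SCB vectors copied from the list above. They are stored as strings
# because this section does not say which radix the numbers use.
MACHINE_CHECK_VECTORS = {
    "670": "Processor machine check",
    "660": "System machine check",
    "630": "Processor-detected correctable error",
    "620": "System-detected nonfatal error",
}


def describe_vector(scb_vector: str) -> str:
    """Return the machine check/interrupt type for an SCB vector.

    Hypothetical helper: vectors not listed in this section are
    reported as unknown rather than guessed at.
    """
    return MACHINE_CHECK_VECTORS.get(scb_vector, "unknown (not listed in this section)")
```

For example, describe_vector("630") returns the string for the processor-detected correctable error.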
During the error-handling process, errors are first handled by the appropriate
PALcode error routine and then by the associated operating system error handler.
The causes of each of the machine checks/interrupts are as follows. The system
control block (SCB) vector through which PALcode transfers control to the
operating system is shown in parentheses.
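The two-stage handling order described above (PALcode error routine first, then the operating system handler reached through the SCB vector) might be modeled as follows. This is a simplified sketch, not actual PALcode or operating system code: handle_error, os_processor_machine_check, and the trace list are hypothetical names introduced only to show the ordering.

```python
from typing import Callable, Dict, List

# Type of a hypothetical operating system error handler: it receives
# the running trace and appends its own entry.
OsHandler = Callable[[List[str]], None]


def handle_error(scb_vector: str, os_handlers: Dict[str, OsHandler]) -> List[str]:
    """Model the handling order for one machine check/interrupt."""
    trace: List[str] = []
    # Stage 1: the appropriate PALcode error routine runs first.
    trace.append(f"PALcode error routine (SCB {scb_vector})")
    # Stage 2: control is transferred through the SCB vector to the
    # associated operating system error handler.
    os_handlers[scb_vector](trace)
    return trace


def os_processor_machine_check(trace: List[str]) -> None:
    """Hypothetical operating system handler for SCB 670."""
    trace.append("operating system handler: processor machine check")


trace = handle_error("670", {"670": os_processor_machine_check})
```

After the call, trace holds the PALcode entry followed by the operating system entry, in that order.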
Processor Machine Check (SCB: 670)
Processor machine check errors are fatal system errors that result in a system crash.
The error-handling code for these errors is common across all platforms using the
Alpha 21164 microprocessor.
I-cache data or tag parity error
S-cache data parity error—I-stream
S-cache tag parity error—I-stream
S-cache data parity error—D-stream Read/Read, READ_DIRTY
S-cache tag parity error—D-stream or system commands
D-cache data parity error
D-cache tag parity error
I-stream uncorrectable ECC data parity errors (B-cache or memory)
D-stream uncorrectable ECC data parity errors (B-cache or memory)
B-cache tag parity errors—I-stream
B-cache tag parity errors—D-stream
System command/address parity error
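When reviewing error logs, it can help to group the causes listed above by where the error was detected. The sketch below is one possible grouping, not a structure defined by the manual: the source labels in the first column and the names PROCESSOR_MACHINE_CHECK_CAUSES and causes_per_source are assumptions made for illustration, and the cause strings are copied from the list with the dash separators written as commas.

```python
from collections import Counter

# (source, cause) pairs. The "source" labels are an illustrative
# grouping chosen for this sketch, not terminology from the manual.
PROCESSOR_MACHINE_CHECK_CAUSES = [
    ("I-cache", "I-cache data or tag parity error"),
    ("S-cache", "S-cache data parity error, I-stream"),
    ("S-cache", "S-cache tag parity error, I-stream"),
    ("S-cache", "S-cache data parity error, D-stream Read/Read, READ_DIRTY"),
    ("S-cache", "S-cache tag parity error, D-stream or system commands"),
    ("D-cache", "D-cache data parity error"),
    ("D-cache", "D-cache tag parity error"),
    ("B-cache or memory", "I-stream uncorrectable ECC data parity errors"),
    ("B-cache or memory", "D-stream uncorrectable ECC data parity errors"),
    ("B-cache tag", "B-cache tag parity errors, I-stream"),
    ("B-cache tag", "B-cache tag parity errors, D-stream"),
    ("System command/address", "System command/address parity error"),
]

# Count how many listed causes fall under each source label.
causes_per_source = Counter(source for source, _ in PROCESSOR_MACHINE_CHECK_CAUSES)
```

All of these causes are reported through the same vector (SCB 670), so the grouping is only a reading aid for the list, not a change in how the errors are delivered.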