System information
Error Log Analysis
5.2 Machine Checks/Interrupts
The exceptions that result from hardware system errors are called machine
checks/interrupts. They occur when a system error is detected during the processing
of a data request. Four types of machine checks/interrupts are related to system
events:
• Processor machine check (SCB 670)
• System machine check (SCB 660)
• Processor-detected correctable error (SCB 630)
• System-detected nonfatal error (SCB 620)
NOTE: A fan failure is a fatal, noncorrectable error, but is reported as nonfatal to
allow the operating system to perform shutdown.
During the error-handling process, errors are first handled by the appropriate
PALcode error routine and then by the associated operating system error handler.
The causes of each type of machine check/interrupt are as follows. The system
control block (SCB) vector through which PALcode transfers control to the
operating system is shown in parentheses.
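As an illustration only, the four SCB vectors listed above can be kept in a small lookup table, for example when annotating error log entries by hand. This is a minimal sketch, not part of PALcode or any operating system error handler. The names MachineCheck, SCB_VECTORS, and describe are hypothetical, and the sketch assumes the vector values are hexadecimal offsets. The severity field records only what this section states or what the event name implies; it is left empty where the section gives no severity.

```python
from typing import NamedTuple, Optional


class MachineCheck(NamedTuple):
    """One machine check/interrupt type (hypothetical helper type)."""
    name: str
    # True = reported as fatal, False = reported as nonfatal or correctable,
    # None = severity not stated in this section.
    reported_fatal: Optional[bool]


# SCB vectors from the list above. Assumption: the values are hexadecimal.
SCB_VECTORS = {
    0x670: MachineCheck("Processor machine check", True),
    0x660: MachineCheck("System machine check", None),
    0x630: MachineCheck("Processor-detected correctable error", False),
    # A fan failure is fatal but is reported as nonfatal so that the
    # operating system can shut down (see the NOTE above).
    0x620: MachineCheck("System-detected nonfatal error", False),
}

_SEVERITY_TEXT = {
    True: "fatal",
    False: "nonfatal",
    None: "severity not stated",
}


def describe(vector: int) -> str:
    """Return a one-line description for an SCB vector."""
    entry = SCB_VECTORS.get(vector)
    if entry is None:
        return f"SCB {vector:X}: not a machine check/interrupt vector"
    return f"SCB {vector:X}: {entry.name} ({_SEVERITY_TEXT[entry.reported_fatal]})"


if __name__ == "__main__":
    for vec in sorted(SCB_VECTORS, reverse=True):
        print(describe(vec))
```

Keeping the severity as a three-valued field avoids guessing at information the section does not give.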
Processor Machine Check (SCB: 670)
Processor machine check errors are fatal system errors that result in a system crash.
The error-handling code for these errors is common across all platforms using the
Alpha 21164 microprocessor. The following conditions cause a processor machine
check:
• I-cache data or tag parity error
• S-cache data parity error—I-stream
• S-cache tag parity error—I-stream
• S-cache data parity error—D-stream Read/Read, READ_DIRTY
• S-cache tag parity error—D-stream or system commands
• D-cache data parity error
• D-cache tag parity error
• I-stream uncorrectable ECC data parity errors (B-cache or memory)
• D-stream uncorrectable ECC data parity errors (B-cache or memory)
• B-cache tag parity errors—I-stream
• B-cache tag parity errors—D-stream
• System command/address parity error