32 SPARC Enterprise T1000 Server Administration Guide • April 2007
Error Handling Summary
Error handling during the power-on sequence falls into one of the following three
cases:
■ If no errors are detected by POST or OpenBoot Diagnostics, the system attempts
to boot if auto-boot? is true.
■ If only nonfatal errors are detected by POST or OpenBoot Diagnostics, the system
attempts to boot if auto-boot? is true and auto-boot-on-error? is true.
Nonfatal errors include the following:
  ■ Ethernet interface failure.
  ■ Serial interface failure.
  ■ PCI-Express card failure.
  ■ Memory failure. When a DIMM fails, the firmware unconfigures the entire
    logical bank associated with the failed module. The system attempts a
    degraded boot only if another, nonfailing logical bank is present.
    Certain DIMM failures might not be traceable to a single DIMM. Such
    failures are fatal and result in both logical banks being unconfigured.
Note – If POST or OpenBoot Diagnostics detects a nonfatal error associated with the
normal boot device, the OpenBoot firmware automatically unconfigures the failed
device and tries the next-in-line boot device, as specified by the boot-device
configuration variable.
■ If a fatal error is detected by POST or OpenBoot Diagnostics, the system does not
boot regardless of the settings of auto-boot? or auto-boot-on-error?. Fatal
nonrecoverable errors include the following:
  ■ Any CPU failure
  ■ Failure of all logical memory banks
  ■ Flash RAM cyclic redundancy check (CRC) failure
  ■ Critical field-replaceable unit (FRU) PROM configuration data failure
  ■ Critical system configuration SEEPROM read failure
  ■ Critical application-specific integrated circuit (ASIC) failure
For more information about troubleshooting fatal errors, refer to the service manual
for your server.
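The three cases described in this summary can be expressed as a small decision function. The following Python sketch is illustrative only and is not part of the OpenBoot firmware. The names Severity, should_attempt_boot, and memory_severity are hypothetical, and the two Boolean parameters stand in for the auto-boot? and auto-boot-on-error? configuration variables. The memory helper assumes that a failure that cannot be traced to a single DIMM is reported as all logical banks failed.

```python
from enum import Enum


class Severity(Enum):
    """Worst error class reported by POST or OpenBoot Diagnostics (hypothetical names)."""
    NONE = "none"
    NONFATAL = "nonfatal"
    FATAL = "fatal"


def should_attempt_boot(severity, auto_boot, auto_boot_on_error):
    """Return True if the system would attempt to boot.

    auto_boot and auto_boot_on_error model the auto-boot? and
    auto-boot-on-error? variables as plain Booleans (an assumption).
    """
    if severity is Severity.FATAL:
        # Fatal errors block the boot regardless of either variable.
        return False
    if severity is Severity.NONFATAL:
        # Nonfatal errors require both variables to be true.
        return auto_boot and auto_boot_on_error
    # No errors: only auto-boot? matters.
    return auto_boot


def memory_severity(total_banks, failed_banks):
    """Classify a memory failure by how many logical banks were unconfigured."""
    if failed_banks == 0:
        return Severity.NONE
    if failed_banks < total_banks:
        # At least one nonfailing bank remains, so a degraded boot is possible.
        return Severity.NONFATAL
    # All logical memory banks failed.
    return Severity.FATAL


if __name__ == "__main__":
    sev = memory_severity(total_banks=2, failed_banks=1)
    print(sev.value, should_attempt_boot(sev, True, True))
```

In this sketch, a single failed logical bank is classified as nonfatal, so the boot proceeds only when both variables are true. Setting auto_boot_on_error to False for the same failure returns False, which corresponds to the system not attempting a degraded boot.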
Reset Scenarios
Three ALOM CMT configuration variables, diag_mode, diag_level, and
diag_trigger, control whether the system runs firmware diagnostics in response
to system reset events.