HP-UX HB v13.00 Ch-08 - Crash Dumps
HP-UX Handbook – Rev 13.00 Page 6 (of 38)
Chapter 08 Crash Dumps
October 29, 2013
Getting an HPMC does not always mean that the hardware is at fault. The HPMC tombstone
needs to be analyzed to determine whether the hardware was really at fault. Software defects can
also result in HPMC crash events, but this is rare in production-quality software.
NOTE: on Itanium systems the naming is slightly different:
HPMC = MCA (Machine Check Abort)
TOC = INIT
What happens when a system crashes?
Now that you understand the different types of crash events (panic, TOC, and HPMC), let’s see
what the system does to process these events. Processing these events usually requires
interaction between the hardware and the operating system software. There are well-defined
architected interfaces between hardware and software, for example the PDC entry points (processor
firmware) on the processors and the Interruption Vector Table (IVA) in the kernel. These interfaces
allow the hardware to trigger software entry points (and vice versa) so that logging, analysis and
error recovery can be performed after a hardware fault.
Some of the information presented here may seem quite in-depth on first reading, and you may skim
through it initially. It is nevertheless important to grasp the concepts presented here, since any
investigative dump analysis work begins with the crash event. It is worthwhile understanding what the
system does in response to crash events, what crucial pieces of information are saved, and
where they are stored.
We categorize crash events into two classes: hardware crash events and software
crash events. Here is a description of what the system does to process each class.
Hardware crash events
A hardware crash event can be a High Priority Machine Check (HPMC), a Low Priority Machine
Check (LPMC) or a Transfer of Control (TOC). Machine checks are typically caused by
hardware malfunctions or certain classes of bus errors. A TOC, on the other hand, is usually initiated
by the operator in response to system software being stuck in an error state.
When a hardware crash event occurs, the processor immediately branches to the PDC entry point
PDCE_CHECK (for HPMC and LPMC faults) or PDCE_TOC (for TOC). The implementation
details of these PDC entry points are processor dependent. Fundamentally, they save the
processor’s state (general, control, space and interruption registers) into Processor Internal
Memory (PIM). The processor then vectors back into the operating system entry point,
HPMC_Vector or TOC_Vector. These entry points are defined in the IVA (Interruption Vector
Table) and in MEM_TOC in Page Zero, respectively.
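
The sequence described above can be pictured with a small model. This is only an illustrative sketch in Python, not firmware or kernel code: the Processor class, the handle_hardware_event function and the register values are hypothetical names invented for the illustration, and only the entry point names (PDCE_CHECK, PDCE_TOC, HPMC_Vector, TOC_Vector) come from the text. Because the text names no operating system vector for LPMC, the model leaves that case out rather than guessing.

```python
import copy
from dataclasses import dataclass, field
from enum import Enum


class CrashEvent(Enum):
    HPMC = "High Priority Machine Check"
    LPMC = "Low Priority Machine Check"
    TOC = "Transfer of Control"


# PDC (firmware) entry point taken for each event, as named in the text.
PDC_ENTRY = {
    CrashEvent.HPMC: "PDCE_CHECK",
    CrashEvent.LPMC: "PDCE_CHECK",
    CrashEvent.TOC: "PDCE_TOC",
}

# Operating system entry point and where it is defined. The text names
# only HPMC_Vector and TOC_Vector, so LPMC is deliberately absent here.
OS_ENTRY = {
    CrashEvent.HPMC: ("HPMC_Vector", "IVA"),
    CrashEvent.TOC: ("TOC_Vector", "MEM_TOC in Page Zero"),
}


@dataclass
class Processor:
    # Hypothetical register groups; real contents are processor dependent.
    registers: dict
    # Stand-in for Processor Internal Memory (PIM).
    pim: dict = field(default_factory=dict)


def handle_hardware_event(cpu, event):
    """Model one hardware crash event and return the steps taken, in order."""
    steps = [f"branch to {PDC_ENTRY[event]}"]

    # Firmware snapshots the processor state into PIM. A deep copy keeps
    # the snapshot unchanged even if the live registers change later.
    cpu.pim = copy.deepcopy(cpu.registers)
    steps.append("save processor state to PIM")

    if event in OS_ENTRY:
        name, location = OS_ENTRY[event]
        steps.append(f"vector to {name} (defined in {location})")
    return steps


if __name__ == "__main__":
    cpu = Processor(registers={
        "general": [0x10, 0x20],      # made-up values for illustration
        "control": [0x01],
        "space": [0x00],
        "interruption": [0xFF],
    })
    for step in handle_hardware_event(cpu, CrashEvent.TOC):
        print(step)
```

The point of the model is the ordering: the state is captured in PIM by firmware before the operating system entry point runs, which is why the saved registers still describe the moment of the fault.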
On entry into the kernel, a crash event entry is created. The operating system makes a PDC call
(PDC_PIM) to read the processor’s state information from PIM into a Restart Parameter Block
(RPB). As such, the RPB structure contains information pertinent to understanding the
crash. For example, the Program Counter (PC) in the RPB would indicate what routine was