User's guide

The XMI
2.9.3 Error Recovery
Error recovery involves one or more reattempts of the failed transaction
before reporting a hard error. A failed XMI transaction is retried under
the following circumstances:
• All transactions receiving a NO ACK confirmation for the command
  cycle are retried automatically by the hardware. The NO ACK can
  result either from a reference to a nonexistent memory location (NXM)
  or from a bus parity error. A transaction that also fails the retry is
  assumed to reference an NXM.
• Failing XMI Write transactions are retried.
• Failing XMI Read transactions to memory space are retried.
• XMI IDENT transactions receiving a response timeout may be retried.
  Because this can result in a lost interrupt vector, software is
  responsible for handling the consequences.
• Failing XMI I/O space Write Mask or Unlock Write Mask transactions
  are retried.
• Failing DWMBB I/O space Read or Interlock Read transactions
  receiving a response timeout are NOT retried, because some I/O
  devices might have read side effects.
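
The retry rules above can be summarized as a small decision function. The
following Python sketch is illustrative only: the names Txn, Failure, Retry,
and retry_policy are hypothetical and do not correspond to any XMI or DWMBB
interface. Combinations that the text does not address return None rather
than guessing.

```python
from enum import Enum, auto


class Txn(Enum):
    """Transaction types named in the retry rules (hypothetical labels)."""
    WRITE = auto()
    MEMORY_READ = auto()
    IDENT = auto()
    IO_WRITE_MASK = auto()
    IO_UNLOCK_WRITE_MASK = auto()
    IO_READ = auto()
    IO_INTERLOCK_READ = auto()


class Failure(Enum):
    """How the transaction failed (hypothetical labels)."""
    NO_ACK = auto()            # NO ACK confirmation for the command cycle
    RESPONSE_TIMEOUT = auto()
    OTHER = auto()


class Retry(Enum):
    YES = "retried"
    OPTIONAL = "may be retried"
    NO = "not retried"


def retry_policy(txn, failure):
    """Return the retry behavior the text describes, or None if unspecified."""
    # A NO ACK on the command cycle is retried by hardware for all transactions.
    if failure is Failure.NO_ACK:
        return Retry.YES
    # I/O space reads that time out are not retried: the device may have
    # read side effects.
    if txn in (Txn.IO_READ, Txn.IO_INTERLOCK_READ):
        return Retry.NO if failure is Failure.RESPONSE_TIMEOUT else None
    # An IDENT that times out may be retried; software handles a lost vector.
    if txn is Txn.IDENT:
        return Retry.OPTIONAL if failure is Failure.RESPONSE_TIMEOUT else None
    # Writes, memory-space reads, and I/O space (Unlock) Write Mask
    # transactions are retried when they fail.
    return Retry.YES


print(retry_policy(Txn.IO_READ, Failure.RESPONSE_TIMEOUT).value)
print(retry_policy(Txn.IDENT, Failure.RESPONSE_TIMEOUT).value)
```

Checking the NO ACK case first mirrors the text, which applies that rule to
all transactions before the per-type rules.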
2.9.4 Error Reporting
Normal transaction-level error reporting mechanisms include NO ACK,
Read Error Response (RER), and timeout.
The XMI bus protocol supports two mechanisms that signal error
conditions to processors if normal transaction-level error reporting cannot
be used. They are:
• Write error interrupt: This transaction is directed to one or more
  CPU nodes, resulting in each targeted CPU taking an IPL 1D (hex)
  error interrupt. The CPU then identifies the source of the write error
  interrupt.
• XMI TRIGGER: When XMI TRIGGER is asserted, all XMI CPUs take
  an IPL 1D (hex) error interrupt. This is used for diagnostic purposes.
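
The difference between the two mechanisms is which CPUs take the error
interrupt: a write error interrupt is directed, while XMI TRIGGER reaches
every CPU. The Python sketch below models only that difference. The function
name, the mechanism strings, and the node numbers are hypothetical and are
not taken from the XMI specification.

```python
ERROR_IPL = 0x1D  # IPL of the error interrupt, as stated in the text


def interrupted_cpus(cpu_nodes, mechanism, targets=()):
    """Return the sorted CPU node IDs that take the IPL 1D error interrupt."""
    if mechanism == "write_error_interrupt":
        # Directed: only CPU nodes named as targets take the interrupt.
        return sorted(set(cpu_nodes) & set(targets))
    if mechanism == "xmi_trigger":
        # Asserting XMI TRIGGER interrupts every XMI CPU.
        return sorted(cpu_nodes)
    raise ValueError(f"unknown mechanism: {mechanism!r}")


cpus = [1, 2, 3]  # hypothetical CPU node IDs
print(interrupted_cpus(cpus, "write_error_interrupt", targets=[2]))
print(interrupted_cpus(cpus, "xmi_trigger"))
```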
Examples of error conditions include:
• System integrity problems, such as bus collisions.
• The DWMBB being unable to complete an XMI-to-VAXBI windowed
  write operation. The DWMBB issues a write error IVINTR transaction
  to the nodes designated in the WE IVINTR destination register. If the
  cause of the error is nonexistent memory (NXM), such as during
  configuration, then software attempts recovery. Otherwise, software
  initiates a system software failure.
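
The software decision described for a failed windowed write can be sketched
as follows. This is a minimal Python illustration under stated assumptions:
the WriteErrorCause labels and the function name are hypothetical, and how
software actually determines the cause is not covered by the text.

```python
from enum import Enum, auto


class WriteErrorCause(Enum):
    """Hypothetical labels for the cause of a write error IVINTR."""
    NXM = auto()    # nonexistent memory, for example during configuration
    OTHER = auto()  # any other cause


def respond_to_write_error(cause):
    """Return the action software takes after a write error interrupt."""
    if cause is WriteErrorCause.NXM:
        return "attempt recovery"
    return "initiate system software failure"


print(respond_to_write_error(WriteErrorCause.NXM))
print(respond_to_write_error(WriteErrorCause.OTHER))
```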