User manual
Miscellaneous Notes
In certain cases a CPUSPINWAIT bugcheck can be triggered because the SMP timeout value is
set too high. If the processor also calibrates as fast, the computed maximum number of
spin-wait loop iterations can overflow.
The maximum number of spin-wait loop iterations is computed as
EXE$GL_TENUSEC * EXE$GL_UBDELAY * timeout in 10-usec units
The counter is a 32-bit signed integer, i.e. the product of the three factors in the formula must
not exceed 0x7FFFFFFF. If it exceeds this maximum value, OpenVMS is likely to falsely signal an
SMP timeout and raise a CPUSPINWAIT bugcheck.
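The arithmetic can be illustrated with a short Python sketch. The calibration values used for EXE$GL_TENUSEC and EXE$GL_UBDELAY below are hypothetical, chosen only to show the effect on a fast host, and the helper names are illustrative and not part of OpenVMS or VAX MP.

```python
import ctypes

INT32_MAX = 0x7FFFFFFF  # largest value a 32-bit signed counter can hold


def spinwait_iterations(tenusec, ubdelay, timeout_10usec):
    """Exact (unbounded) product from the formula above."""
    return tenusec * ubdelay * timeout_10usec


def as_int32(value):
    """The value as it appears after truncation to a 32-bit signed integer."""
    return ctypes.c_int32(value).value


# Hypothetical calibration results for a fast host (assumed, not measured).
TENUSEC = 100
UBDELAY = 10

for seconds in (30.0, 9.5):
    timeout_units = int(seconds * 100_000)  # 100,000 ten-usec units per second
    exact = spinwait_iterations(TENUSEC, UBDELAY, timeout_units)
    print(seconds, exact, as_int32(exact), exact > INT32_MAX)
```

With these assumed values, the 30-second timeout yields a product of 3,000,000,000, which exceeds 0x7FFFFFFF and wraps to a negative number when stored in a 32-bit signed counter, whereas the 9.5-second timeout stays within range.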
In fact, the reason for reducing SMP_LNGSPINWAIT from 30 seconds to 9.5 seconds is exactly
that the absurdly high value of 30 seconds causes the spin-wait cycle counter to overflow on
sufficiently fast host machines.
In general, there should be no legitimate reason for SMP timeouts to be longer than 1 second
(and that still leaves a very wide safety margin), even for “long” timeouts for low-IPL spinlocks
that may allow nested spinlock acquisitions, and much less so for high-IPL spinlocks.
The VSMP LOAD command checks the values of SMP_SPINWAIT and SMP_LNGSPINWAIT to ensure that
they do not cause overflow on the given host. If they do, VSMP displays an advisory message to
reduce their values, including advised new values, and refuses to load and enable
multiprocessing.
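A check of this kind might look roughly like the following Python sketch. It is not the VSMP implementation: the function name, the calibration inputs, and the choice of the largest non-overflowing timeout as the advised value are all assumptions made for illustration.

```python
INT32_MAX = 0x7FFFFFFF  # limit of the 32-bit signed spin-wait counter


def check_smp_timeouts(tenusec, ubdelay, timeouts):
    """Return (name, current, advised) for each timeout that would overflow.

    `timeouts` maps a parameter name to its value in 10-usec units.
    The advised value is assumed here to be the largest timeout whose
    iteration count still fits in a 32-bit signed integer.
    """
    per_unit = tenusec * ubdelay  # loop iterations per 10-usec unit
    if per_unit <= 0:
        raise ValueError("calibration values must be positive")
    largest_safe = INT32_MAX // per_unit
    advisories = []
    for name, value in timeouts.items():
        if value * per_unit > INT32_MAX:
            advisories.append((name, value, largest_safe))
    return advisories


# Hypothetical calibration values and parameter settings.
problems = check_smp_timeouts(
    tenusec=100,
    ubdelay=10,
    timeouts={"SMP_SPINWAIT": 100_000, "SMP_LNGSPINWAIT": 3_000_000},
)
for name, current, advised in problems:
    print(f"{name}={current} would overflow; reduce to at most {advised}")
```

An empty result would correspond to the case where loading can proceed. Any non-empty result corresponds to the advisory message and the refusal to load.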
It might be possible (though exceedingly unlikely) for layered privileged components, possibly
including 3rd-party products, to use private values for their SMP timeouts, such as for private
spinlocks. If these values are set too high, they can cause spin-wait counter overflow and trigger
a CPUSPINWAIT bugcheck. If this unlikely situation is ever encountered, the SMP timeouts for
such a product would need to be adjusted.
OpenVMS bugcheck BADQHDR
If you observe an OpenVMS bugcheck with code BADQHDR (“Interlocked queue header corrupted”),
this may indicate either a bug in VAX MP or an OpenVMS bug, perhaps one made
more prone to being triggered when OpenVMS is executed on a simulator. Please report the
crash as described in the entry for the CPUSPINWAIT bugcheck above.
The reasons for a particular bugcheck need to be investigated on a case-by-case basis; however,
the following remedies may be available against conditions that lead to BADQHDR.
If you are executing VAX MP in native interlock mode (@VSMP$LOAD INTERLOCK=NATIVE),
try switching either to portable mode or to portable mode with the synchronization window
additionally enabled.
BADQHDR can be generated in three principal cases:










