Fault Monitoring on Windows Integrity Servers

%windir%\system32\cpqmgmt\cqmgserv\hppfmlogs
The filename syntax in this directory allows field service personnel to easily identify the type
of error and the timestamp.
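As a minimal sketch of how a script might locate this directory, the Python helper below builds the path from the Windows directory root and lists whatever files are present. The function names and the `C:\Windows` fallback are assumptions for illustration, and the code does not try to interpret the file names, since their syntax is not given here.

```python
import ntpath
import os


def pfm_log_dir(windir=None):
    """Build the PFM log directory path from a Windows directory root."""
    # Fall back to the windir environment variable, then to a common default.
    # The default root is an assumption, not something the guide specifies.
    root = windir or os.environ.get("windir") or r"C:\Windows"
    # ntpath applies Windows path rules regardless of the host platform.
    return ntpath.join(root, "system32", "cpqmgmt", "cqmgserv", "hppfmlogs")


def list_pfm_logs(windir=None):
    """Return the log file names sorted by name, or [] if the directory is absent."""
    path = pfm_log_dir(windir)
    if not os.path.isdir(path):
        return []
    return sorted(os.listdir(path))
```

Calling `list_pfm_logs()` on a monitored server would return the raw file names; field service personnel would still read the error type and timestamp from each name by eye.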
Summary of PFM Information and Message Output
Note: PFM thresholds are the same on both operating systems supported by HP Integrity servers
(Windows Server 2003 and HP-UX).
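The count-per-period thresholds in this summary (for example, 100 internal cache errors in 24 hours) behave like sliding-window counters. The Python sketch below illustrates that idea only; it is not the PFM implementation. The class name `WindowThreshold` and its `record` method are hypothetical, timestamps are assumed to arrive in order, and treating "reaching the count" as crossing the threshold is an assumption. The external cache rule is not modeled, because it also depends on distinct addresses per processor.

```python
from collections import deque
from datetime import datetime, timedelta


class WindowThreshold:
    """Count errors in a sliding time window and report when a limit is reached."""

    def __init__(self, limit, window, event_id):
        self.limit = limit          # number of errors that triggers the event
        self.window = window        # length of the sliding window (timedelta)
        self.event_id = event_id    # event ID to report when triggered
        self._times = deque()       # timestamps of errors still inside the window

    def record(self, when):
        """Record one error; return the event ID if the limit is reached, else None."""
        self._times.append(when)
        # Drop errors that have aged out of the window.
        cutoff = when - self.window
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        if len(self._times) >= self.limit:
            return self.event_id
        return None


# Counters using limits and event IDs taken from the summary.
internal_cache = WindowThreshold(100, timedelta(hours=24), 5823)
sbe_per_dimm = WindowThreshold(20, timedelta(hours=24), 4652)
sbe_per_system = WindowThreshold(24, timedelta(hours=24), 5814)
```

A monitor would keep one per-DIMM counter for each DIMM and call `record` with the timestamp of every corrected error, logging the returned event ID when it is not `None`.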
Error type [Hardware]: Internal cache error [All systems / all processors]
Threshold: 100 errors in 24 hours
Message: Event ID 5823
    Cache errors detected on a processor
    Severity=Warning
    HelpText=Threshold parity errors have been detected in the Instruction or Data Cache Memory (I-Cache or D-Cache). The operating system has recovered from the errors, but this is an abnormally high failure rate
    Contact your HP support representative to check the processor

Error type [Hardware]: External cache error [Systems with mx2 processors]
Threshold: Any of the thresholds below in a 2-week period:
    Single processor: >= 512 errors in distinct addresses
    More than one processor: >= 2 and < 512 errors in distinct addresses
Message: Event ID 5824
    Corrected errors detected in the cache portion of the memory for a processor module
    Severity=Warning
    HelpText=Threshold corrected platform errors have been detected in the cache portion of the memory for the processor module. The operating system has recovered from the errors, but this is an abnormally high failure rate.
    Contact your HP support representative to check the processor

Error type [Hardware]: Memory SBE [zx1, sx1000 and sx2000 systems]
Threshold: Per DIMM: 20 errors in 24 hours
Message: Event ID 4652
    Predictive Failure in Memory
    Severity=Warning
    HelpText=You will receive this message if the memory system is observing a lot of corrected ECC errors from a DIMM. The specified DIMM may need to be serviced.
    Contact your HP support representative to check the affected hardware.
Threshold: Per system: 24 errors in 24 hours
Message: Event ID 5814
    Significant numbers of corrected memory errors have been detected on the memory subsystem
    Severity=Warning
    HelpText=You will receive this message if the system is observing a lot of corrected ECC errors in memory. This could be caused by problems with the system's memory or by unexpected environmental conditions inside the server.
    Contact your HP support representative to check your memory system.

Error type [Hardware]: Memory DBE
Threshold: Per DIMM: 20 errors in 24
Message: Event ID 4652