Fault Monitoring on Windows Integrity Servers

Error type [Hardware]: Montecito processors]
Message:
  Severity=Warning
  HelpText=The processor has experienced an excessive
  number of persistent correctable errors in its third level cache
  and as a result performance has degraded. Normally the
  processor can dynamically deal with this type of error with
  no impact on performance. However, if too many such
  errors occur then they begin to impact performance. The
  processor is still able to operate correctly but its performance
  is degraded so it should be replaced.
  Contact HP Service for assistance.

Error type [Hardware]: Front Side Bus error [sx1000, sx2000 and zx2 systems]
Threshold: 80 errors in 24 hours
Message:
  Event ID 7324
  Bad cell board (for cellular system) or bad processor board
  (for non-cellular system).
  Severity=Warning
  HelpText=Bad cell board (on cellular system), or bad
  processor board (on non-cellular system).
  Contact your HP support representative to check the cell
  board or the processor board on which the specified
  processor module is located.
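
A threshold such as "80 errors in 24 hours" can be read as a count over a time window. The sketch below is illustrative only: the source does not say whether the window slides or resets at fixed intervals, so a sliding window is assumed here, and the class name ErrorThreshold and its methods are hypothetical rather than part of any HP component.

```python
from collections import deque
from datetime import datetime, timedelta


class ErrorThreshold:
    """Counts errors inside a sliding time window (an assumed policy)."""

    def __init__(self, limit: int, window: timedelta):
        self.limit = limit
        self.window = window
        self._times = deque()  # timestamps of errors still inside the window

    def record(self, when: datetime) -> bool:
        """Record one error; return True if the window now holds `limit` errors.

        Timestamps are assumed to arrive in non-decreasing order.
        """
        self._times.append(when)
        cutoff = when - self.window
        # Drop errors that have aged out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) >= self.limit


# Figures taken from the Front Side Bus row: 80 errors in 24 hours.
fsb_threshold = ErrorThreshold(limit=80, window=timedelta(hours=24))
```

Under this assumption, errors arriving one per hour never trip the threshold, because at most 24 of them fall inside any 24-hour window, while a burst of 80 errors within the window trips it on the 80th.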
MCA Monitor
This component is an exclusive feature of the HP Integrity server and runs as a software
service under Windows. The service monitors servers for machine checks. MCA Monitor
gathers and maintains a log of the machine checks for use by service and support personnel.
Type of Errors
MCA Monitor captures and saves raw data from CPE and CMC records. Windows receives
these records from server firmware through the SAL_GET_STATE_INFO call.
Methodology
MCA Monitor (hpmcalog.exe) registers with the WMI service to receive CPE and CMC
records, as the PFM service does. Unlike the PFM service, MCA Monitor does not analyze the
records it receives; instead, it saves the raw data in binary files. Field service personnel can
analyze these files with the MCA analyzer tool when more details about an error are
needed.
Output
Binary log files are saved in the directory:
%windir%\system32\cpqmgmt\cqmgserv\hpmcalog
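
As a convenience for locating these files, the following sketch builds the directory path from a %windir% value and lists whatever files it finds, oldest first. The helper names hpmcalog_dir and list_logs are hypothetical. The source does not describe the file names, extensions, or internal format of the logs, so the sketch does not filter or parse them.

```python
import os
from pathlib import PureWindowsPath


def hpmcalog_dir(windir: str) -> PureWindowsPath:
    """Build the log directory path from a %windir% value."""
    return PureWindowsPath(windir, "system32", "cpqmgmt", "cqmgserv", "hpmcalog")


def list_logs(directory: str) -> list:
    """Return (file name, size in bytes) for each regular file, oldest first."""
    with os.scandir(directory) as entries:
        found = [
            (entry.stat().st_mtime, entry.name, entry.stat().st_size)
            for entry in entries
            if entry.is_file()  # skip subdirectories
        ]
    # Sort by modification time; ties fall back to the file name.
    return [(name, size) for _, name, size in sorted(found)]
```

On the server itself, calling list_logs(str(hpmcalog_dir(os.environ["windir"]))) would list the directory; reading it may require administrative rights. Interpreting the contents is left to the MCA analyzer tool.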
Server Agent
Server Agent collects and reports hardware configuration data and monitors the health status
of the server's environmental and power subsystems. On a cellular system, information
collected includes cabinets, partitions, cells, and I/O chassis configuration. On all systems,