HP XC System Software Administration Guide Version 2.1
• shownode metrics paging
• shownode metrics sensors (supported on the XC6000 only)
• shownode metrics swap
You can use either the pdsh command or the cexec command to execute these commands
remotely on another node. See Section 1.4.3 for more information.
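
As a rough illustration of remote execution, the following Python sketch builds a pdsh command line that runs a shownode metrics subcommand on a set of nodes. The node names n3 and n4 and the helper name remote_shownode are hypothetical, and the sketch assumes that shownode is on the default path of the target nodes. It only constructs and prints the command; it does not run it.

```python
import shlex

def remote_shownode(nodes, metric):
    """Build a pdsh command that runs `shownode metrics <metric>` remotely.

    `nodes` is a list of node names (hypothetical here); `metric` is a
    shownode metrics subcommand such as "paging", "sensors", or "swap".
    """
    # pdsh -w takes a comma-separated list of target nodes.
    return ["pdsh", "-w", ",".join(nodes), "shownode", "metrics", metric]

cmd = remote_shownode(["n3", "n4"], "swap")
print(shlex.join(cmd))  # pdsh -w n3,n4 shownode metrics swap
```

To run the command rather than print it, the list could be passed to subprocess.run on a node where pdsh is installed.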
See Section 6.4 and the shownode(8) manpage for more information.
6.2.2 Nagios
The HP XC System Software uses the Nagios Open Source monitoring application to gather
and display system statistics, such as processor load and disk usage. Nagios is a system and
network monitoring application. It watches hosts and services that you specify and alerts
you when problems occur or are resolved. On the HP XC system, Nagios is integrated with
Supermon for monitoring capabilities.
The HP XC system automatically configures the Nagios environment based on the configuration
of the system. The autoconfiguration is based on the information in the HP XC configuration
and management database (cmdb). The configuration is updated as a result of changes to the
HP XC database. Autoconfiguration includes setting up Nagios configuration templates mapped
to the configuration for both hosts and services.
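
To make the idea of template-based autoconfiguration concrete, the following Python sketch renders Nagios host object definitions from a list of node records. It is not the HP XC implementation: the node names, addresses, the generic-host template name, and the render_hosts helper are all assumptions for illustration, and the in-memory list merely stands in for data that the real system reads from the cmdb.

```python
# Standard Nagios object-definition syntax for a host. The doubled braces
# are literal braces in a str.format template.
HOST_TEMPLATE = """define host {{
    use        {template}
    host_name  {name}
    address    {address}
}}
"""

def render_hosts(nodes, template="generic-host"):
    """Render one Nagios host definition per (name, address) pair.

    `template` is the name of a Nagios host template to inherit from;
    "generic-host" is an assumed name, not one taken from the HP XC system.
    """
    return "\n".join(
        HOST_TEMPLATE.format(template=template, name=name, address=address)
        for name, address in nodes
    )

# Hypothetical node records standing in for rows from the cmdb.
nodes = [("n1", "172.20.0.1"), ("n2", "172.20.0.2")]
print(render_hosts(nodes))
```

Regenerating the definitions whenever the node list changes mirrors, in a simplified way, how the configuration follows changes to the database.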
Nagios reports information collected using the Supermon infrastructure. The data collected by
Supermon includes both system performance metrics and environmental data, such as fan,
temperature, and power supply status. This data is collected on a regular basis by the Supermon
daemon on the head node.
Table 6-1 lists the services monitored by Nagios and the type of function monitored for
each service.
Table 6-1: Monitored Services

Service            Function                         Comments
-----------------  -------------------------------  -------------------------------
SLURM              Is alive
Nagios             Is alive and accepting requests
Syslog-ng          Is alive
Apache             Is alive and accepting requests
DHCP               Is alive and accepting requests
Ethernet switches  Simple status
syslog events      Various
Supermon           Various
Performance        Various                          Based on Supermon gathered data
Hosts              Is alive
Database           Is alive
Pdsh               Is alive
Supermon mond      Is alive and accepting requests
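
Table 6-1 distinguishes services that are only checked for being alive from those that are also checked for accepting requests. The following Python sketch shows one way an "accepting requests" check could be expressed for a TCP service, using the standard Nagios plugin return codes (0 for OK, 2 for CRITICAL). It is an illustration only: the check_accepting helper is hypothetical, and the checks that the HP XC system actually uses are not described here.

```python
import socket

OK, CRITICAL = 0, 2  # standard Nagios plugin return codes

def check_accepting(host, port, timeout=5.0):
    """Return (status, message) for a TCP service.

    The status is OK if the service accepts a TCP connection within
    `timeout` seconds, and CRITICAL if the connection is refused or
    times out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} is accepting connections"
    except OSError as exc:  # covers refused connections and timeouts
        return CRITICAL, f"CRITICAL - {host}:{port} is not accepting connections ({exc})"
```

A Nagios plugin built on this sketch would print the message and exit with the status code, for example by passing the status to sys.exit.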