HP XC System Software Administration Guide Version 4.0

Index
    monitoring, 203
    resource information, 202
    short RUN_WINDOW for queue, 264
    shutting down, 197
    starting up, 196
    troubleshooting, 263–265
LSF with SLURM failover, 204, 264
    running jobs, 205
LSF with SLURM integration, 190
LSF with SLURM interplay, 212
LSF with SLURM jobs
    controlling, 198
    monitoring, 198
lsf.conf file, 207
lshosts command, 202
lsload command, 202
LVS, 39, 58
    director service, 24
M
manage_enclosure command, 31
manage_mcs_status command, 307, 308
managedb command, 31
    archive, 81
    backup, 81
    dump, 83
    purge, 82
    restore, 81, 82
management hub services, 25
management nodes, 24
managing licenses, 77
manpages, 21
MCS, 307–310
    log files, 309
    MCS cluster monitor, 309
    MCS traps monitor, 309
MCS device
    as Nagios host, 309
    monitored by Nagios, 307
    status, 307
mcs.ini file, 307
mcs_config command, 308
mcs_local.cfg file, 307
mcs_trends.log file, 309
mcs_trends.staticdb file, 309
mdadm command, 230, 274
    examining RAID array, 230
mdadm utility, 21
mirroring, 229
modifying a local user account, 160
Modular Cooling System (see MCS)
modulefile
    loading, 40, 215
    managing, 40, 215
    unloading, 40, 215
    viewing available, 40, 215
    viewing loaded, 40, 215
monitoring
    hierarchy, 86
    strategy, 86
monitoring SLURM, 182
monitoring tools, 85
mounting file systems, 217
MPICH, 305–306
MUNGE authentication package, 262
Myrinet system interconnect
    administrative password, 163
    diagnostic tools, 240
    troubleshooting, 254–255
MySQL, 25, 35, 79
    accessing, 79
    cannot connect to MySQL server, 247
N
Nagios, 59, 105–130, 182
    changing default user name, 119
    configuration files, 37
    configuring, 120
    customizing for MCS monitoring, 307
    default alert message format, 125
    default settings, 123
    determining status of nagios service, 249
    files, 107
    global settings, 118
    host, 106, 110
    log files, 249
    LSF monitoring, 203
    LSF with SLURM monitoring, 203
    main window, 107
    MCS monitoring, 307
    menu, 87
    messages reported by, 251–254
    optional configuration, 120
    restarting, 115
    Service Detail View, 111
    Service Problems View, 112
    stopping, 115
    Tactical Overview, 110
    troubleshooting, 249
    updating configuration, 115
    views, 110
    web interface, 107
Nagios alert messages, 125
    default format, 125
    forwarding, 116
Nagios plug-in, 120, 121
    disabling, 120
    MCS, 309
    running manually, 250
Nagios report generator utility (see nrg utility)
nagios_monitor service, 59
nagios_vars.ini file, 117
    MCS monitoring, 307
Nan, 127
nand daemon (see Nan)
NAT, 39
    administration, 131
    client, 131