HP XC System Software Administration Guide Version 3.1
SFS
Nagios host, 102, 105
sftp command, 40
shownode command, 32, 37, 52–53, 57–58, 77–79, 88
shownode metrics command, 85–87
si_cpimage command, 137
Simple Linux Utility for Resource Management (see SLURM)
sinfo command, 169, 241
single system view, 38
SLURM, 26, 32, 157
configuration files, 37
deactivating, 174
draining nodes, 170
Pluggable Authentication Module, 165
recognizing new node, 173
removing, 174
troubleshooting, 240
SLURM administration, 157
SLURM configuration, 158
nodes, 160
partitions, 161
servers, 159
system interconnect support, 159
SLURM daemon, 157
SLURM monitoring, 169
SLURM server
backup, 159
primary, 159
slurm.conf file, 165, 167, 190–191
slurmctld daemon, 157
slurmd daemon, 157
SMART, 85
smartd daemon, 85
software distribution
procedure, 129
software patches
installing and distributing, 130
software RAID, 211
boot block maintenance, 248
disk replacement, 246
documentation, 22
error reporting, 213
installation, 211
logical devices, 211
mdadm utility, 22
overview, 211
partitions, 211
removing from client nodes, 213
requirements for implementing Software RAID-1, 211
Software RAID-0, 211
Software RAID-1, 211
squeue utility, 169
ssh, 147
port, 143
transport mechanism, 33
ssh command, 40
ssh keys, 151
troubleshooting mismatched, 229
ssh_create_shared_keys command, 32, 151
startsys command, 32, 50–51, 53, 135, 139, 242, 244–246
stopsys command, 32, 52, 139
striping, 211
Supermon, 56–57, 83–84, 101
supermond service, 57
superuser password
changing, 152
swmlogger daemon, 223
sys_check utility, 32, 41, 215
syslog service, 56, 83, 87
syslog-ng
configuration files, 37
syslog-ng rules files
modifying, 88
templates, 88
syslog-ng service, 83–84
syslog-ng.conf rules file, 87–88
syslogng_forward service, 57, 87–88
system environment data, 85
system event log, 119
system event log messages
rules file, 118
system interconnect
troubleshooting, 235
system statistics, 85
systemconfigurator command, 248
SystemImager, 26, 38, 137
configuration files, 37
T
/tmp directory, 28
transfer_from_avail command, 32, 47
in a full imaging installation, 139
shutting down the system, 52
transfer_to_avail command, 32, 47
imaging and starting nodes, 51
in a full imaging installation, 139
starting nodes, 51
troubleshooting
general, 229
InfiniBand, 238
LSF-HPC with SLURM, 241
mismatched ssh keys, 229
Myrinet system interconnect, 235
Nagios, 229–232
Quadrics system interconnect, 236
SLURM, 240
system interconnect, 235
U
unit identifier LED, 53
updateimage command, 32, 136, 138
updgi_exclude_file, 139
user accounts (see local user accounts)
user authentication, 165
/usr directory, 28