HP XC System Software Administration Guide Version 4.0

15.8 Maintaining the SLURM Daemon Log
By default, SLURM daemon logs are stored in /var/slurm/log/ on each node that runs SLURM
daemons. The slurmctld controller daemon writes to the slurmctld.log file, and the slurmd
daemon writes to the slurmd.log file. These log files and their location are configured in the
slurm.conf file. You can view this information with the scontrol command, as follows:
# scontrol show config | grep LogFile
SlurmctldLogFile = /var/slurm/log/slurmctld.log
SlurmdLogFile = /var/slurm/log/slurmd.log
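If a script needs these paths, it can read them from the scontrol output instead of hard-coding
them. The following is a minimal sketch, not part of the HP XC tooling: the function name
slurm_config_value is hypothetical, and it assumes the "Key = Value" layout shown above.

```shell
# Print the value of one key from "scontrol show config" output read on stdin.
# slurm_config_value is a hypothetical helper name used only for illustration.
slurm_config_value() {
    # Fields are: key, "=", value. awk splits on any run of whitespace.
    awk -v key="$1" '$1 == key { print $3 }'
}

# Example use on a node where scontrol is available (left commented out here):
# scontrol show config | slurm_config_value SlurmdLogFile
```

Matching the key exactly in the first field avoids picking up other settings whose names merely
contain the same text.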
Over time, these logs can become large, particularly if you increase the SLURM daemon debugging level:
# scontrol show config | grep -i debug
SlurmctldDebug = 3
SlurmdDebug = 3
The daemon debug value ranges from 1 to 7, with 7 being very verbose. The default value is 3.
To cache (set aside) these log files without disrupting SLURM operation, rename them. Choose
descriptive new names if you intend to archive the files:
# mv /var/slurm/log/slurmctld.log{,.old}
# mv /var/slurm/log/slurmd.log{,.old}
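If you intend to keep several generations of a log, a date stamp may be a more descriptive
suffix than .old. The following sketch assumes a YYYYMMDD stamp; the function name
rename_with_stamp and the stamp format are illustrative choices, not something the HP XC
software prescribes.

```shell
# Rename a log file by appending a stamp, for example slurmd.log -> slurmd.log.20240101.
# rename_with_stamp is a hypothetical helper; the date format is an assumption.
rename_with_stamp() {
    log=$1
    # Use the second argument as the stamp if given, otherwise today's date.
    stamp=${2:-$(date +%Y%m%d)}
    # Do nothing (and report failure) if the log file does not exist.
    [ -f "$log" ] || return 1
    mv "$log" "$log.$stamp"
}

# Example use (left commented out here):
# rename_with_stamp /var/slurm/log/slurmd.log
```

Unlike the {,.old} form, which relies on shell brace expansion, this version spells out both
names and so also works in a plain POSIX shell.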
Use the pdsh command to rename the files systemwide:
# scontrol ping
Slurmctld(primary/backup) at n16/n15 are UP/UP
# pdsh -w n[15-16] 'mv /var/slurm/log/slurmctld.log{,.old}'
# pdsh -a 'mv /var/slurm/log/slurmd.log{,.old}'
The SLURM daemons continue to write to the renamed files because they still hold those files
open. To have the daemons open new log files under the original names, issue the following command:
# scontrol reconfig
The SLURM daemons now write to the originally named log files, and you can archive or delete
the renamed files.
You can automate the procedure for caching SLURM log files by using a cron job on the head
node set for an interval appropriate for your site.
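A script that such a cron job runs could look like the following sketch, which strings together
the pdsh and scontrol commands shown earlier in this section. The variable names CTLD_NODES and
LOG_DIR, their default values, and the function name rotate_slurm_logs are assumptions for
illustration; set them to match the controller nodes and log directory at your site.

```shell
#!/bin/sh
# Sketch of a log-caching script intended to be run by cron on the head node.
# CTLD_NODES and LOG_DIR are hypothetical settings; the defaults mirror the
# examples in this section (see "scontrol ping" and "scontrol show config").
CTLD_NODES=${CTLD_NODES:-"n[15-16]"}
LOG_DIR=${LOG_DIR:-/var/slurm/log}

rotate_slurm_logs() {
    # $1 = suffix for the renamed files, for example a date stamp
    suffix=$1
    # Rename the controller log on the primary and backup controller nodes.
    pdsh -w "$CTLD_NODES" "mv $LOG_DIR/slurmctld.log $LOG_DIR/slurmctld.log.$suffix" || return 1
    # Rename the slurmd log on all nodes.
    pdsh -a "mv $LOG_DIR/slurmd.log $LOG_DIR/slurmd.log.$suffix" || return 1
    # Have the daemons start writing to the originally named log files again.
    scontrol reconfig
}

# Example invocation (commented out so that the definitions alone have no side effects):
# rotate_slurm_logs "$(date +%Y%m%d)"
```

The renamed files stay in place on each node, so archiving or deleting them remains a separate
step that you can schedule to suit your site.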
15.9 Enabling SLURM to Recognize a New Node
Use the following procedure to enable SLURM to recognize a new node, that is, a node known
to the HP XC system but not managed by SLURM.
This procedure adds node n9 to the SLURM lsf partition, which already consists of nodes n1
through n8.
1. Log in to the head node as the superuser (root).
2. Log in to the node to be added to gather information on the node's characteristics:
a. Log in to the node:
# ssh n9
Last login: date and time stamp from node name
Linux for High Performance Computing
This product is based on Red Hat Enterprise Linux version version source
packages found on ftp.redhat.com. Red Hat(R) is a registered
trademark of Red Hat, Inc. This disc is not a product of Red Hat,
Inc. and is not endorsed by Red Hat, Inc. This is a product of
Hewlett-Packard Company.
[root@n9]#
b. Determine the number of cores (processors) on the node: