HP XC System Software Administration Guide Version 3.1
mmand <srun sleep 50>
date time: Submitted from host <lsfhost.localdomain>, CWD <$HOME>, Ou
tput File <./>, 8 Processors Requested;
date time: Started on 8 Hosts/Processors <8*lsfhost.localdomain>, Exe
cution Home <hptc_cluster/hptc_cluster>, Execution CWD <hptc/hptc;
_cluster/lsf/home>;
date time: slurm_id=7;ncpus=8;slurm_alloc=n[1-4];
SCHEDULING PARAMETERS:
           r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem
loadSched     -     -     -     -     -     -     -     -     -     -     -
loadStop      -     -     -     -     -     -     -     -     -     -     -
Note the output that identifies the SLURM_JOBID and the SLURM allocation:
date time: slurm_id=7;ncpus=8;slurm_alloc=n[1-4];
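As a minimal sketch, the SLURM job ID and the SLURM allocation can be extracted from that line with standard shell tools. The sample line is copied from the output above. The helper name parse_slurm_field is hypothetical, and the sketch assumes the whole line arrives unwrapped; in practice the input would be taken from the bjobs -l output for the job.

```shell
# Sample line copied from the bjobs -l output shown above.
line='date time: slurm_id=7;ncpus=8;slurm_alloc=n[1-4];'

# parse_slurm_field (hypothetical helper): print the value of one
# "key=value;" field found in a line of text.
#   $1 = field name, $2 = line of text
parse_slurm_field() {
    printf '%s\n' "$2" | sed -n "s/.*$1=\([^;]*\);.*/\1/p"
}

slurm_id=$(parse_slurm_field slurm_id "$line")
slurm_alloc=$(parse_slurm_field slurm_alloc "$line")

echo "SLURM job ID: $slurm_id"
echo "SLURM allocation: $slurm_alloc"
```

The extracted job ID is the value to pass to SLURM commands such as squeue -j.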
You can use the SLURM_JOBID with various SLURM commands. For example, use the squeue command
to view information about jobs in the SLURM scheduling queue, and use the scontrol show command
to display the state of the job.
$ squeue -j 7
JOBID PARTITION     NAME     USER ST   TIME NODES NODELIST
    7       lsf hptclsf@ lsfadmin  R   0:14     4 n[1-4]
$ scontrol show job 7
JobId=7 UserId=lsfadmin(502) GroupId=lsfadmin(503)
Name=LSFclustername@LSF_JOBID JobState=RUNNING
Priority=4294901755 Partition=lsf BatchFlag=0
AllocNode:Sid=n16:27450 TimeLimit=UNLIMITED
StartTime=10/11-17:54:05 EndTime=NONE
NodeList=n[1-4] NodeListIndecies=0,3,-1
ReqProcs=0 MinNodes=0 Shared=0 Contiguous=0
MinProcs=0 MinMemory=0 Features=(null) MinTmpDisk=0
ReqNodeList=(null) ReqNodeListIndecies=-1
ExcNodeList=(null) ExcNodeListIndecies=-1
The Name= field in the output of the scontrol show command contains the name of the LSF cluster (the
installation default is hptclsf) and the LSF-HPC with SLURM job number, separated by the at sign (@).
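The following sketch shows one way to split such a name into its two parts with shell parameter expansion. The value hptclsf@24 is a hypothetical example; the job number 24 is made up for illustration and does not come from the output above.

```shell
# Hypothetical example value in the form clustername@jobnumber.
name='hptclsf@24'

lsf_cluster=${name%%@*}   # text before the first @
lsf_jobid=${name#*@}      # text after the first @

echo "LSF cluster: $lsf_cluster"
echo "LSF job number: $lsf_jobid"
```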
The bhist command reports the history of a job.
After you have gathered information about a job, you can use other useful LSF commands to control
LSF-HPC with SLURM jobs: bkill, bstop, and bresume.
The bkill command kills a running job. This command uses the SLURM scancel command.
The bstop command suspends the execution of a running job.
The bresume command resumes the execution of a suspended job.
For more information, see bkill(1), bstop(1), and bresume(1).
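As an illustration only, the following sketch shows the order in which these commands might be applied to one job. Each command takes the LSF job ID as its argument. The job ID 24 and the helper function run are hypothetical. With DRY_RUN set to 1, the sketch prints each command instead of executing it, so it can be tried on a system without LSF.

```shell
# Hypothetical LSF job ID, used only for illustration.
jobid=24

# run (hypothetical helper): print the command when DRY_RUN=1,
# otherwise execute it.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run bstop "$jobid"     # suspend the running job
run bresume "$jobid"   # resume the suspended job
run bkill "$jobid"     # kill the job
```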
15.9 Job Accounting
Standard LSF job accounting using the bacct command is available. The output of a job contains total
CPU time and memory usage:
$ cat 231.out
.
.