HP XC System Software Administration Guide Version 2.1
You can use the -l (long) option to obtain detailed information about a job, as shown in this
example.
$ bjobs -l 116
Job <116>, User <lsfadmin>, Project default, Status <RUN>, Queue <normal>, Co
mmand <srun sleep 50>
date time: Submitted from host <lsfhost.localdomain>, CWD <$HOME>, Ou
tput File <./>, 8 Processors Requested;
date time: Started on 8 Hosts/Processors <8*lsfhost.localdomain>, Exe
cution Home <hptc_cluster/hptc_cluster>, Execution CWD <hptc/hptc;
_cluster/lsf/home>;
date time: slurm_id=7;ncpus=8;slurm_alloc=n[1-4];
SCHEDULING PARAMETERS:
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched ---- -------
loadStop ---- -------
Note the output that identifies the SLURM_JOBID and the SLURM allocation:
date time: slurm_id=7;ncpus=8;slurm_alloc=n[1-4];
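The SLURM_JOBID can also be pulled out of the bjobs -l output in a script. The following is a minimal sketch, not part of the product; the function name slurm_id_from_bjobs is hypothetical, and the sketch assumes that the slurm_id=N token appears unbroken on a single line of the output, as in the example above.

```shell
# Hypothetical helper: read "bjobs -l" output on standard input and
# print the first SLURM job ID found in a "slurm_id=N" token.
# Assumption: the token is not split across wrapped output lines.
slurm_id_from_bjobs() {
    sed -n 's/.*slurm_id=\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Under these assumptions, piping the output of bjobs -l 116 into slurm_id_from_bjobs would print 7 for the job shown above. If no slurm_id token is present, the function prints nothing.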
You can use the SLURM_JOBID with various SLURM commands, particularly the squeue
command, which displays information about jobs in the SLURM scheduling queue, and the
scontrol show command, which displays the state of the job.
$ squeue -j 7
JOBID PARTITION NAME USER ST TIME NODES NODELIST
7 lsf hptclsf@ lsfadmin R 0:14 4 n[1-4]
$ scontrol show job 7
JobId=7 UserId=lsfadmin(502) GroupId=lsfadmin(503)
Name=LSFclustername@LSF_JOBID JobState=RUNNING
Priority=4294901755 Partition=lsf BatchFlag=0
AllocNode:Sid=n16:27450 TimeLimit=UNLIMITED
StartTime=10/11-17:54:05 EndTime=NONE
NodeList=n[1-4] NodeListIndecies=0,3,-1
ReqProcs=0 MinNodes=0 Shared=0 Contiguous=0
MinProcs=0 MinMemory=0 Features=(null) MinTmpDisk=0
ReqNodeList=(null) ReqNodeListIndecies=-1
ExcNodeList=(null) ExcNodeListIndecies=-1
The Name= field in the output of the scontrol show command contains the name of the
LSF cluster (the installation default is hptclsf) and the LSF-HPC job number, separated
by the at character (@).
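Because the cluster name and the job number are joined by a single @ character, a script can split them apart to map a SLURM job back to its LSF-HPC job. The following is a minimal sketch under stated assumptions: the function name lsf_name_from_scontrol is hypothetical, the Name= field is assumed to start its line as in the output above, and the value hptclsf@116 mentioned afterward is an illustrative combination of the default cluster name and the job number used in this section.

```shell
# Hypothetical helper: read "scontrol show job" output on standard input
# and print "<LSF cluster name> <LSF job number>".
# Assumption: the Name= field begins a line, as in the sample output.
lsf_name_from_scontrol() {
    name=$(sed -n 's/^[[:space:]]*Name=\([^ ]*\).*/\1/p' | head -n 1)
    case $name in
        *@*)
            # Text before the last "@" is the cluster name;
            # text after it is the LSF job number.
            printf '%s %s\n' "${name%@*}" "${name##*@}"
            ;;
        *)
            # No Name=<cluster>@<job> field was found.
            return 1
            ;;
    esac
}
```

For a job whose Name= field is hptclsf@116, piping the output of scontrol show job 7 into this function would print hptclsf 116.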
The bhist command reports the history of a job.
Once you have gathered information about a job, you can use other LSF commands to
control LSF-HPC jobs; these commands are bkill, bstop, and bresume.
The bkill command kills a running job. This command uses the SLURM scancel
command.
The bstop command suspends the execution of a running job.
The bresume command resumes the execution of a suspended job.
See the bkill(1), bstop(1), and bresume(1) manpages for more information.
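The three job-control commands each take an LSF job ID as their argument. The following is a minimal sketch of a wrapper that selects among them; the function name lsf_job_ctl, its action keywords, and the DRY_RUN variable are hypothetical conveniences for illustration and are not part of LSF-HPC.

```shell
# Hypothetical helper: apply one LSF job-control command to a job ID.
# Usage: lsf_job_ctl suspend|resume|kill JOBID
# Set DRY_RUN=1 to print the command instead of running it.
lsf_job_ctl() {
    action=$1
    jobid=$2
    case $action in
        suspend) cmd=bstop ;;    # suspend a running job
        resume)  cmd=bresume ;;  # resume a suspended job
        kill)    cmd=bkill ;;    # kill a running job
        *)
            echo "usage: lsf_job_ctl suspend|resume|kill JOBID" >&2
            return 2
            ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd $jobid"
    else
        "$cmd" "$jobid"
    fi
}
```

With DRY_RUN set to 1, calling lsf_job_ctl suspend 116 prints bstop 116 without touching the job, which is a convenient way to check a script before running it against live jobs.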
12.8 Job Accounting
Standard LSF job accounting using the bacct command is available, but job resource
usage is reported with a granularity of minutes.
Per-processor accuracy in reporting usage is not available in this release.