HP XC System Software Administration Guide Version 3.0

Example 13-7. Basic Job Launch with the JOB_STARTER Script Configured
$ bsub -I hostname
Job <24> is submitted to default queue <normal>.
<<Waiting for dispatch...>>
<<starting on lsfhost.localdomain>>
n99
Monitoring and Controlling LSF-HPC Jobs
All the standard LSF commands for monitoring a job are supported. The bjobs command reports the status
of a job. The following is an example of the bjobs command:
$ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
116 lsfadmi RUN normal lsfhost.loc 8*lsfhost.l * sleep 50 date time
You can use the -l (long) option to obtain detailed information about a job, as shown in this example:
$ bjobs -l 116
Job <116>, User <lsfadmin>, Project default, Status <RUN>, Queue <normal>, Co
mmand <srun sleep 50>
date time: Submitted from host <lsfhost.localdomain>, CWD <$HOME>, Ou
tput File <./>, 8 Processors Requested;
date time: Started on 8 Hosts/Processors <8*lsfhost.localdomain>, Exe
cution Home <hptc_cluster/hptc_cluster>, Execution CWD <hptc/hptc;
_cluster/lsf/home>;
date time: slurm_id=7;ncpus=8;slurm_alloc=n[1-4];
SCHEDULING PARAMETERS:
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - - - - - - - - -
loadStop - - - - - - - - - - -
Note the output that identifies the SLURM_JOBID and the SLURM allocation:
date time: slurm_id=7;ncpus=8;slurm_alloc=n[1-4];
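If you need these values in a script, one possible approach is to filter the bjobs -l output for the field you want. The following is a minimal sketch, not part of LSF-HPC or SLURM: slurm_field is a hypothetical helper name, and the sketch assumes that the slurm_id record appears on a single line and is not split by the line wrapping that bjobs -l applies to long records.

```shell
# Sketch only: slurm_field is a hypothetical helper, not an LSF or SLURM command.
# It reads `bjobs -l` text on standard input and prints the value of one field
# from the "slurm_id=...;ncpus=...;slurm_alloc=...;" record.
# Assumption: the record is on one line (not broken by bjobs line wrapping).
slurm_field() {
    # $1 = field name: slurm_id, ncpus, or slurm_alloc
    # Capture everything between "<field>=" and the next semicolon,
    # and keep only the first matching line.
    sed -n "s/.*$1=\([^;]*\);.*/\1/p" | head -n 1
}

# Possible usage on a system where job 116 is running:
#   bjobs -l 116 | slurm_field slurm_id       # SLURM job ID
#   bjobs -l 116 | slurm_field slurm_alloc    # allocated nodes
```

With the sample output shown above, the helper would print 7 for slurm_id and n[1-4] for slurm_alloc.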
You can use the SLURM_JOBID with various SLURM commands. For example, use the squeue command to
view information about the job in the SLURM scheduling queue, and use the scontrol show job command
to display the state of the job:
$ squeue -j 7
JOBID PARTITION NAME USER ST TIME NODES NODELIST
7 lsf hptclsf@ lsfadmin R 0:14 4 n[1-4]
$ scontrol show job 7
JobId=7 UserId=lsfadmin(502) GroupId=lsfadmin(503)
Name=LSFclustername@LSF_JOBID JobState=RUNNING
Priority=4294901755 Partition=lsf BatchFlag=0
AllocNode:Sid=n16:27450 TimeLimit=UNLIMITED
StartTime=10/11-17:54:05 EndTime=NONE
NodeList=n[1-4] NodeListIndecies=0,3,-1
ReqProcs=0 MinNodes=0 Shared=0 Contiguous=0
MinProcs=0 MinMemory=0 Features=(null) MinTmpDisk=0
ReqNodeList=(null) ReqNodeListIndecies=-1
ExcNodeList=(null) ExcNodeListIndecies=-1
The Name= field in the scontrol show job output contains the name of the LSF cluster (the installation
default is hptclsf) and the LSF-HPC job number, separated by the at sign (@).
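Because the job name has the form clustername@jobid, a script can map a SLURM job back to its LSF-HPC job by splitting the name at the at sign. The following is a minimal sketch that uses standard POSIX shell parameter expansion. The helper name split_slurm_name is hypothetical, and the value hptclsf@116 is an illustrative name built from the default cluster name and the job number used in the examples in this section.

```shell
# Sketch only: split_slurm_name is a hypothetical helper, not a SLURM command.
# $1 = value of the Name= field, in the form <LSF cluster name>@<LSF job ID>
split_slurm_name() {
    # ${1%@*}  removes the shortest trailing "@..." and leaves the cluster name.
    # ${1##*@} removes everything through the last "@" and leaves the job ID.
    printf 'cluster=%s lsf_jobid=%s\n' "${1%@*}" "${1##*@}"
}

# Illustrative call:
#   split_slurm_name hptclsf@116
```

The LSF job ID obtained this way can then be passed to LSF commands such as bjobs -l.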
The bhist command reports the history of a job.
After you have gathered information about a job, you can control it with the following LSF commands:
bkill, bstop, and bresume.
The bkill command kills a running job. This command uses the SLURM scancel command.