Platform LSF Administration Guide Version 6.2
Suspending and Resuming Jobs
Administering Platform LSF
A job can be suspended by its owner or the LSF administrator. These jobs are
considered user-suspended and are displayed by bjobs as USUSP.
If a user suspends a high-priority job from a non-preemptive queue, the load may
become low enough for LSF to start a lower-priority job in its place. The load created
by the lower-priority job can then prevent the high-priority job from resuming. This
can be avoided by configuring preemptive queues.
Suspending a job
Run bstop job_ID. Your job goes into USUSP state if the job is already started, or
into PSUSP state if it is pending. For example:
% bstop 3421
Job <3421> is being stopped
This command suspends job 3421.
UNIX
bstop sends the following signals to the job:
◆ SIGTSTP for parallel or interactive jobs
SIGTSTP is caught by the master process and passed to all the slave processes
running on other hosts.
◆ SIGSTOP for sequential jobs
SIGSTOP cannot be caught by user programs. The SIGSTOP signal can be
configured with the LSB_SIGSTOP parameter in lsf.conf.
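The difference between the two signals can be checked on any UNIX host without LSF. The following is a minimal sketch in Python; the helper name can_install_handler is hypothetical and is not part of LSF. It only shows that the operating system lets a process install a handler for SIGTSTP but refuses one for SIGSTOP.

```python
import signal


def can_install_handler(signum):
    """Return True if this process may install its own handler for signum.

    Hypothetical helper for illustration only; it is not an LSF interface.
    Must be called from the main thread of the Python process.
    """
    try:
        previous = signal.signal(signum, lambda received, frame: None)
    except (OSError, ValueError):
        # The kernel rejects handlers for signals that cannot be caught.
        return False
    if previous is not None:
        # Put the original handler back so the probe has no lasting effect.
        signal.signal(signum, previous)
    return True


if __name__ == "__main__":
    print("SIGTSTP can be caught:", can_install_handler(signal.SIGTSTP))
    print("SIGSTOP can be caught:", can_install_handler(signal.SIGSTOP))
```

A parallel job's master process relies on the first property to forward the suspension to its slave processes. A sequential job has no opportunity to react to SIGSTOP.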
Windows
bstop causes the job to be suspended.
Resuming a job
Run bresume job_ID. For example:
% bresume 3421
Job <3421> is being resumed
This command resumes job 3421.
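Scripts that call bstop or bresume often need to confirm which job was affected. The following Python sketch assumes the confirmation message has exactly the form shown in the examples in this section. The function names parse_confirmation and run_job_command are hypothetical, and the sketch does not cover other messages the commands may print.

```python
import re
import subprocess

# Assumed message format, copied from the examples in this section.
_CONFIRMATION = re.compile(r"Job <(\d+)> is being (stopped|resumed)")


def parse_confirmation(output):
    """Return (job_id, action) from a confirmation message, or None."""
    match = _CONFIRMATION.search(output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def run_job_command(command, job_id):
    """Run 'bstop job_ID' or 'bresume job_ID' and parse the reply.

    Requires the LSF commands to be on PATH. Both output streams are
    searched because this sketch does not assume which one is used.
    """
    if command not in ("bstop", "bresume"):
        raise ValueError("command must be 'bstop' or 'bresume'")
    result = subprocess.run(
        [command, str(job_id)], capture_output=True, text=True
    )
    return parse_confirmation(result.stdout + result.stderr)


if __name__ == "__main__":
    print(parse_confirmation("Job <3421> is being stopped"))
```

A return value of None means the expected confirmation was not found, so the caller should check the job state with bjobs.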
Resuming a user-suspended job does not put your job into RUN state immediately. If
your job was running before the suspension, bresume first puts your job into SSUSP
state and then waits for sbatchd to schedule it according to the load conditions.
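The state changes described in this section can be summarized as a small model. The following Python sketch is illustrative only: the function names are hypothetical and it does not call LSF. The PSUSP to PEND transition on resume is an assumption, because this section describes only the case of a job that was running before the suspension.

```python
# Hypothetical model of the job states named in this section; not an LSF API.


def bstop_state(state):
    """State after bstop: started jobs go to USUSP, pending jobs to PSUSP."""
    return {"RUN": "USUSP", "PEND": "PSUSP"}.get(state, state)


def bresume_state(state):
    """State immediately after bresume."""
    if state == "USUSP":
        # A job that was running does not return to RUN directly.
        return "SSUSP"
    if state == "PSUSP":
        # Assumption: a job suspended while pending becomes pending again.
        return "PEND"
    return state


def sbatchd_schedule(state, load_allows):
    """State after sbatchd evaluates the load conditions."""
    if state == "SSUSP" and load_allows:
        return "RUN"
    return state


if __name__ == "__main__":
    state = bstop_state("RUN")
    print(state)
    state = bresume_state(state)
    print(state)
    print(sbatchd_schedule(state, load_allows=False))
    print(sbatchd_schedule(state, load_allows=True))
```

The model makes the intermediate step explicit: a resumed job stays in SSUSP for as long as the load conditions do not allow it to run.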