Platform LSF Administration Guide Version 6.2
Default Job Control Actions
Administering Platform LSF
After a job is started, it can be killed, suspended, or resumed by the system, an LSF user,
or LSF administrator. LSF job control actions cause the status of a job to change. LSF
supports the following default actions for job controls:
◆ SUSPEND
◆ RESUME
◆ TERMINATE
On successful completion of the job control action, the LSF job control commands
cause the status of a job to change.
The environment variable LS_EXEC_T is set to the value JOB_CONTROLS for a job
when a job control action is initiated.
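A script that runs in the job's environment could use this variable to tell whether it was started as part of a job control action. The following is a minimal sketch; the function name `started_by_job_control` is hypothetical, and only the variable name and value come from the text above.

```python
import os


def started_by_job_control(environ=None):
    """Return True when LS_EXEC_T is set to JOB_CONTROLS.

    `environ` defaults to the real process environment; passing a dict
    makes the check easy to exercise without LSF.
    """
    env = os.environ if environ is None else environ
    return env.get("LS_EXEC_T") == "JOB_CONTROLS"


if __name__ == "__main__":
    if started_by_job_control():
        print("invoked as part of a job control action")
    else:
        print("not invoked by a job control action")
```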
See “Killing Jobs” on page 149 for more information about job controls and the LSF
commands that perform them.
SUSPEND action
Change a running job from RUN state to one of the following states:
◆ USUSP or PSUSP in response to bstop
◆ SSUSP state when the LSF system suspends the job
The default action is to send the following signals to the job:
◆ SIGTSTP for parallel or interactive jobs
  SIGTSTP is caught by the master process and passed to all the slave processes
  running on other hosts.
◆ SIGSTOP for sequential jobs
  SIGSTOP cannot be caught by user programs. A different signal can be
  configured in place of SIGSTOP with the LSB_SIGSTOP parameter in lsf.conf.
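The default signal choice described above can be summarized as a small lookup. This is an illustrative sketch only, not LSF code: the job-kind labels and the function name `default_suspend_signal` are assumptions, and the sketch ignores any LSB_SIGSTOP override. It assumes a UNIX host, where Python's `signal` module defines both signals.

```python
import signal


def default_suspend_signal(job_kind):
    """Map a job kind to the default SUSPEND signal described in the text.

    `job_kind` is a hypothetical label: "parallel", "interactive",
    or "sequential".
    """
    if job_kind in ("parallel", "interactive"):
        # Catchable, so a master process can relay it to its workers.
        return signal.SIGTSTP
    if job_kind == "sequential":
        # Cannot be caught by user programs.
        return signal.SIGSTOP
    raise ValueError(f"unknown job kind: {job_kind!r}")
```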
LSF invokes the SUSPEND action when:
◆ The user or LSF administrator issues a bstop or bkill command to the job
◆ Load conditions on the execution host satisfy any of:
  ❖ The suspend conditions of the queue, as specified by the STOP_COND
    parameter in lsb.queues
  ❖ The scheduling thresholds of the queue or the execution host
◆ The run window of the queue closes
◆ The job is preempted by a higher-priority job
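The first trigger in the list above is a user or administrator running bstop. A minimal sketch of issuing that command from a script follows, assuming bstop is on the PATH and that the job is identified by a plain numeric job ID. The helper names `bstop_argv` and `suspend_job` are hypothetical, and job array elements are not handled.

```python
import subprocess


def bstop_argv(job_id):
    """Build the argument list for suspending one job with bstop."""
    # int() rejects anything that is not a plain numeric job ID.
    return ["bstop", str(int(job_id))]


def suspend_job(job_id):
    """Run bstop for the given job and return the CompletedProcess.

    check=False leaves it to the caller to inspect the return code and
    the captured output.
    """
    return subprocess.run(
        bstop_argv(job_id), capture_output=True, text=True, check=False
    )
```

Keeping the argument construction separate from the call makes the command line easy to inspect or log before anything is sent to LSF.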
RESUME action
Change a suspended job from SSUSP, USUSP, or PSUSP state to the RUN state. The
default action is to send the signal SIGCONT.
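The state changes described for SUSPEND and RESUME can be sketched as a small transition function. This is a simplified model for illustration, not LSF's implementation: the function name and the `initiator` argument are assumptions, and a user-initiated suspend of a running job is modeled as USUSP only.

```python
SUSPENDED_STATES = {"SSUSP", "USUSP", "PSUSP"}


def apply_action(state, action, initiator="user"):
    """Return the job state after a SUSPEND or RESUME action.

    `initiator` is "user" (for example, bstop) or "system" (LSF itself).
    """
    if action == "SUSPEND":
        if state != "RUN":
            raise ValueError(f"cannot suspend a job in state {state}")
        return "USUSP" if initiator == "user" else "SSUSP"
    if action == "RESUME":
        if state not in SUSPENDED_STATES:
            raise ValueError(f"cannot resume a job in state {state}")
        return "RUN"
    raise ValueError(f"unsupported action: {action}")
```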
LSF invokes the RESUME action when:
◆ The user or LSF administrator issues a bresume command to the job
◆ Load conditions on the execution host satisfy all of:
  ❖ The resume conditions of the queue, as specified by the RESUME_COND
    parameter in lsb.queues