LSF Version 7.3 - Platform LSF Configuration Reference

JOB_CONTROLS=TERMINATE[brequeue]. This causes a
deadlock between the signal and the action.
CHKPNT is a special action that causes the system to checkpoint the job. It is valid only
for the SUSPEND and TERMINATE actions:
If the SUSPEND action is CHKPNT, the job is checkpointed and then stopped
automatically by sending the SIGSTOP signal to the job.
If the TERMINATE action is CHKPNT, the job is checkpointed and then killed
automatically.
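As an illustration only, a queue that uses CHKPNT for both actions might be declared as in the following lsb.queues fragment. The queue name is hypothetical, and every other queue parameter, including whatever checkpoint settings the queue itself requires, is omitted.

```
Begin Queue
QUEUE_NAME   = chkpnt_example
JOB_CONTROLS = SUSPEND[CHKPNT] TERMINATE[CHKPNT]
End Queue
```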
Description
Changes the behavior of the SUSPEND, RESUME, and TERMINATE actions in LSF.
The contents of the configuration line for the action are run with /bin/sh -c so you can
use shell features in the command.
The standard input, output, and error of the command are redirected to the null device,
so you cannot tell directly whether the command runs correctly. The default null device
on UNIX is /dev/null.
The command is run as the user of the job.
All environment variables set for the job are also set for the command action. The following
additional environment variables are set:
LSB_JOBPGIDS: a list of current process group IDs of the job
LSB_JOBPIDS: a list of current process IDs of the job
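Because the command's output goes to the null device, a custom action usually has to record its own results somewhere. The following is a minimal sketch of such a helper, not part of LSF: the function name signal_job_pids, the JC_LOG variable, and the log path are all hypothetical, and the code assumes LSB_JOBPIDS is a whitespace-separated list.

```shell
#!/bin/sh
# Hypothetical helper for a JOB_CONTROLS command. Nothing here is an LSF API;
# only LSB_JOBPIDS comes from the job environment described above.

# The action's stdout and stderr go to the null device, so results are
# appended to a file instead. The path is an assumption; adjust as needed.
JC_LOG="${JC_LOG:-/tmp/job_control.$$.log}"

# Send signal $1 (for example TERM or STOP) to every PID in LSB_JOBPIDS.
# Assumes the list is whitespace-separated.
# Returns 0 if every kill succeeded, 1 if any failed.
signal_job_pids() {
    sig=$1
    rc=0
    for pid in $LSB_JOBPIDS; do
        if kill -s "$sig" "$pid" 2>>"$JC_LOG"; then
            echo "sent $sig to $pid" >>"$JC_LOG"
        else
            echo "failed to send $sig to $pid" >>"$JC_LOG"
            rc=1
        fi
    done
    return $rc
}
```

A script containing this function could call, for instance, signal_job_pids TERM and then inspect the log file afterwards to see which processes were reached.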
For the SUSPEND action command, the following environment variables are also set:
LSB_SUSP_REASONS: an integer representing a bitmap of suspending reasons as
defined in lsbatch.h. The suspending reason allows the command to take
different actions based on the reason for suspending the job.
LSB_SUSP_SUBREASONS: an integer representing the load index that caused the
job to be suspended. When the suspending reason SUSP_LOAD_REASON (suspended
by load) is set in LSB_SUSP_REASONS, LSB_SUSP_SUBREASONS is set to one of the
load index values defined in lsf.h. Use LSB_SUSP_REASONS and
LSB_SUSP_SUBREASONS together in your custom job control to determine the exact
load threshold that caused a job to be suspended.
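A custom SUSPEND command could test the bitmap along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, and the numeric value of SUSP_LOAD_REASON is deliberately not hard-coded, so it must be copied from your own lsbatch.h and passed in as the mask.

```shell
#!/bin/sh
# Hypothetical helpers for a SUSPEND action; not part of LSF.

# Succeed if bitmap $1 has at least one bit of mask $2 set.
has_reason() {
    [ $(( $1 & $2 )) -ne 0 ]
}

# Print a short description of why the job was suspended.
# $1 is the SUSP_LOAD_REASON value taken from lsbatch.h (an input here,
# not a value this sketch claims to know).
describe_suspend() {
    load_mask=$1
    reasons=${LSB_SUSP_REASONS:-0}
    if has_reason "$reasons" "$load_mask"; then
        # Suspended by load: the subreason identifies the load index.
        echo "load index ${LSB_SUSP_SUBREASONS:-unknown}"
    else
        echo "other reason bitmap $reasons"
    fi
}
```

The load index number printed by describe_suspend still has to be mapped to a name using the definitions in lsf.h.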
If an additional action is necessary for the SUSPEND command, that action should also
send the appropriate signal to the application. Otherwise, a job can continue to run even
after being suspended by LSF. For example:
JOB_CONTROLS=SUSPEND[kill $LSB_JOBPIDS; command]
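One way to make sure the signal is always sent is to wrap the extra action and the signal in a single script. The sketch below is illustrative only: suspend_job and JC_LOG are hypothetical names, SIGSTOP is chosen here as the stopping signal, and LSB_JOBPIDS is assumed to be whitespace-separated.

```shell
#!/bin/sh
# Hypothetical SUSPEND wrapper; not part of LSF.

# Run an optional extra command (all arguments), then stop every process
# of the job. A failure of the extra command is logged but never prevents
# the stop signal from being sent.
suspend_job() {
    if [ $# -gt 0 ]; then
        "$@" || echo "extra action failed: $*" >>"${JC_LOG:-/dev/null}"
    fi
    rc=0
    for pid in $LSB_JOBPIDS; do
        kill -s STOP "$pid" || rc=1
    done
    return $rc
}
```

Sending the signal last, and unconditionally, avoids the situation described above in which LSF considers the job suspended while its processes keep running.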
Default
On UNIX, by default, SUSPEND sends SIGTSTP for parallel or interactive jobs and SIGSTOP
for other jobs. RESUME sends SIGCONT. TERMINATE sends SIGINT, SIGTERM and
SIGKILL in that order.
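For comparison, a custom TERMINATE command that follows the same signal order could be sketched as below. This is not LSF's implementation: the function name is hypothetical, the delay between signals is an assumed parameter, and LSB_JOBPIDS is assumed to be whitespace-separated.

```shell
#!/bin/sh
# Hypothetical TERMINATE action that escalates INT, TERM, KILL.
# $1 is the number of seconds to wait between signals (assumed default: 10).
terminate_escalating() {
    delay=${1:-10}
    for sig in INT TERM KILL; do
        signalled=0
        for pid in $LSB_JOBPIDS; do
            # kill fails once the process no longer exists.
            kill -s "$sig" "$pid" 2>/dev/null && signalled=1
        done
        # Stop when nothing is left to signal, or after the final KILL.
        if [ "$signalled" -eq 0 ] || [ "$sig" = KILL ]; then
            return 0
        fi
        sleep "$delay"
    done
}
```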
On Windows, actions equivalent to the UNIX signals implement the default job control
actions. Job control messages replace the SIGINT and SIGTERM signals, but only
customized applications can process them. Termination is implemented by the
TerminateProcess() system call.
lsb.queues