LSF Version 7.3 - Platform LSF Configuration Reference
SUSPEND_CONTROL
Syntax
SUSPEND_CONTROL=signal | command | CHKPNT
Remember:
Unlike the JOB_CONTROLS parameter in lsb.queues, the
SUSPEND_CONTROL parameter does not require square
brackets ([ ]) around the action.
• signal is a UNIX signal name (for example, SIGTSTP). The specified signal is sent to the
job. The same set of signals is not supported on all UNIX systems. To display a list of the
symbolic names of the signals (without the SIG prefix) supported on your system, use the
kill -l command.
• command specifies a /bin/sh command line to be invoked.
• Do not quote the command line inside an action definition.
• Do not specify a signal followed by an action that triggers the same signal. For example,
do not specify SUSPEND_CONTROL=bstop. This causes a deadlock between the signal
and the action.
• CHKPNT is a special action, which causes the system to checkpoint the job. The job is
checkpointed and then stopped by sending the SIGSTOP signal to the job automatically.
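As a sketch of the three forms described above, an application profile might set the parameter in one of the following ways. The profile name and the script path are hypothetical, and the Begin Application / End Application layout is assumed from the usual lsb.applications format rather than taken from this section.

```
Begin Application
# Hypothetical profile name
NAME            = myapp
# Form 1: send a UNIX signal to the job
SUSPEND_CONTROL = SIGTSTP
# Form 2 (alternative): run an unquoted /bin/sh command line (hypothetical path)
# SUSPEND_CONTROL = /usr/local/bin/suspend_hook.sh
# Form 3 (alternative): checkpoint the job, then stop it with SIGSTOP
# SUSPEND_CONTROL = CHKPNT
End Application
```

Only one of the three forms would be active in a given profile; the commented lines are alternatives, not additional settings.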
Description
Changes the behavior of the SUSPEND action in LSF.
• The contents of the configuration line for the action are run with /bin/sh -c so you can
use shell features in the command.
• The standard input, output, and error of the command are redirected to the null device,
so you cannot tell directly whether the command runs correctly. The default null device
on UNIX is /dev/null.
• The command is run as the user of the job.
• All environment variables set for the job are also set for the command action. The following
additional environment variables are set:
    • LSB_JOBPGIDS: a list of current process group IDs of the job
    • LSB_JOBPIDS: a list of current process IDs of the job
    • LSB_SUSP_REASONS: an integer representing a bitmap of suspending reasons as
      defined in lsbatch.h. The suspending reason can allow the command to take different
      actions based on the reason for suspending the job.
    • LSB_SUSP_SUBREASONS: an integer representing the load index that caused the
      job to be suspended
  When the suspending reason SUSP_LOAD_REASON (suspended by load) is set in
  LSB_SUSP_REASONS, LSB_SUSP_SUBREASONS is set to one of the load index values
  defined in lsf.h.
  Use LSB_SUSP_REASONS and LSB_SUSP_SUBREASONS together in your custom job
  control to determine the exact load threshold that caused a job to be suspended.
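A custom SUSPEND command that uses these variables could look roughly like the following /bin/sh sketch. It is illustrative only and makes several assumptions: the script itself, the log path, and the SUSP_LOAD_REASON_MASK variable are hypothetical; the numeric value of SUSP_LOAD_REASON must be looked up in lsbatch.h on your system (the placeholder default of 0 never matches); and LSB_JOBPGIDS is assumed to be a space-separated list. Because the command's output goes to the null device, the sketch writes its messages to a log file instead.

```shell
#!/bin/sh
# Sketch of a custom SUSPEND action script (hypothetical, not part of LSF).

# Placeholder: set this to the SUSP_LOAD_REASON value from lsbatch.h.
SUSP_LOAD_REASON_MASK=${SUSP_LOAD_REASON_MASK:-0}

# Hypothetical log location; stdout and stderr of the action are discarded.
LOG=${SUSPEND_LOG:-/tmp/suspend_action.$$.log}

# Succeed if any bit of mask $2 is set in bitmap $1.
reason_has_bit() {
    [ $(( $1 & $2 )) -ne 0 ]
}

# Send SIGSTOP to each process group ID in the space-separated list $1.
stop_job_groups() {
    for pgid in $1; do
        kill -s STOP -- "-$pgid" 2>>"$LOG" ||
            echo "could not stop process group $pgid" >>"$LOG"
    done
}

# Act only when the job-control variables are present in the environment.
if [ -n "${LSB_JOBPGIDS:-}" ]; then
    if reason_has_bit "${LSB_SUSP_REASONS:-0}" "$SUSP_LOAD_REASON_MASK"; then
        echo "suspended by load, load index ${LSB_SUSP_SUBREASONS:-unknown}" >>"$LOG"
    else
        echo "suspended, reasons bitmap ${LSB_SUSP_REASONS:-0}" >>"$LOG"
    fi
    # The custom action still has to stop the job itself.
    stop_job_groups "$LSB_JOBPGIDS"
fi
```

The bitmap test is kept in its own function so that further reason bits from lsbatch.h can be checked the same way.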
• If an additional action is necessary for the SUSPEND command, that action should also
send the appropriate signal to the application. Otherwise, a job can continue to run even