Platform LSF Administration Guide Version 6.2
Chapter 34
Configuring Job Controls
JOB_CONTROLS=TERMINATE[brequeue]). This will cause a deadlock between
the signal and the action.
Using a command as a job control action
◆ The command line for the action is run with /bin/sh -c so you can use shell
features in the command.
◆ The command is run as the user of the job.
◆ All environment variables set for the job are also set for the command action.
The following additional environment variables are set:
  ❖ LSB_JOBPGIDS: a list of current process group IDs of the job
  ❖ LSB_JOBPIDS: a list of current process IDs of the job
◆ For the SUSPEND action command, the following environment variables are also
set:
  ❖ LSB_SUSP_REASONS: an integer representing a bitmap of suspending
  reasons as defined in lsbatch.h
  The command can use the suspending reason to take different actions
  depending on why the job was suspended.
  ❖ LSB_SUSP_SUBREASONS: an integer representing the load index that
  caused the job to be suspended
  When the suspending reason SUSP_LOAD_REASON (suspended by load) is
  set in LSB_SUSP_REASONS, LSB_SUSP_SUBREASONS is set to one of the
  load index values defined in lsf.h.
  Use LSB_SUSP_REASONS and LSB_SUSP_SUBREASONS together in your
  custom job control to determine the exact load threshold that caused a job to be
  suspended.
◆ The standard input, output, and error of the command are redirected to the null
device, so you cannot tell directly whether the command runs correctly. The default
null device on UNIX is /dev/null.
◆ Make sure the command line is correct. If you want to see the output
from the command line for testing purposes, redirect the output to a file inside the
command line.
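The points above can be sketched as a small SUSPEND action script. This is an illustration only, not a script shipped with LSF. It assumes that LSB_JOBPGIDS is a whitespace-separated list and that SUSP_LOAD_REASON has the bit value 64 (confirm the value in the lsbatch.h on your system). The function names and the JOB_CONTROL_LOG variable are hypothetical.

```shell
#!/bin/sh
# Sketch of a SUSPEND action script (hypothetical; not shipped with LSF).

# ASSUMPTION: bit value of SUSP_LOAD_REASON. Confirm it in your lsbatch.h.
SUSP_LOAD_REASON=${SUSP_LOAD_REASON:-64}

# The action's stdout and stderr go to the null device, so log to a file.
# JOB_CONTROL_LOG is a hypothetical override, not an LSF variable.
LOG=${JOB_CONTROL_LOG:-${TMPDIR:-/tmp}/lsf_suspend_action.log}

# Succeeds if the load-suspension bit is set in the reason bitmap $1.
suspended_by_load() {
    [ $(( ${1:-0} & $SUSP_LOAD_REASON )) -ne 0 ]
}

# Prints one line describing the suspension.
# $1 = LSB_SUSP_REASONS bitmap, $2 = LSB_SUSP_SUBREASONS load index value.
describe_suspension() {
    if suspended_by_load "$1"; then
        echo "suspended by load, load index value $2"
    else
        echo "suspended, reason bitmap $1"
    fi
}

# Sends SIGSTOP to every process group in the whitespace-separated list $1.
stop_process_groups() {
    for pgid in $1; do
        kill -s STOP -- "-$pgid" 2>>"$LOG" ||
            echo "could not stop process group $pgid" >>"$LOG"
    done
}

# Record why the job is being suspended, then stop its process groups.
describe_suspension "${LSB_SUSP_REASONS:-0}" "${LSB_SUSP_SUBREASONS:-0}" >>"$LOG"
stop_process_groups "${LSB_JOBPGIDS:-}"
```

Because the script writes its own log file, you can check what it did even though LSF discards the command's standard output and error. The script would be named inside SUSPEND[...] in the queue's JOB_CONTROLS setting, using whatever path you install it under.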
TERMINATE job actions
Use caution when configuring TERMINATE job actions that do more than just kill a
job. For example, resource usage limits that terminate jobs change the job state to
SSUSP while LSF waits for the job to end. If the job is not killed by the TERMINATE
action, it remains suspended indefinitely.
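One way to reduce this risk is to end every custom TERMINATE action with an unconditional kill. The following is a minimal sketch under stated assumptions, not a recommended standard action: it assumes LSB_JOBPIDS is a whitespace-separated list, and the function names and the TERM_GRACE variable are hypothetical.

```shell
#!/bin/sh
# Sketch of a TERMINATE action that always finishes by killing the job.
# Hypothetical example; function names and TERM_GRACE are not part of LSF.

# Succeeds if any PID in the whitespace-separated list $1 still exists.
any_alive() {
    for pid in $1; do
        kill -0 "$pid" 2>/dev/null && return 0
    done
    return 1
}

# Sends SIGTERM to each PID in $1, waits up to $2 seconds for the
# processes to exit, then sends SIGKILL to whatever is left.
terminate_job() {
    pids=$1
    grace=${2:-10}
    for pid in $pids; do
        kill -s TERM "$pid" 2>/dev/null || true
    done
    while [ "$grace" -gt 0 ] && any_alive "$pids"; do
        sleep 1
        grace=$((grace - 1))
    done
    # Final step: SIGKILL cannot be caught or ignored by the job.
    for pid in $pids; do
        kill -s KILL "$pid" 2>/dev/null || true
    done
    return 0
}

# Any site-specific cleanup would go here, before the job is killed.
terminate_job "${LSB_JOBPIDS:-}" "${TERM_GRACE:-10}"
```

The design choice is that the last command is always a SIGKILL, so a cleanup step that fails or a job that ignores SIGTERM does not leave the job behind.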
TERMINATE_WHEN parameter (lsb.queues)
In certain situations you may want to terminate the job instead of calling the default
SUSPEND action. For example, you may want to kill jobs if the run window of the
queue is closed. Use the TERMINATE_WHEN parameter to configure the queue to
invoke the TERMINATE action instead of SUSPEND.
See the Platform LSF Reference for information about the lsb.queues file and the
TERMINATE_WHEN parameter.
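As an illustration, a queue that terminates jobs when its run window closes might be configured along the following lines. The queue name and the window times are made up for this sketch; check the Platform LSF Reference for the exact syntax supported by your version.

```
# lsb.queues fragment (illustrative values only)
Begin Queue
QUEUE_NAME     = night
RUN_WINDOW     = 20:00-08:00
# Invoke the TERMINATE action instead of SUSPEND when the run window closes
TERMINATE_WHEN = WINDOW
End Queue
```

With no TERMINATE entry in JOB_CONTROLS, the queue uses the default TERMINATE action when the window closes.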