Platform LSF Reference Version 6.2
lsb.queues
◆ The standard input, output, and error of the command are redirected to the NULL
  device, so you cannot tell directly whether the command runs correctly. The default
  null device on UNIX is /dev/null.
◆ The command is run as the user of the job.
◆ All environment variables set for the job are also set for the command action. The
  following additional environment variables are set:
  ❖ LSB_JOBPGIDS: a list of current process group IDs of the job
  ❖ LSB_JOBPIDS: a list of current process IDs of the job
  For the SUSPEND action command, the following environment variables are also
  set:
  ❖ LSB_SUSP_REASONS: an integer representing a bitmap of suspending
    reasons as defined in lsbatch.h
    The suspending reason allows the command to take different actions based
    on the reason for suspending the job.
  ❖ LSB_SUSP_SUBREASONS: an integer representing the load index that
    caused the job to be suspended
    When the suspending reason SUSP_LOAD_REASON (suspended by load) is
    set in LSB_SUSP_REASONS, LSB_SUSP_SUBREASONS is set to one of the
    load index values defined in lsf.h.
    Use LSB_SUSP_REASONS and LSB_SUSP_SUBREASONS together in your
    custom job control to determine the exact load threshold that caused a job to be
    suspended.
◆ If an additional action is necessary for the SUSPEND command, that action should
  also send the appropriate signal to the application. Otherwise, a job can continue to
  run even after being suspended by LSF. For example:
  JOB_CONTROLS=SUSPEND[kill $LSB_JOBPIDS; command]
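A custom SUSPEND action built on these variables might look roughly like the sketch below. It is illustrative only and not part of LSF. The function names, the SUSPEND_LOG file, and the SUSP_LOAD_REASON_MASK variable are hypothetical, and the mask value is a placeholder: take the real bit value of SUSP_LOAD_REASON from lsbatch.h. Because the command's output goes to the null device, the sketch writes its diagnostics to a file.

```shell
#!/bin/sh
# Hypothetical SUSPEND action helper; names below are assumptions, not LSF interfaces.

# Placeholder: replace 0 with the bit value of SUSP_LOAD_REASON from lsbatch.h.
SUSP_LOAD_REASON_MASK=${SUSP_LOAD_REASON_MASK:-0}
# Hypothetical log file; stdout and stderr of the action are discarded by LSF.
SUSPEND_LOG=${SUSPEND_LOG:-/tmp/suspend_action.log}

# Succeeds when any bit of mask $2 is set in bitmap $1.
has_reason() {
    [ $(( $1 & $2 )) -ne 0 ]
}

suspend_job() {
    if has_reason "${LSB_SUSP_REASONS:-0}" "$SUSP_LOAD_REASON_MASK"; then
        echo "suspended by load, load index ${LSB_SUSP_SUBREASONS:-unknown}" >> "$SUSPEND_LOG"
    fi
    # Stop every process of the job so it does not keep running.
    for pid in $LSB_JOBPIDS; do
        kill -s STOP "$pid" 2>/dev/null || true
    done
}
```

A script built this way would call suspend_job as its last step and would be named in the SUSPEND[...] action of JOB_CONTROLS.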
Default
On UNIX, by default, SUSPEND sends SIGTSTP for parallel or interactive jobs and
SIGSTOP for other jobs. RESUME sends SIGCONT. TERMINATE sends SIGINT,
SIGTERM and SIGKILL in that order.
On Windows, actions equivalent to the UNIX signals have been implemented to perform
the default job control actions. Job control messages replace the SIGINT and SIGTERM
signals, but only customized applications are able to process them. Termination is
implemented by the TerminateProcess( ) system call.
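On UNIX, the default TERMINATE order described above (SIGINT, then SIGTERM, then SIGKILL) can be approximated in a custom action as sketched below. This is an illustration under assumptions, not LSF's implementation: the function name and the delay argument are hypothetical, and the sketch does not reproduce any interval LSF itself uses between signals.

```shell
#!/bin/sh
# Illustrative only: mimics the documented UNIX signal order for TERMINATE.
# The function name and the delay argument are assumptions, not LSF interfaces.

# usage: terminate_sequence DELAY_SECONDS PID...
terminate_sequence() {
    delay=$1
    shift
    for sig in INT TERM KILL; do
        for pid in "$@"; do
            # Skip processes that have already exited.
            if kill -0 "$pid" 2>/dev/null; then
                kill -s "$sig" "$pid" 2>/dev/null || true
            fi
        done
        # Give the processes time to react before escalating.
        sleep "$delay"
    done
}
```

Inside a job control action, this could be called as `terminate_sequence 10 $LSB_JOBPIDS`, where the 10-second delay is an arbitrary example value.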
JOB_IDLE
Syntax
JOB_IDLE=number
Description
Specifies a threshold for idle job exception handling. The value should be a number
between 0.0 and 1.0 representing the job idle factor (CPU time/runtime). If the job
idle factor is less than the specified threshold, LSF invokes LSF_SERVERDIR/eadmin
to trigger the action for a job idle exception.
The minimum job run time before mbatchd reports that the job is idle is defined as
DETECT_IDLE_JOB_AFTER in lsb.params.
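The comparison described above can be reproduced by hand to check whether a given threshold would flag a job. The sketch below is illustrative only and is not how LSF computes the value internally. The function names are hypothetical, and the minimum run time set by DETECT_IDLE_JOB_AFTER is not modeled.

```shell
#!/bin/sh
# Illustrative only: reproduces the documented comparison, not LSF's own code.

# usage: idle_factor CPU_SECONDS RUN_SECONDS   (RUN_SECONDS must be greater than 0)
# Prints CPU time divided by run time with two decimal places.
idle_factor() {
    awk -v cpu="$1" -v run="$2" 'BEGIN { printf "%.2f\n", cpu / run }'
}

# usage: is_idle CPU_SECONDS RUN_SECONDS THRESHOLD
# Succeeds when the idle factor is below the threshold.
is_idle() {
    awk -v cpu="$1" -v run="$2" -v thr="$3" \
        'BEGIN { exit !(run > 0 && (cpu / run) < thr) }'
}
```

For example, a job that used 6 seconds of CPU time over 600 seconds of run time has an idle factor of 0.01, so `is_idle 6 600 0.02` succeeds, while `is_idle 300 600 0.02` does not.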
Valid Values
Any positive number between 0.0 and 1.0