Platform LSF Administration Guide Version 6.2

Chapter 30
Load Thresholds
Theory
The r15s, r1m, and r15m CPU run queue length conditions are compared to the
effective queue length as reported by lsload -E, which is normalised for
multiprocessor hosts. Thresholds for these parameters should be set at appropriate
levels for single processor hosts.
Configure load thresholds consistently across queues. If a low priority queue has
higher suspension thresholds than a high priority queue, then jobs in the higher
priority queue will be suspended before jobs in the low priority queue.
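The consistency rule above can be checked mechanically. The following is a minimal sketch in Python, not an LSF tool. The queue names, priorities, and r1m suspension thresholds are hypothetical, and the sketch assumes that a larger priority number means a higher priority and that r1m is an index where a larger value means more load.

```python
from typing import NamedTuple


class Queue(NamedTuple):
    name: str        # hypothetical queue name
    priority: int    # assumption: larger number = higher priority
    r1m_stop: float  # hypothetical r1m suspension (loadStop) threshold


def inconsistent_pairs(queues):
    """Return (low, high) name pairs where the lower-priority queue has a
    higher suspension threshold than the higher-priority queue."""
    pairs = []
    for low in queues:
        for high in queues:
            if low.priority < high.priority and low.r1m_stop > high.r1m_stop:
                pairs.append((low.name, high.name))
    return pairs


# Hypothetical configuration: "night" tolerates more load than the
# two higher-priority queues, so its jobs would be suspended last.
queues = [
    Queue("night", 20, 3.0),
    Queue("normal", 30, 2.0),
    Queue("priority", 43, 2.5),
]
print(inconsistent_pairs(queues))
```

Each reported pair identifies a low priority queue whose jobs would keep running after jobs in the higher priority queue have already been suspended.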
Configuring load thresholds at host level
A shared resource cannot be used as a load threshold in the Hosts section of the
lsf.cluster.cluster_name file.
Configuring suspending conditions at queue level
The condition for suspending a job can be specified using the queue-level
STOP_COND parameter. It is defined by a resource requirement string. Only the
select section of the resource requirement string is considered when stopping a job.
All other sections are ignored.
This parameter provides functionality similar to loadStop, but is more flexible. If
loadStop thresholds have also been specified, a job is suspended if either the
STOP_COND is TRUE or the loadStop thresholds are exceeded.
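The way the two mechanisms combine can be sketched as a simple OR. This is an illustrative Python model, not LSF code. The function name, the dictionaries, and the set of indices treated as "lower value is worse" (idle time, free memory, and so on) are assumptions made for the example.

```python
# Assumption: for these indices a threshold is "exceeded" when the measured
# value drops below it; for all others, when the value rises above it.
DECREASING = ("it", "tmp", "swp", "mem")


def should_suspend(stop_cond_true, load, load_stop):
    """Suspend if STOP_COND is true OR any configured loadStop
    threshold is exceeded by the current load values."""
    for index, threshold in load_stop.items():
        value = load[index]
        if index in DECREASING:
            exceeded = value < threshold
        else:
            exceeded = value > threshold
        if exceeded:
            return True
    return bool(stop_cond_true)


# Hypothetical values: r1m is under its threshold, but STOP_COND is true.
print(should_suspend(True, {"r1m": 1.0}, {"r1m": 2.0}))
```

Either condition alone is enough to suspend the job, so adding a STOP_COND to a queue that already has loadStop thresholds can only make suspension more likely, never less.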
Example
This queue will suspend a job based on the idle time for desktop machines and based on
availability of swap and memory on compute servers. Assume cs is a Boolean resource
defined in the lsf.shared file and configured in the lsf.cluster.cluster_name
file to indicate that a host is a compute server:
Begin Queue
.
STOP_COND= select[((!cs && it < 5) || (cs && mem < 15 && swap < 50))]
.
End Queue
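To make the logic of this select string concrete, the same Boolean expression can be written as a small Python function and tried against sample hosts. This is only a sketch of the expression's truth table. The function name and the sample values are hypothetical, and LSF itself evaluates the string against live load indices.

```python
def stop_cond(cs, it, mem, swap):
    """Mirror of: (!cs && it < 5) || (cs && mem < 15 && swap < 50)."""
    desktop_busy = (not cs) and it < 5              # desktop with little idle time
    server_short = cs and mem < 15 and swap < 50    # server low on memory AND swap
    return bool(desktop_busy or server_short)


# Hypothetical hosts
print(stop_cond(cs=False, it=2, mem=100, swap=100))  # desktop, recently active
print(stop_cond(cs=True, it=0, mem=10, swap=60))     # server, low mem but enough swap
```

Note that on a compute server both the mem and the swap conditions must hold before the job is suspended, while on a desktop only the idle time is considered.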
Viewing host-level and queue-level suspending conditions
The suspending conditions are displayed by the bhosts -l and bqueues -l
commands.
Viewing job-level suspending conditions
The thresholds that apply to a particular job are the more restrictive of the host and
queue thresholds, and are displayed by the bjobs -l command.
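One way to picture "the more restrictive of the host and queue thresholds" is as a per-index merge. The Python sketch below is an interpretation, not LSF's implementation. It assumes that "more restrictive" means the threshold that is reached sooner, and that the indices listed in DECREASING are ones where a lower measured value is worse. All names and values are hypothetical.

```python
# Assumption: for these indices a lower measured value is worse, so the
# higher threshold is the more restrictive one.
DECREASING = ("it", "tmp", "swp", "mem")


def effective_thresholds(host, queue):
    """Merge host-level and queue-level thresholds, keeping for each
    load index whichever threshold would be reached first."""
    merged = {}
    for index in set(host) | set(queue):
        values = [t[index] for t in (host, queue) if index in t]
        merged[index] = max(values) if index in DECREASING else min(values)
    return merged


# Hypothetical host-level and queue-level thresholds
host = {"r1m": 3.0, "mem": 10}
queue = {"r1m": 2.0, "mem": 20, "pg": 50}
print(effective_thresholds(host, queue))
```

An index configured at only one level is carried over unchanged, since there is nothing to compare it against.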
Viewing suspend reason
The bjobs -lp command shows the load threshold that caused LSF to suspend a job,
together with the scheduling parameters.
The use of STOP_COND affects the suspending reasons as displayed by the bjobs
command. If STOP_COND is specified in the queue and the loadStop thresholds are
not specified, the suspending reasons for each individual load index will not be
displayed.