Platform LSF Administration Guide Version 6.2
Chapter 30
Load Thresholds
Administering Platform LSF
Theory
◆ The r15s, r1m, and r15m CPU run queue length conditions are compared to the
effective queue length as reported by lsload -E, which is normalised for
multiprocessor hosts. Thresholds for these parameters should be set at appropriate
levels for single processor hosts.
◆ Configure load thresholds consistently across queues. If a low priority queue has
higher suspension thresholds than a high priority queue, then jobs in the higher
priority queue will be suspended before jobs in the low priority queue.
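The effect of inconsistent thresholds can be illustrated with a minimal sketch. The queue names, priorities, and r1m suspension thresholds below are hypothetical values chosen for illustration, and the comparison is a simplification, not LSF's actual scheduling logic.

```python
# Hypothetical queues: (name, priority, r1m suspension threshold).
# The values are illustrative and deliberately inconsistent: the low
# priority queue tolerates more load than the high priority queue.
QUEUES = [
    ("high", 60, 2.0),
    ("low", 20, 4.0),
]


def suspended_queues(r1m, queues=QUEUES):
    """Return the names of queues whose r1m suspension threshold is exceeded."""
    return [name for name, _priority, stop in queues if r1m > stop]


# As the run queue length rises, the high priority queue is affected first.
for load in (1.0, 3.0, 5.0):
    print(load, suspended_queues(load))
```

At a load of 3.0 only the high priority queue's jobs would be candidates for suspension, which is the opposite of what the priorities suggest.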
Configuring load thresholds at host level
A shared resource cannot be used as a load threshold in the Hosts section of the
lsf.cluster.cluster_name file.
Configuring suspending conditions at queue level
The condition for suspending a job can be specified using the queue-level
STOP_COND parameter. It is defined by a resource requirement string. Only the
select section of the resource requirement string is considered when stopping a job.
All other sections are ignored.
This parameter provides functionality similar to loadStop, but is more flexible. If
loadStop thresholds have been specified, then a job will be suspended if either the
STOP_COND is TRUE or the loadStop thresholds are exceeded.
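The way the two mechanisms combine can be sketched as a logical OR. The function and argument names below are hypothetical, and the sketch assumes that a single exceeded threshold is enough and that only load indices where a higher value means more load (for example r1m, ut, pg) are involved.

```python
def should_suspend(stop_cond_true, load, load_stop):
    """Return True if STOP_COND holds or a loadStop threshold is exceeded.

    load and load_stop map load index names to values. Only indices where a
    higher reading means a more loaded host are modelled (an assumption).
    """
    exceeded = any(
        load[name] > limit
        for name, limit in load_stop.items()
        if name in load
    )
    return stop_cond_true or exceeded


# STOP_COND is false, but the r1m threshold is exceeded.
print(should_suspend(False, {"r1m": 3.5, "pg": 10}, {"r1m": 2.0, "pg": 50}))
# STOP_COND is true and no loadStop thresholds are configured.
print(should_suspend(True, {"r1m": 0.5}, {}))
```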
Example
This queue will suspend a job based on the idle time for desktop machines, and based
on the availability of swap and memory on compute servers. Assume cs is a Boolean
resource defined in the lsf.shared file and configured in the
lsf.cluster.cluster_name file to indicate that a host is a compute server:
Begin Queue
.
STOP_COND= select[((!cs && it < 5) || (cs && mem < 15 && swap < 50))]
.
End Queue
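To make the logic of this condition easier to follow, the sketch below mirrors the select expression as a plain Python function. The host dictionaries and their values are made up for illustration; LSF evaluates the expression itself against the load information it collects.

```python
def stop_cond(host):
    """Mirror of select[((!cs && it < 5) || (cs && mem < 15 && swap < 50))]."""
    cs = host["cs"]
    desktop_in_use = (not cs) and host["it"] < 5
    server_short = cs and host["mem"] < 15 and host["swap"] < 50
    return bool(desktop_in_use or server_short)


# Hypothetical hosts (values are illustrative only).
desktop_in_use = {"cs": False, "it": 2, "mem": 200, "swap": 400}
desktop_idle = {"cs": False, "it": 30, "mem": 200, "swap": 400}
server_low = {"cs": True, "it": 0, "mem": 10, "swap": 40}
server_ok = {"cs": True, "it": 0, "mem": 10, "swap": 400}

for h in (desktop_in_use, desktop_idle, server_low, server_ok):
    print(stop_cond(h))
```

Note that a compute server is suspended only when both memory and swap are low, and that idle time is ignored on compute servers.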
Viewing host-level and queue-level suspending conditions
The suspending conditions are displayed by the bhosts -l and bqueues -l
commands.
Viewing job-level suspending conditions
The thresholds that apply to a particular job are the more restrictive of the host and
queue thresholds, and are displayed by the bjobs -l command.
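One way to read "more restrictive" is the threshold that would trigger suspension sooner. The sketch below works under that assumption; the function name, the DECREASING set, and the sample values are hypothetical and are not taken from LSF.

```python
# Indices assumed to decrease as a host becomes more loaded.
DECREASING = {"mem", "swp", "tmp"}


def effective_stop_threshold(index, host_value, queue_value):
    """Pick the suspension threshold that would trigger sooner.

    None means the threshold is not configured at that level.
    """
    values = [v for v in (host_value, queue_value) if v is not None]
    if not values:
        return None
    # For decreasing indices the larger threshold is reached first;
    # for increasing indices (e.g. r1m, ut, pg) the smaller one is.
    return max(values) if index in DECREASING else min(values)


print(effective_stop_threshold("r1m", 3.0, 2.0))
print(effective_stop_threshold("mem", 10, 20))
print(effective_stop_threshold("pg", None, 50))
```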
Viewing suspend reason
The bjobs -lp command shows the load threshold that caused LSF to suspend a job,
together with the scheduling parameters.
The use of STOP_COND affects the suspending reasons as displayed by the bjobs
command. If STOP_COND is specified in the queue and the loadStop thresholds are
not specified, the suspending reasons for each individual load index will not be
displayed.