Platform LSF Administration Guide Version 6.2
Chapter 16
Fairshare Scheduling
Administering Platform LSF
Queue-based Fairshare
When a priority is set in a queue configuration, a high priority queue tries to dispatch as
many jobs as it can before allowing lower priority queues to dispatch any job. Lower
priority queues are blocked until the higher priority queue cannot dispatch any more
jobs. However, it may be desirable to give some preference to lower priority queues and
regulate the flow of jobs from the queue.
Queue-based fairshare allows flexible slot allocation per queue as an alternative to
absolute queue priorities by enforcing a soft job slot limit on a queue. This allows you
to organize the priorities of your work and tune the number of jobs dispatched from a
queue so that no single queue monopolizes cluster resources, leaving other queues
waiting to dispatch jobs.
You can balance the distribution of job slots among queues by configuring a ratio of jobs
waiting to be dispatched from each queue. LSF then attempts to dispatch a certain
percentage of jobs from each queue, and does not attempt to drain the highest priority
queue entirely first.
When queues compete, the allocated slots per queue are kept within the limits of the
configured share. If only one queue in the pool has jobs, that queue can use all the
available resources and can span its usage across all hosts it could potentially run jobs on.
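As a rough illustration of the share ratio described above, the following Python sketch splits a pool's slots among the queues that currently have jobs, in proportion to their configured shares. The function name, the queue names, and the share values are hypothetical, and the sketch ignores host limits, job limits, and LSF's actual scheduling order; it only shows the arithmetic.

```python
def entitled_slots(shares, total_slots, active=None):
    """Split total_slots among active queues in proportion to their shares.

    shares: dict of queue name -> configured share (hypothetical values).
    active: queues that currently have jobs; defaults to all queues.
    """
    active = set(shares) if active is None else set(active)
    total_share = sum(s for q, s in shares.items() if q in active)
    return {
        q: (total_slots * s / total_share if q in active else 0.0)
        for q, s in shares.items()
    }


# Hypothetical pool of three queues sharing 100 slots.
shares = {"queue1": 50, "queue2": 30, "queue3": 20}

# All queues compete: each is held to its share of the pool.
print(entitled_slots(shares, 100))

# Only queue3 has jobs: it can use every slot in the pool.
print(entitled_slots(shares, 100, active=["queue3"]))
```

When every queue competes, the split follows the configured ratio; when only one queue has jobs, its share is the whole pool.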
Managing pools of queues
You can configure your queues into a pool, which is a named group of queues using the
same set of hosts. A pool is entitled to a slice of the available job slots. You can configure
as many pools as you need, but all queues within a pool must use the same set of hosts. There can be
queues in the cluster that do not belong to any pool yet share some hosts used by a pool.
How LSF allocates slots for a pool of queues
During job scheduling, LSF orders the queues within each pool based on the shares the
queues are entitled to. The number of running jobs (or job slots in use) is maintained at
the percentage level specified for the queue. When a queue has no pending jobs, leftover
slots are redistributed to other queues in the pool with jobs pending.
The total number of slots in each pool is constant; it is equal to the number of slots in
use plus the number of free slots, up to the maximum job slot limit configured either in
lsb.hosts (MXJ) or in lsb.resources. The accumulated number of slots in use by each
queue is used to order the queues for dispatch.
Job limits and host limits are enforced by the scheduler. For example, if LSF determines
that a queue is eligible to run 50 jobs, but the queue has a job limit of 40 jobs, no more
than 40 jobs will run. The remaining 10 job slots are redistributed among other queues
belonging to the same pool, or made available to other queues that are configured
to use them.
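The job limit example above can be worked through in a short Python sketch. The function and queue names are hypothetical, and the sketch only computes how many slots a limit frees up; how LSF divides the freed slots among the remaining queues is not specified here and is left out.

```python
def apply_job_limits(eligible, job_limits):
    """Cap each queue's eligible slots at its job limit.

    Returns (capped, leftover), where leftover is the number of slots
    freed by the limits and available for redistribution.
    """
    capped = {}
    leftover = 0
    for queue, slots in eligible.items():
        limit = job_limits.get(queue)  # None means no job limit is set
        allowed = slots if limit is None else min(slots, limit)
        capped[queue] = allowed
        leftover += slots - allowed
    return capped, leftover


# queue1 is eligible for 50 slots but has a job limit of 40.
eligible = {"queue1": 50, "queue2": 30, "queue3": 20}
job_limits = {"queue1": 40}

capped, leftover = apply_job_limits(eligible, job_limits)
print(capped)    # {'queue1': 40, 'queue2': 30, 'queue3': 20}
print(leftover)  # 10
```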
Accumulated slots in use
As queues run the jobs allocated to them, LSF accumulates the slots each queue has used
and decays this value over time, so that each queue is not allocated more slots than it
deserves, and other queues in the pool have a chance to run their share of jobs.
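The exact decay function is not given here, so the following Python sketch assumes a simple exponential decay applied once per scheduling interval, and assumes that queues are ordered by accumulated usage relative to share. The class name, the decay factor of 0.9, and the ordering rule are all illustrative assumptions, not LSF's documented behavior.

```python
class QueueUsage:
    """Tracks a decayed running total of slots used by one queue."""

    def __init__(self, share, decay=0.9):
        self.share = share        # configured share (hypothetical value)
        self.decay = decay        # assumed per-interval decay factor
        self.accumulated = 0.0    # decayed total of slots used so far

    def tick(self, slots_in_use):
        # Age the old total, then add the slots used in this interval.
        self.accumulated = self.accumulated * self.decay + slots_in_use


def dispatch_order(queues):
    """Order queue names so the least-served queue (per share) comes first."""
    return sorted(queues, key=lambda name: queues[name].accumulated / queues[name].share)


queues = {"queue1": QueueUsage(share=50), "queue2": QueueUsage(share=50)}

# queue1 runs 10 slots for three intervals while queue2 is idle.
for _ in range(3):
    queues["queue1"].tick(10)
    queues["queue2"].tick(0)

print(dispatch_order(queues))  # ['queue2', 'queue1']

# After queue1 sits idle for a while, its accumulated usage decays away.
for _ in range(50):
    queues["queue1"].tick(0)
print(queues["queue1"].accumulated < 1)  # True
```

Under these assumptions, a queue that has recently used many slots drops back in the order, and its history fades once it stops running jobs, which gives the other queues in the pool a chance to run their share.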