Platform LSF Administration Guide Version 6.2

Chapter 16

Fairshare Scheduling

Administering Platform LSF

Queue-based Fairshare

When priorities are set in queue configurations, a high priority queue tries to dispatch as many jobs as it can before allowing lower priority queues to dispatch any job. Lower priority queues are blocked until the higher priority queue cannot dispatch any more jobs. However, it may be desirable to give some preference to lower priority queues and regulate the flow of jobs from the higher priority queue.

Queue-based fairshare allows flexible slot allocation per queue as an alternative to absolute queue priorities by enforcing a soft job slot limit on a queue. This allows you to organize the priorities of your work and tune the number of jobs dispatched from a queue so that no single queue monopolizes cluster resources, leaving other queues waiting to dispatch jobs.

You can balance the distribution of job slots among queues by configuring a ratio of jobs waiting to be dispatched from each queue. LSF then attempts to dispatch a certain percentage of jobs from each queue, and does not attempt to drain the highest priority queue entirely first.

When queues compete, the allocated slots per queue are kept within the limits of the configured share. If only one queue in the pool has jobs, that queue can use all the available resources and can span its usage across all hosts it could potentially run jobs on.
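The share-based split described above can be illustrated with a small sketch. This is not LSF code: the function name split_slots, the queue names, and the share values are hypothetical, and the sketch assumes that only queues with pending jobs compete and that slots are divided in proportion to the configured shares, rounded down to whole slots.

```python
def split_slots(total_slots, shares, pending):
    """Split a pool's slots among queues that have jobs, in proportion to share.

    total_slots: number of job slots in the pool (hypothetical input)
    shares:      queue name -> configured share (any positive weight)
    pending:     queue name -> number of jobs waiting to be dispatched
    """
    # Only queues that actually have pending jobs compete for slots.
    active = {q: s for q, s in shares.items() if pending.get(q, 0) > 0}
    alloc = {q: 0 for q in shares}
    if not active:
        return alloc
    share_sum = sum(active.values())
    for queue, share in active.items():
        # Integer division keeps the result in whole slots; a few slots
        # may be left over because of rounding.
        alloc[queue] = total_slots * share // share_sum
    return alloc


shares = {"high": 50, "normal": 30, "low": 20}

# All three queues compete: each stays within its configured share.
print(split_slots(100, shares, {"high": 5, "normal": 5, "low": 5}))
# {'high': 50, 'normal': 30, 'low': 20}

# Only one queue has jobs: it can use every slot in the pool.
print(split_slots(100, shares, {"high": 0, "normal": 0, "low": 5}))
# {'high': 0, 'normal': 0, 'low': 100}
```

The sketch ignores how many jobs each queue actually has waiting; a real scheduler would not hand a queue more slots than it can fill.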

Managing pools of queues

You can configure your queues into a pool, which is a named group of queues using the same set of hosts. A pool is entitled to a slice of the available job slots. You can configure as many pools as you need, but all queues in a pool must use the same set of hosts. There can be queues in the cluster that do not belong to any pool yet share some hosts used by a pool.

How LSF allocates slots for a pool of queues

During job scheduling, LSF orders the queues within each pool based on the shares the queues are entitled to. The number of running jobs (or job slots in use) is maintained at the percentage level specified for the queue. When a queue has no pending jobs, leftover slots are redistributed to other queues in the pool with jobs pending.

The total number of slots in each pool is constant; it is equal to the number of slots in use plus the number of free slots, up to the maximum job slot limit configured either in lsb.hosts (MXJ) or in lsb.resources. The accumulated number of slots in use by each queue is used in ordering the queues for dispatch.
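One way to picture how accumulated usage could drive dispatch order is sketched below. The exact ordering rule LSF uses is not given here, so the ratio of accumulated slots to share is an assumption made purely for illustration, as are the function name dispatch_order and the sample figures.

```python
def dispatch_order(shares, accumulated):
    """Return queue names, most under-served first.

    shares:      queue name -> configured share (positive weight)
    accumulated: queue name -> accumulated slots in use
    """
    # Assumption: a queue whose accumulated usage is small relative to its
    # share is considered first. This is an illustrative rule, not LSF's.
    return sorted(shares, key=lambda q: accumulated.get(q, 0.0) / shares[q])


shares = {"high": 50, "normal": 30, "low": 20}
accumulated = {"high": 40, "normal": 10, "low": 18}

# Usage-to-share ratios: high 0.80, normal 0.33, low 0.90
print(dispatch_order(shares, accumulated))
# ['normal', 'high', 'low']
```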

Job limits and host limits are enforced by the scheduler. For example, if LSF determines that a queue is eligible to run 50 jobs, but the queue has a job limit of 40 jobs, no more than 40 jobs will run. The remaining 10 job slots are redistributed among other queues belonging to the same pool, or made available to other queues that are configured to use them.
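The 50-versus-40 example above amounts to a simple capping step, sketched here. The function name apply_job_limits and the queue names are hypothetical; the sketch only computes the capped allocation and the number of slots freed, and leaves open how the freed slots are handed to other queues.

```python
def apply_job_limits(entitled, job_limits):
    """Cap each queue at its job limit and count the slots freed by the caps.

    entitled:   queue name -> slots the queue is eligible to use
    job_limits: queue name -> job limit (queues without an entry are unlimited)
    """
    capped = {}
    surplus = 0
    for queue, slots in entitled.items():
        limit = job_limits.get(queue)  # None means no limit is configured
        if limit is not None and slots > limit:
            capped[queue] = limit
            surplus += slots - limit  # slots freed for other queues
        else:
            capped[queue] = slots
    return capped, surplus


entitled = {"high": 50, "normal": 30, "low": 20}
capped, surplus = apply_job_limits(entitled, {"high": 40})
print(capped)   # {'high': 40, 'normal': 30, 'low': 20}
print(surplus)  # 10
```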

Accumulated slots in use

As queues run the jobs allocated to them, LSF accumulates the slots each queue has used and decays this value over time, so that each queue is not allocated more slots than it deserves, and other queues in the pool have a chance to run their share of jobs.
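The accumulate-and-decay behavior can be sketched as a periodic update. The decay formula and rate LSF applies are not stated here, so the multiplicative decay_factor and the function name decay_usage below are assumptions chosen only to show the idea: old usage fades, so a queue that was busy earlier gradually stops being penalized for it.

```python
def decay_usage(accumulated, slots_in_use, decay_factor=0.5):
    """One update step: decay the old totals, then add slots currently in use.

    accumulated:  queue name -> accumulated slots from earlier steps
    slots_in_use: queue name -> slots the queue is using right now
    decay_factor: hypothetical fraction of old usage kept at each step
    """
    queues = set(accumulated) | set(slots_in_use)
    return {
        q: accumulated.get(q, 0.0) * decay_factor + slots_in_use.get(q, 0)
        for q in queues
    }


usage = {"high": 100.0, "low": 0.0}

# "high" goes idle while "low" starts using 10 slots per step.
usage = decay_usage(usage, {"high": 0, "low": 10})
print(usage["high"], usage["low"])  # 50.0 10.0
usage = decay_usage(usage, {"high": 0, "low": 10})
print(usage["high"], usage["low"])  # 25.0 15.0
```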