Platform LSF Administration Guide Version 6.2

Allowing Jobs to Use Reserved Job Slots
Administering Platform LSF
Using interruptible backfill
Interruptible backfill scheduling can improve cluster utilization by allowing reserved
job slots to be used by small, low-priority jobs that are terminated when the
higher-priority large jobs are about to start.
An interruptible backfill job:
- Starts as a regular job and is killed when it exceeds the queue runtime limit, OR
- Is started for backfill whenever there is a backfill time slice longer than the specified minimal time, and is killed before the slot-reserving job is about to start
- Applies to compute-intensive serial or single-node parallel jobs that can run for a long time, yet are able to checkpoint or resume from an arbitrary computation point
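The last requirement is the one that falls on the job itself: it must tolerate being killed at an arbitrary point and continue from saved state when it is requeued. The following is a minimal Python sketch of such a job, not part of LSF. The function name, the JSON checkpoint file, and the max_steps_this_run parameter (which stands in for the scheduler killing the job at its run limit) are all assumptions made for illustration.

```python
import json
import os


def run_with_checkpoints(total_steps, checkpoint_path, max_steps_this_run):
    """Advance a toy computation, checkpointing after every step.

    max_steps_this_run is a hypothetical stand-in for the job being killed
    when its run limit expires. Returns the state reached in this run.
    """
    # Resume from the last checkpoint if a previous run left one behind.
    state = {"step": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    steps_this_run = 0
    while state["step"] < total_steps and steps_this_run < max_steps_this_run:
        state["total"] += state["step"]  # the "work": a running sum of step indices
        state["step"] += 1
        steps_this_run += 1

        # Write to a temporary file and rename it, so that a kill in the
        # middle of a write cannot leave a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state
```

Calling the function once with a small max_steps_this_run and then again with the same checkpoint path mimics a job that is killed and requeued: the second call picks up at the step where the first one stopped instead of starting over.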
Resource allocation diagram
Job life cycle
1. Jobs are submitted to a queue configured for interruptible backfill. The job runtime requirement is ignored.
2. The job is scheduled as either a regular job or a backfill job.
3. The queue runtime limit is applied to the regularly scheduled job.
4. In the backfill phase, the job is considered for running on any reserved resource whose duration is longer than the minimal time slice configured for the queue. The job runtime limit is set so that the job releases the resource before it is needed by the slot-reserving job.
5. The job runs in a regular manner. It is killed upon reaching its runtime limit and requeued for the next run. Requeueing must be explicitly configured in the queue.
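The run-limit rule in the life cycle above can be sketched in a few lines of Python. This is an illustration of the described behavior under simplifying assumptions, not LSF's scheduler code: the function names and arguments are hypothetical, times are plain seconds, and the backfill run limit is assumed to be exactly the time remaining until the slot-reserving job's calculated start.

```python
from typing import Optional


def backfill_run_limit(now: float, reserved_start: float,
                       min_time_slice: float) -> Optional[float]:
    """Run limit (seconds) for a backfill start, or None if the job should not start.

    reserved_start is the calculated start time of the slot-reserving job.
    The slice must be strictly longer than the queue's minimal time slice.
    """
    slice_length = reserved_start - now
    if slice_length <= min_time_slice:
        return None
    # Assumption: the job may use the whole slice and is killed when it ends,
    # so the resource is free by the time the slot-reserving job needs it.
    return slice_length


def run_limit_for_start(now: float, queue_run_limit: float,
                        min_time_slice: float,
                        reserved_start: Optional[float] = None) -> Optional[float]:
    """Pick the run limit for one start of an interruptible backfill job."""
    if reserved_start is None:
        # Regular scheduling: the queue runtime limit applies.
        return queue_run_limit
    # Backfill phase: the limit is derived from the reservation.
    return backfill_run_limit(now, reserved_start, min_time_slice)
```

For example, with a minimal time slice of 300 seconds, a slot reserved to start 600 seconds from now would admit the job with a 600-second run limit, while a slot reserved to start in 200 seconds would not be used.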
Assumptions and limitations
An interruptible backfill job holds the start of the slot-reserving job until its calculated
start time, in the same way as a regular backfill job. An interruptible backfill job is
not preempted in any way other than being killed when its run limit expires.
[Figure: resource allocation diagram. Axes: machine capacity (slots) against elapsed time (sec), with the time To marked on the time axis. Labeled regions:
  Interruptible backfill job started in regular scheduling: terminated on queue RUNLIMIT expiry.
  Start-time deciding regular backfill job: finishes at To, as promised by its job specification.
  Slot-reserving job: calculated to start at To.
  Interruptible backfill job: terminated so that the slot-reserving job can start at To.
  Regular backfill job: finishes before To.]