LSF Version 7.3 - Administering Platform LSF
Backfill Scheduling: Allowing Jobs to Use Reserved Job Slots
Using interruptible backfill
Interruptible backfill scheduling can improve cluster utilization by allowing
reserved job slots to be used by small, low-priority jobs, which are terminated
when the large, higher-priority jobs are about to start.
An interruptible backfill job:
◆ Starts as a regular job and is killed when it exceeds the queue runtime limit, or
◆ Is started for backfill whenever there is a backfill time slice longer than the
specified minimum time, and is killed before the slot-reserving job starts.
This applies to compute-intensive serial or single-node parallel jobs that can
run for a long time, yet are able to checkpoint or resume from an arbitrary
computation point.
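The requirement that a job be able to checkpoint and resume can be illustrated with a minimal Python sketch. It is not part of LSF: the function names, the JSON checkpoint file, and the `kill_after` parameter (which only simulates the job being killed) are all assumptions made for illustration.

```python
import json
import os


def save_checkpoint(state, path):
    # Write to a temporary file and rename it, so that a kill during the
    # write never leaves a truncated checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    # Resume from the last saved state, or start from scratch.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def run(total_steps, path, checkpoint_every=100, kill_after=None):
    """Sum 0..total_steps-1, checkpointing periodically.

    kill_after (hypothetical) simulates the job being killed after that
    many steps in this run; work since the last checkpoint is lost.
    """
    state = load_checkpoint(path)
    executed = 0
    while state["next_step"] < total_steps:
        if kill_after is not None and executed >= kill_after:
            return None  # simulated kill, no final checkpoint written
        state["total"] += state["next_step"]
        state["next_step"] += 1
        executed += 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state["total"]
```

A job written this way loses at most `checkpoint_every` steps of work each time it is killed, so each requeued run continues from the last checkpoint rather than restarting from the beginning.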
Resource allocation diagram
Job life cycle
1 Jobs are submitted to a queue configured for interruptible backfill. The job
runtime requirement is ignored.
2 The job is scheduled as either a regular job or a backfill job.
3 The queue runtime limit is applied to the regularly scheduled job.
4 In the backfill phase, the job is considered for running on any reserved
resource whose duration is longer than the minimum time slice configured for
the queue. The job runtime limit is set so that the job releases the resource
before it is needed by the slot-reserving job.
5 The job runs in a regular manner. It is killed upon reaching its runtime limit
and requeued for the next run. Requeueing must be explicitly configured in the
queue.
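The way the runtime limit is chosen in the life cycle above can be sketched as follows. This is a simplified model in Python, not LSF's scheduler code: the function names and parameters are hypothetical, and the assumption that the backfill limit equals the full length of the time slice is an illustration only (a real scheduler may leave a margin).

```python
from typing import Optional


def backfill_run_limit(slice_seconds: int, min_slice_seconds: int) -> Optional[int]:
    """Return a run limit for a backfill job, or None if the slice is too short.

    slice_seconds: how long the reserved resource is free before the
        slot-reserving job needs it.
    min_slice_seconds: minimum time slice configured for the queue.
    """
    # The slice must be strictly longer than the configured minimum.
    if slice_seconds <= min_slice_seconds:
        return None
    # Assumption: the job may use the whole slice, so it is killed no
    # later than the moment the slot-reserving job needs the resource.
    return slice_seconds


def run_limit_for(scheduled_as: str, queue_run_limit: int,
                  slice_seconds: int, min_slice_seconds: int) -> Optional[int]:
    """Pick the run limit depending on how the job was scheduled."""
    if scheduled_as == "regular":
        # Regularly scheduled jobs get the queue runtime limit.
        return queue_run_limit
    if scheduled_as == "backfill":
        return backfill_run_limit(slice_seconds, min_slice_seconds)
    raise ValueError(f"unknown scheduling mode: {scheduled_as!r}")
```

For example, with a 20-minute minimum time slice, a 10-minute gap is skipped, while a 30-minute gap yields a 30-minute run limit for the backfill job.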