LSF Version 7.3 - Administering Platform LSF
Examples
lsb.queues
The following queue is defined in lsb.queues:
Begin Queue
QUEUE_NAME = reservation
DESCRIPTION = For resource reservation
PRIORITY = 40
RESOURCE_RESERVE = MAX_RESERVE_TIME[20]
End Queue
Assumptions
Assume one host in the cluster with 10 CPUs and 1 GB of free memory currently
available.
Sequential jobs
Each of the following sequential jobs requires 400 MB of memory and runs for 300
minutes.
Job 1:
bsub -W 300 -R "rusage[mem=400]" -q reservation myjob1
The job starts running, using 400 MB of memory and one job slot.
Job 2:
Submitting a second job with the same requirements yields the same result.
Job 3:
Submitting a third job with the same requirements reserves one job slot and
reserves all free memory, if the amount of free memory is between 20 MB and
200 MB (some free memory may be used by the operating system or other software).
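The memory arithmetic behind these three jobs can be sketched in a few lines of Python. This is only an illustration of the example above, not LSF's scheduling logic: the function name simulate is hypothetical, 1 GB is treated as 1000 MB, and memory used by the operating system or other software is ignored.

```python
def simulate(free_mem_mb, jobs):
    """Return (job_name, action, mem_mb) tuples in submission order.

    Simplified model (an assumption, not LSF's algorithm): a job runs if
    its memory request fits in free memory; otherwise it reserves all
    memory that is still free.
    """
    outcome = []
    for name, mem_mb in jobs:
        if mem_mb <= free_mem_mb:
            # Enough memory: the job starts and consumes its request.
            free_mem_mb -= mem_mb
            outcome.append((name, "run", mem_mb))
        else:
            # Not enough memory: the job reserves whatever is left.
            outcome.append((name, "reserve", free_mem_mb))
            free_mem_mb = 0
    return outcome


# Three sequential jobs, each requesting 400 MB, on a host with
# 1 GB (taken here as 1000 MB) of free memory.
jobs = [("myjob1", 400), ("myjob2", 400), ("myjob3", 400)]
for name, action, mem_mb in simulate(1000, jobs):
    print(name, action, mem_mb)
```

Under these assumptions the first two jobs run with 400 MB each and the third reserves the remaining 200 MB, which is the upper end of the range quoted above.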
Time-based Slot Reservation
Existing LSF slot reservation works in simple environments, where the host-based
MXJ limit is the only constraint on job slot requests. In complex environments,
where more than one constraint exists, for example job topology or a generic slot
limit:
◆ Estimated job start time becomes inaccurate
◆ The scheduler makes a reservation decision that can postpone estimated job
start time or decrease cluster utilization.
Current slot reservation by start time (RESERVE_BY_STARTTIME) resolves
several reservation issues with multiple candidate host groups, but it does not
help in other cases:
◆ Special topology requests, like span[ptile=n]
◆ It only calculates and displays a reservation if the host has free slots.
Reservations may change or disappear if there are no free CPUs; for example, if
a backfill job takes all reserved CPUs.
◆ For HPC machines containing many internal nodes, the host-level number of
reserved slots is not enough for the administrator and end user to tell which
CPUs the job is reserving and waiting for.
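To see why a topology request such as span[ptile=n] is harder than a plain slot count, consider the following sketch. It is an illustration only, not how LSF evaluates span requirements: the host names and both function names are hypothetical, and it assumes the requested slot count is an exact multiple of ptile.

```python
def fits_by_slot_count(free_slots, needed):
    """Naive check: is the total number of free slots large enough?"""
    return sum(free_slots.values()) >= needed


def fits_with_ptile(free_slots, needed, ptile):
    """Topology-aware check for a span[ptile=n]-style request.

    Assumes `needed` is an exact multiple of `ptile`, so the job needs
    needed // ptile hosts that each have at least `ptile` free slots.
    """
    hosts_needed = needed // ptile
    hosts_available = sum(1 for free in free_slots.values() if free >= ptile)
    return hosts_available >= hosts_needed


# Hypothetical cluster: 8 free slots in total, but only 2 per host.
free_slots = {"hostA": 2, "hostB": 2, "hostC": 2, "hostD": 2}

print(fits_by_slot_count(free_slots, 8))   # True: 8 slots are free overall
print(fits_with_ptile(free_slots, 8, 4))   # False: no host has 4 free slots
```

A reservation based only on the total number of free slots would treat this job as ready to start, while the per-host requirement shows that it is not.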