Platform LSF Administration Guide Version 6.2
Administering Platform LSF
About Resource Reservation
When a job is dispatched, the system assumes that the resources that the job consumes
will be reflected in the load information. However, many jobs do not consume the
resources they require when they first start. Instead, they will typically use the resources
over a period of time.
For example, a job requiring 100 MB of swap is dispatched to a host having 150 MB of
available swap. The job initially allocates 5 MB and gradually increases the amount
consumed to 100 MB over a period of 30 minutes. During this period, another job
requiring more than 50 MB of swap should not be started on the same host, to avoid
overcommitting the resource.
LSF can reserve resources to prevent overcommitment. Resource reservation
requirements can be specified as part of the resource requirements when submitting a
job, or can be configured in the queue-level resource requirements.
How resource reservation works
When deciding whether to schedule a job on a host, LSF considers the reserved
resources of jobs that have previously started on that host. For each load index, the
amounts reserved by all jobs on that host are summed and subtracted from (or, if the
index is increasing, added to) the current value of the resource as reported by the LIM
to get the amount available for scheduling new jobs:
available amount = current value - reserved amount for all jobs
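The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not LSF code: the function name available_amount and its parameters are hypothetical, and the numbers reuse the swap example from earlier in this section (150 MB reported by the LIM, 100 MB reserved by one running job).

```python
def available_amount(current_value, reserved_amounts, increasing=False):
    """Amount of a load index still available for scheduling new jobs.

    current_value    -- value of the load index as reported by the LIM
    reserved_amounts -- amounts reserved by jobs already started on the host
    increasing       -- True for an index where a larger value means more load;
                        the reserved total is then added instead of subtracted
    """
    total_reserved = sum(reserved_amounts)
    if increasing:
        return current_value + total_reserved
    return current_value - total_reserved


# Swap example: 150 MB available, one job has reserved 100 MB.
print(available_amount(150, [100]))  # prints 50
```

With 50 MB left for scheduling, a second job asking for more than 50 MB of swap would not be placed on this host until the reservation shrinks or is released.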
For example:
% bsub -R "rusage[tmp=30:duration=30:decay=1]" myjob
This command reserves 30 MB of temporary (tmp) space for the job. As the job runs, the
amount reserved decreases at approximately 1 MB per minute, so that the reserved amount
is 0 after 30 minutes.
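The decay in this example can be modeled as a straight line from the reserved amount down to 0 over the duration. The sketch below assumes that idealized model; the function name reserved_after is hypothetical, and the amount LSF actually holds at a given moment may differ slightly because it is only approximately linear.

```python
def reserved_after(amount, duration_min, elapsed_min):
    """Reserved amount after elapsed_min minutes, assuming linear decay.

    amount       -- amount reserved when the job starts (for example, MB)
    duration_min -- reservation duration in minutes
    elapsed_min  -- minutes since the job started
    """
    if elapsed_min <= 0:
        return float(amount)
    if elapsed_min >= duration_min:
        return 0.0
    # Straight line from `amount` at t = 0 down to 0 at t = duration_min.
    return amount * (duration_min - elapsed_min) / duration_min


# rusage[tmp=30:duration=30:decay=1]
for minute in (0, 10, 30):
    print(minute, reserved_after(30, 30, minute))
# prints:
# 0 30.0
# 10 20.0
# 30 0.0
```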
Queue-level and job-level resource reservation
The queue-level resource requirement parameter RES_REQ may also specify the
resource reservation. If a queue reserves a certain amount of a resource, you cannot
reserve a greater amount of that resource at the job level.
For example, if the output of the bqueues -l command contains:
RES_REQ: rusage[mem=40:swp=80:tmp=100]
the following submission is rejected because the requested amounts of mem and swp
exceed the queue's specification:
% bsub -R "rusage[mem=50:swp=100]" myjob
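The comparison behind this rejection can be sketched as follows. LSF performs the real check itself at submission time; the helpers parse_rusage and exceeds_queue are hypothetical names, and the parser assumes only the simple colon-separated form of the rusage string used in this section.

```python
import re

# Keywords inside rusage[...] that are not resource amounts.
NON_RESOURCE_KEYS = {"duration", "decay"}


def parse_rusage(spec):
    """Return {resource: amount} from a simple 'rusage[a=1:b=2]' string."""
    match = re.search(r"rusage\[([^\]]*)\]", spec)
    if match is None:
        return {}
    amounts = {}
    for item in match.group(1).split(":"):
        name, _, value = item.partition("=")
        name = name.strip()
        if name and name not in NON_RESOURCE_KEYS:
            amounts[name] = float(value)
    return amounts


def exceeds_queue(job_spec, queue_spec):
    """List the resources for which the job asks for more than the queue reserves."""
    job = parse_rusage(job_spec)
    queue = parse_rusage(queue_spec)
    return sorted(
        name for name, amount in job.items()
        if name in queue and amount > queue[name]
    )


queue_res_req = "rusage[mem=40:swp=80:tmp=100]"
print(exceeds_queue("rusage[mem=50:swp=100]", queue_res_req))  # prints ['mem', 'swp']
```

A job submitted with rusage[mem=30] against the same queue would produce an empty list, since 30 does not exceed the queue's 40.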