Platform LSF Running Jobs Version 6.2
Reserving Resources for Jobs
Running Jobs with Platform LSF
About resource reservation
When a job is dispatched, the system assumes that the resources that the job consumes
will be reflected in the load information. However, many jobs do not consume the
resources they require when they first start. Instead, they will typically use the resources
over a period of time.
For example, a job requiring 100 MB of swap is dispatched to a host having 150 MB of
available swap. The job initially allocates 5 MB and gradually increases the
amount consumed to 100 MB over a period of 30 minutes. During this period, another
job requiring more than 50 MB of swap should not be started on the same host, to avoid
overcommitting the resource.
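The swap example can be worked through numerically. The sketch below is an illustration only, not LSF code. It assumes the job's swap usage grows linearly from 5 MB to 100 MB over 30 minutes (the text says only that it grows gradually), and all function and variable names are hypothetical.

```python
HOST_SWAP_MB = 150   # swap available on the host before the first job starts
START_MB = 5         # swap the first job allocates at startup
PEAK_MB = 100        # swap the first job eventually needs
RAMP_MINUTES = 30    # time the first job takes to reach its peak usage


def job_swap_used(minute):
    """Swap used by the first job, assuming linear growth (an assumption)."""
    if minute >= RAMP_MINUTES:
        return PEAK_MB
    return START_MB + (PEAK_MB - START_MB) * minute / RAMP_MINUTES


def free_swap_reported(minute):
    """Free swap as the load information would show it with no reservation."""
    return HOST_SWAP_MB - job_swap_used(minute)


# A hypothetical second job that asks for 60 MB of swap.
second_job_mb = 60

# Right after dispatch the host still appears to have plenty of free swap.
looks_safe_at_start = free_swap_reported(0) >= second_job_mb

# Once the first job reaches its peak, only 50 MB is left on the host.
fits_at_peak = (HOST_SWAP_MB - PEAK_MB) >= second_job_mb

print(looks_safe_at_start, fits_at_peak)
```

The second job looks acceptable from the load information at dispatch time but cannot fit once the first job reaches its peak, which is the overcommitment that reservation is meant to prevent.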
You can reserve resources to prevent overcommitment by LSF. Resource reservation
requirements can be specified as part of the resource requirements when submitting a
job, or can be configured into the queue-level resource requirements.
Viewing host-level resource information
Use bhosts -l to view the amount of resources reserved on each host. Use bhosts -s
to view information about shared resources.
Viewing queue-level resource information
To see the resource usage configured at the queue level, use bqueues -l.
How resource reservation works
When deciding whether to schedule a job on a host, LSF considers the reserved
resources of jobs that have previously started on that host. For each load index, the
amount reserved by all jobs on that host is summed up and subtracted (or added if the
index is increasing) from the current value of the resources as reported by the LIM to
get the amount available for scheduling new jobs:
available amount = current value - reserved amount for all jobs
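The calculation described above can be sketched as a small function. This is a minimal illustration of the stated formula, not the scheduler's actual implementation. The function name, its parameters, and the sample values are hypothetical.

```python
def available_for_scheduling(current_value, reserved_amounts, increasing=False):
    """Amount of a load index treated as available for new jobs.

    current_value    -- value of the index as reported by the LIM
    reserved_amounts -- reservations held by jobs already started on the host
    increasing       -- True if a larger index value means a more loaded host
    """
    total_reserved = sum(reserved_amounts)
    if increasing:
        # For an increasing index, reservations make the host look busier.
        return current_value + total_reserved
    # For a decreasing index (such as free swap), reservations reduce
    # what is left for new jobs.
    return current_value - total_reserved


# Sample values: 145 MB of swap currently free, one job holding a 100 MB
# reservation on the host.
swap_left = available_for_scheduling(145, [100])

# Sample values for an increasing index with two reservations of 0.5 each.
adjusted_load = available_for_scheduling(2.0, [0.5, 0.5], increasing=True)
```

With these sample values, a new job that needs more swap than `swap_left` would not be scheduled on the host, even though the reported free swap is higher.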
Using the rusage string
To specify resource reservation at the job level, use bsub -R and include the resource
usage section in the resource requirement (rusage) string.
For example:
% bsub -R "rusage[tmp=30:duration=30:decay=1]" myjob
This command reserves 30 MB of temp space for the job. As the job runs, the amount
reserved decreases at approximately 1 MB per minute, so that the reserved amount is 0
after 30 minutes.
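The decay in this example can be modeled with a short calculation. The sketch below assumes the reservation shrinks linearly from its initial value to zero over the duration, which matches the rate of approximately 1 MB per minute described here. The function name and parameters are hypothetical and this is not LSF code.

```python
def reserved_amount(initial_mb, duration_minutes, elapsed_minutes):
    """Reservation still held after elapsed_minutes, assuming linear decay."""
    if elapsed_minutes >= duration_minutes:
        # The reservation has fully decayed once the duration has passed.
        return 0.0
    remaining_minutes = duration_minutes - elapsed_minutes
    return initial_mb * remaining_minutes / duration_minutes


# Values from the example: tmp=30 (MB), duration=30 (minutes), decay=1.
for minute in (0, 10, 20, 30):
    print(minute, reserved_amount(30, 30, minute))
```

Under this assumption the reservation falls by 1 MB for each minute the job has been running and is fully released at the 30-minute mark.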