LSF Version 7.3 - Administering Platform LSF



Examples

lsb.queues

The following queue is defined in lsb.queues:

Begin Queue
QUEUE_NAME = reservation
DESCRIPTION = For resource reservation
PRIORITY = 40
RESOURCE_RESERVE = MAX_RESERVE_TIME[20]
End Queue
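As a quick way to check a stanza like this outside of LSF, the following is a minimal sketch that reads the key/value pairs between Begin Queue and End Queue and pulls out the integer inside MAX_RESERVE_TIME[...]. The helper names parse_queue_stanza and max_reserve_time are hypothetical and are not part of LSF; the sketch only handles simple KEY = value lines and makes no claim about how LSF itself parses lsb.queues.

```python
import re

# The queue definition from this example, copied as a string.
QUEUE_STANZA = """\
Begin Queue
QUEUE_NAME = reservation
DESCRIPTION = For resource reservation
PRIORITY = 40
RESOURCE_RESERVE = MAX_RESERVE_TIME[20]
End Queue
"""


def parse_queue_stanza(text):
    """Return the KEY = value pairs found between Begin Queue and End Queue."""
    params = {}
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Begin Queue":
            inside = True
        elif line == "End Queue":
            inside = False
        elif inside and "=" in line:
            # Split on the first "=" only, so values may contain "=".
            key, value = line.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def max_reserve_time(params):
    """Extract the integer inside MAX_RESERVE_TIME[...], or None if absent."""
    match = re.search(r"MAX_RESERVE_TIME\[(\d+)\]",
                      params.get("RESOURCE_RESERVE", ""))
    return int(match.group(1)) if match else None


queue = parse_queue_stanza(QUEUE_STANZA)
print(queue["QUEUE_NAME"], queue["PRIORITY"], max_reserve_time(queue))
```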

Assumptions

Assume the cluster has one host with 10 CPUs and 1 GB of free memory currently available.

Sequential jobs

Each of the following sequential jobs requires 400 MB of memory and runs for 300 minutes.

Job 1:

bsub -W 300 -R "rusage[mem=400]" -q reservation myjob1

The job starts running, using 400 MB of memory and one job slot.

Job 2:

Submitting a second job with the same requirements yields the same result.

Job 3:

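With two jobs running, 800 MB of memory is in use, so less than the 400 MB that a third job needs remains free. Submitting a third job with the same requirements therefore reserves one job slot and all free memory, if the amount of free memory is between 20 MB and 200 MB. (Some free memory may be used by the operating system or other software.)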
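The arithmetic behind the three submissions can be sketched with a small model. This is only an illustration under stated assumptions, not the LSF scheduler: the Host class and its submit method are hypothetical, 1 GB is treated as 1000 MB, operating system memory use is ignored, and a job that cannot start is assumed to reserve one slot plus all memory that is currently free.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of one host's free job slots and free memory (in MB)."""
    free_slots: int
    free_mem_mb: int
    log: list = field(default_factory=list)

    def submit(self, name, mem_mb):
        """Start the job if a slot and enough memory are free; else reserve."""
        if self.free_slots >= 1 and self.free_mem_mb >= mem_mb:
            self.free_slots -= 1
            self.free_mem_mb -= mem_mb
            self.log.append((name, "RUN", mem_mb))
        else:
            # Assumption: the pending job holds one slot (if any is free)
            # and all of the memory that is free right now.
            held_mem = self.free_mem_mb
            if self.free_slots >= 1:
                self.free_slots -= 1
            self.free_mem_mb = 0
            self.log.append((name, "RESERVE", held_mem))


host = Host(free_slots=10, free_mem_mb=1000)
for job in ("myjob1", "myjob2", "myjob3"):
    host.submit(job, 400)

for entry in host.log:
    print(entry)
```

In this model the first two jobs run with 400 MB each, and the third reserves one slot and the remaining 200 MB, which matches the upper end of the range described above.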

Time-based Slot Reservation

Existing LSF slot reservation works in simple environments, where the host-based MXJ limit is the only constraint on a job slot request. In complex environments, where more than one constraint exists (for example, job topology or a generic slot limit):

◆ Estimated job start time becomes inaccurate

◆ The scheduler makes a reservation decision that can postpone the estimated job start time or decrease cluster utilization

Current slot reservation by start time (RESERVE_BY_STARTTIME) resolves several reservation issues in multiple candidate host groups, but it does not help in other cases:

◆ Special topology requests, such as span[ptile=n]

◆ Reservations are calculated and displayed only if the host has free slots. Reservations may change or disappear if there are no free CPUs; for example, if a backfill job takes all reserved CPUs.

◆ For HPC machines containing many internal nodes, the host-level number of reserved slots is not enough for administrators and end users to tell which CPUs the job is reserving and waiting for.
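To illustrate why a slot count alone can mislead when a topology constraint such as span[ptile=n] is also present, here is a minimal sketch. It is not LSF's placement algorithm: the functions slots_available and ptile_placement and the host names are hypothetical, and the model assumes the requested slot count is a multiple of ptile and that each chosen host supplies exactly ptile slots.

```python
def slots_available(free_slots_per_host, n_slots):
    """Slot-count check only: are there enough free slots in total?"""
    return sum(free_slots_per_host.values()) >= n_slots


def ptile_placement(free_slots_per_host, n_slots, ptile):
    """Choose hosts that can each supply `ptile` slots.

    Returns a {host: slots} mapping, or None if the topology constraint
    cannot be met. Assumes n_slots is a multiple of ptile.
    """
    if n_slots % ptile != 0:
        raise ValueError("this sketch assumes n_slots is a multiple of ptile")
    hosts_needed = n_slots // ptile
    # Only hosts with at least `ptile` free slots can take part.
    eligible = sorted(h for h, free in free_slots_per_host.items()
                      if free >= ptile)
    if len(eligible) < hosts_needed:
        return None
    return {host: ptile for host in eligible[:hosts_needed]}


# Four free slots exist in total, but only one host has two free slots.
free = {"hostA": 1, "hostB": 1, "hostC": 2, "hostD": 0}

print(slots_available(free, 4))     # enough slots by count alone
print(ptile_placement(free, 4, 2))  # but no placement with 2 slots per host
```

Here the count-based check succeeds while the placement with two slots per host fails, so a start-time estimate based only on slot counts would be too optimistic for this request.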