LSF Version 7.3 - Administering Platform LSF

Time-based Slot Reservation

Reservation scenarios

Scenario 1: Even though no running jobs finish and no host status in the cluster changes, a job’s future allocation may still change from time to time.

Why this happens: In each scheduling cycle, the scheduler recalculates a job’s reservation information, estimated start time, and opportunity for future allocation. The job’s candidate host list may be reordered according to current load. The reordered list is used for the entire scheduling cycle, including the job’s future allocation calculation, so a different order of candidate hosts may lead to a different future allocation. The job’s estimated start time, however, should stay the same.

For example, a cluster has two hosts, hostA and hostB, with 4 CPUs each. Job 1 is running and occupies 2 CPUs on hostA and 2 CPUs on hostB. Job 2 requests 6 CPUs. If the host order is hostA, hostB, the future allocation of job 2 is 4 CPUs on hostA and 2 CPUs on hostB. If the host order changes to hostB, hostA in the next scheduling cycle, the future allocation of job 2 becomes 4 CPUs on hostB and 2 CPUs on hostA.
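
The order dependence in Scenario 1 can be illustrated with a small Python sketch. It assumes a simple greedy fill over the ordered candidate host list, which is only an illustration and not necessarily how the LSF scheduler computes future allocation. The function name future_allocation and its parameters are hypothetical.

```python
def future_allocation(host_order, capacity, slots_needed):
    """Greedily fill slots from hosts in the given order (illustrative assumption).

    host_order   -- candidate hosts, in the order chosen for this scheduling cycle
    capacity     -- host name -> CPUs expected to be free once running jobs finish
    slots_needed -- number of CPUs the pending job requests
    Returns a dict host -> CPUs, or None if the request cannot be satisfied.
    """
    allocation = {}
    remaining = slots_needed
    for host in host_order:
        if remaining == 0:
            break
        take = min(capacity[host], remaining)
        if take > 0:
            allocation[host] = take
            remaining -= take
    return allocation if remaining == 0 else None


# After job 1 finishes, all 4 CPUs on each host are free.
capacity = {"hostA": 4, "hostB": 4}

print(future_allocation(["hostA", "hostB"], capacity, 6))  # {'hostA': 4, 'hostB': 2}
print(future_allocation(["hostB", "hostA"], capacity, 6))  # {'hostB': 4, 'hostA': 2}
```

Both orders satisfy the request at the same moment (when job 1 finishes), which matches the statement that the estimated start time stays the same while the allocation differs.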

Scenario 2: If you set JOB_ACCEPT_INTERVAL to a non-zero value, then after a job is dispatched, the estimated start time and future allocation of pending jobs may fluctuate momentarily within the JOB_ACCEPT_INTERVAL period.

Why this happens: The scheduler does a time-based reservation calculation each cycle. If JOB_ACCEPT_INTERVAL is set to a non-zero value, then once a new job has been dispatched to a host, that host does not accept another job within the JOB_ACCEPT_INTERVAL period. Because the host is not considered for the entire scheduling cycle, no time-based reservation calculation is done for that host, which may result in a slight change in the job’s estimated start time and future allocation information. After JOB_ACCEPT_INTERVAL has passed, the host becomes available for time-based reservation calculation again, and the pending job’s estimated start time and future allocation are accurate again.
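
The effect described in Scenario 2 can be sketched in Python as a filter on the candidate host list. This is a simplified model under stated assumptions: the interval is expressed directly in seconds, and the function name hosts_open_for_reservation and the last_dispatch bookkeeping are hypothetical, not part of LSF.

```python
def hosts_open_for_reservation(hosts, last_dispatch, now, accept_interval):
    """Return the hosts that may be considered in this scheduling cycle.

    hosts           -- all candidate host names
    last_dispatch   -- host name -> time (seconds) a job was last dispatched there
    now             -- current time in seconds
    accept_interval -- assumed length of the accept interval in seconds
    """
    return [
        host
        for host in hosts
        if host not in last_dispatch or now - last_dispatch[host] >= accept_interval
    ]


hosts = ["hostA", "hostB"]
last_dispatch = {"hostA": 100}  # a job was dispatched to hostA at t=100

# Inside the interval, hostA is skipped, so reservation data is based on hostB only.
print(hosts_open_for_reservation(hosts, last_dispatch, now=130, accept_interval=60))
# After the interval has passed, hostA is considered again.
print(hosts_open_for_reservation(hosts, last_dispatch, now=160, accept_interval=60))
```

While a host is filtered out, any estimate that depends on it is computed from a smaller host list, which is one way to picture the momentary fluctuation.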

Examples

Example 1: Three hosts with 4 CPUs each: qat24, qat25, and qat26. Job 11895 uses 4 slots on qat24 (10 hours), job 11896 uses 4 slots on qat25 (12 hours), and job 11897 uses 2 slots on qat26 (9 hours).

Job 11898 is submitted and requests -n 6 -R "span[ptile=2]".

bjobs -l 11898
Job <11898>, User <user2>, Project <default>, Status <PEND>, Queue <challenge>,
Job Priority <50>, Command <sleep 100000000>
RUNLIMIT
 840.0 min of hostA
Fri Apr 22 15:18:56: Reserved <2> job slots on host(s) <2*qat26>;
Sat Apr 23 03:28:46: Estimated Job Start Time;
alloc=2*qat25 2*qat24 2*qat26.lsf.platform.com
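
The reasoning behind the estimated start time in Example 1 can be sketched in Python. The sketch assumes that a request of -n 6 with span[ptile=2] needs 2 free slots on each of 3 hosts, and that running jobs release their slots after the stated durations. The function earliest_start and its inputs are hypothetical and only approximate what the scheduler does.

```python
def earliest_start(free_now, releases, slots_needed, ptile):
    """Return hours from now until enough hosts have `ptile` free slots.

    free_now     -- host name -> slots free right now
    releases     -- list of (hours_from_now, host, slots_released)
    slots_needed -- total slots requested (the -n value)
    ptile        -- slots to use per host (span[ptile=...])
    Returns None if the request is never satisfied by the listed releases.
    """
    hosts_needed = -(-slots_needed // ptile)  # ceiling division
    free = dict(free_now)

    def enough():
        return sum(1 for n in free.values() if n >= ptile) >= hosts_needed

    if enough():
        return 0
    for when, host, slots in sorted(releases):
        free[host] = free.get(host, 0) + slots
        if enough():
            return when
    return None


free_now = {"qat24": 0, "qat25": 0, "qat26": 2}
releases = [(10, "qat24", 4), (12, "qat25", 4), (9, "qat26", 2)]

print(earliest_start(free_now, releases, slots_needed=6, ptile=2))  # 12
```

Under these assumptions the job can start only once the 12-hour job on qat25 finishes, which is in line with an estimated start time roughly 12 hours after the reservation in the bjobs output, with 2 slots on each of the three hosts.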

Example 2: Two RMS hosts, sierraA and sierraB, 8 CPUs per host. Job 3873 uses 4*sierra0 and will last for 10 hours. Job 3874 uses 4*sierra1 and will run for 12 hours. Job 3875 uses 2*sierra2 and 2*sierra3, and will run for 13 hours.