Platform LSF Administration Guide Version 6.2

Chapter 6
Managing Jobs
Administering Platform LSF
Dispatch windows during which the queue can dispatch and qualified hosts can accept jobs
Run windows during which jobs from the queue can run
Limits on the number of job slots configured for a queue, a host, or a user
Relative priority to other users and jobs
Availability of the specified resources
Job dependency and pre-execution conditions
Maximum pending job threshold
If the user or user group submitting the job has reached the pending job threshold
specified by MAX_PEND_JOBS (either in the User section of lsb.users, or cluster-wide in
lsb.params), LSF rejects any further job submission requests from that user or user
group. The system resends the job submission request at the interval specified by
SUB_TRY_INTERVAL in lsb.params until it has made the number of attempts specified by
the LSB_NTRIES environment variable. If LSB_NTRIES is undefined and LSF rejects the
job submission request, the system by default continues to resend the request
indefinitely.
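The resubmission behavior described above can be modeled with a short shell sketch. This is an illustration of the logic only, not LSF source code: submit_with_retry and try_submit are hypothetical names, and try_submit is a stand-in that pretends LSF rejects the first two requests and accepts the third. The first argument plays the role of SUB_TRY_INTERVAL and the optional second argument plays the role of LSB_NTRIES.

```shell
# Counter used only by the stand-in below.
attempts_made=0

# try_submit: hypothetical stand-in for one submission request.
# It fails twice (threshold reached) and succeeds on the third call.
try_submit() {
    attempts_made=$((attempts_made + 1))
    [ "$attempts_made" -ge 3 ]
}

# submit_with_retry INTERVAL [NTRIES]
# INTERVAL: seconds to wait between requests (models SUB_TRY_INTERVAL).
# NTRIES:   maximum number of requests (models LSB_NTRIES); when empty or
#           omitted, the loop retries indefinitely.
submit_with_retry() {
    interval=$1
    ntries=${2:-}
    count=0
    while :; do
        count=$((count + 1))
        if try_submit; then
            echo "accepted after $count attempt(s)"
            return 0
        fi
        # Stop only when a retry limit is defined and has been reached.
        if [ -n "$ntries" ] && [ "$count" -ge "$ntries" ]; then
            echo "gave up after $count attempt(s)"
            return 1
        fi
        sleep "$interval"
    done
}
```

With this stand-in, submit_with_retry 60 5 would succeed on the third request, while submit_with_retry 60 2 would give up after the second. Omitting the second argument models an undefined LSB_NTRIES, where the loop ends only when a request is accepted.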
Viewing pending job information
Use the bjobs -p command to display the reason why a job is pending. Use the
busers -w all command to see the maximum pending job threshold for all users.
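As a convenience, the two commands above could be wrapped in a small shell function. The function name show_pending_info is hypothetical and is not part of LSF; the sketch only assumes that bjobs and busers are available in the shell once the LSF environment has been set up.

```shell
# show_pending_info: hypothetical helper name, not an LSF command.
# Runs the two commands named above, or reports that they are unavailable.
show_pending_info() {
    if ! command -v bjobs >/dev/null 2>&1 || ! command -v busers >/dev/null 2>&1; then
        echo "LSF commands not found in PATH" >&2
        return 1
    fi
    bjobs -p          # pending jobs and their pending reasons
    busers -w all     # maximum pending job threshold for all users
}
```

Running show_pending_info on a host without LSF configured prints an error and returns a nonzero status instead of failing on the first missing command.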
Suspended jobs
A job can be suspended at any time by its owner, by the LSF administrator, by the root
user (superuser), or by LSF.
After a job has been dispatched and started on a host, it can be suspended by LSF. When
a job is running, LSF periodically checks the load level on the execution host. If any load
index is beyond either its per-host or its per-queue suspending conditions, the lowest
priority batch job on that host is suspended.
If the load on the execution host or hosts becomes too high, batch jobs may be
interfering with one another or with interactive jobs. In either case, some jobs should
be suspended to maximize host performance or to guarantee interactive response time.
LSF suspends jobs according to the priority of the job’s queue. When a host is busy, LSF
suspends lower priority jobs first unless the scheduling policy associated with the job
dictates otherwise.
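The selection rule described above can be sketched in shell. This is a simplified model under stated assumptions, not LSF's implementation: pick_job_to_suspend is a hypothetical name, the load index is assumed to be one where a higher value means a busier host, a single effective threshold stands in for the per-host and per-queue suspending conditions, and scheduling policies that override queue priority are ignored.

```shell
# pick_job_to_suspend LOAD THRESHOLD
# Reads "job_id queue_priority" pairs on stdin, one per line.
# If LOAD is greater than THRESHOLD, prints the job_id whose queue priority
# is lowest; otherwise prints nothing.
pick_job_to_suspend() {
    load=$1
    threshold=$2
    # awk compares the values numerically, so fractional loads work.
    if awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l > t) }'; then
        sort -k2,2n | head -n 1 | awk '{ print $1 }'
    else
        # Load is acceptable: consume the input and suspend nothing.
        cat >/dev/null
    fi
}
```

For example, with jobs 101, 102, and 103 in queues of priority 40, 20, and 30, a load of 3.5 against a threshold of 2.0 selects job 102, while a load of 1.0 selects no job.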
Jobs are also suspended by the system if the job queue has a run window and the current
time goes outside the run window.
A system-suspended job can later be resumed by LSF if the load condition on the
execution hosts falls low enough or when the closed run window of the queue opens
again.
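The run window check behind this suspend and resume behavior can be illustrated with a short shell sketch. The function name in_run_window is hypothetical, times are expressed as minutes since midnight for simplicity, and treating the end of the window as exclusive is an assumption of this sketch rather than documented LSF behavior.

```shell
# in_run_window NOW START END
# All arguments are minutes since midnight (0-1439).
# Returns 0 if NOW falls inside the window [START, END), and 1 otherwise.
# A window whose START is later than its END is treated as spanning midnight.
in_run_window() {
    now=$1
    start=$2
    end=$3
    if [ "$start" -le "$end" ]; then
        [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]
    else
        [ "$now" -ge "$start" ] || [ "$now" -lt "$end" ]
    fi
}
```

Under these assumptions, a window from 20:00 to 08:30 is written as in_run_window NOW 1200 510. A job in such a queue would be eligible to run at 22:00 (1320) and at 05:00 (300), and would be suspended at 12:00 (720) until the window opens again.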
Viewing suspension reasons
Use the bjobs -s command to display the reason why a job was suspended.