Platform LSF Running Jobs Version 6.2
Chapter 1
About Platform LSF
Running Jobs with Platform LSF
Job slot
A job slot is a bucket into which a single unit of work is assigned in the LSF system.
Hosts are configured to have a number of job slots available and queues dispatch jobs
to fill job slots.
Commands
◆ bhosts: View job slot limits for hosts and host groups
◆ bqueues: View job slot limits for queues
◆ busers: View job slot limits for users and user groups
Configuration
◆ Define job slot limits in lsb.resources.
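As a rough illustration of the slot model described above, the following Python sketch treats each host as having a fixed number of job slots and places one unit of work in each free slot. The Host class, the dispatch function, and the host and job names are hypothetical and are not part of LSF; in a real cluster the limits come from the LSF configuration and the scheduler decides placement.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical model of a host with a fixed number of job slots."""
    name: str
    max_slots: int
    running: list = field(default_factory=list)

    def free_slots(self) -> int:
        # One running job occupies exactly one slot in this sketch.
        return self.max_slots - len(self.running)


def dispatch(jobs, hosts):
    """Assign each job to the first host with a free slot.

    Returns the jobs that could not be placed (they stay pending).
    """
    pending = []
    for job in jobs:
        target = next((h for h in hosts if h.free_slots() > 0), None)
        if target is None:
            pending.append(job)
        else:
            target.running.append(job)
    return pending


hosts = [Host("hostA", 2), Host("hostB", 1)]
left_over = dispatch(["job1", "job2", "job3", "job4"], hosts)
```

With three slots in total and four jobs, three jobs are placed and the fourth remains pending until a slot is released.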
Job states
LSF jobs have the following states:
◆ PEND: Waiting in a queue for scheduling and dispatch
◆ RUN: Dispatched to a host and running
◆ DONE: Finished normally with zero exit value
◆ EXIT: Finished with non-zero exit value
◆ PSUSP: Suspended while pending
◆ USUSP: Suspended by user
◆ SSUSP: Suspended by the LSF system
◆ POST_DONE: Post-processing completed without errors
◆ POST_ERR: Post-processing completed with errors
◆ WAIT: Members of a chunk job that are waiting to run
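The three suspended states in the list above differ only in when, or by whom, the job was suspended. A minimal Python sketch of how a script might group jobs by state is shown here. The helper names and the sample job IDs are hypothetical, and the sketch works on plain state strings; it does not call LSF.

```python
from collections import Counter

# Suspended states from the list above, keyed by state name.
SUSPENDED = {
    "PSUSP": "suspended while pending",
    "USUSP": "suspended by user",
    "SSUSP": "suspended by the LSF system",
}


def is_suspended(state: str) -> bool:
    """Return True if the state is one of the three suspended states."""
    return state in SUSPENDED


def count_by_state(jobs):
    """Count jobs per state; `jobs` is an iterable of (job_id, state) pairs."""
    return Counter(state for _, state in jobs)


# Hypothetical sample data; the job IDs are made up for illustration.
jobs = [(101, "PEND"), (102, "RUN"), (103, "USUSP"), (104, "RUN"), (105, "SSUSP")]
counts = count_by_state(jobs)
suspended_ids = [job_id for job_id, state in jobs if is_suspended(state)]
```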
Queue
A clusterwide container for jobs. All jobs wait in queues until they are scheduled and
dispatched to hosts.
Queues do not correspond to individual hosts; each queue can use all server hosts in the
cluster, or a configured subset of the server hosts.
When you submit a job to a queue, you do not need to specify an execution host. LSF
dispatches the job to the best available execution host in the cluster to run that job.
Queues implement different job scheduling and control policies.
Commands
◆ bqueues: View available queues
◆ bsub -q: Submit a job to a specific queue
◆ bparams: View default queues
Configuration
◆ Define queues in lsb.queues.
Queue names should be unique. A queue name should not be the same as the cluster
name or the name of any host in the cluster.
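The naming guidance above can be checked before a configuration is rolled out. The following Python sketch is one way to do that, assuming the queue, cluster, and host names have already been collected into plain strings. The check_queue_names function and all names in the example are hypothetical; the sketch does not read lsb.queues.

```python
def check_queue_names(queue_names, cluster_name, host_names):
    """Return a list of problems found with the proposed queue names."""
    problems = []
    seen = set()
    hosts = set(host_names)
    for name in queue_names:
        if name in seen:
            problems.append(f"duplicate queue name: {name}")
        seen.add(name)
        if name == cluster_name:
            problems.append(f"queue name matches cluster name: {name}")
        if name in hosts:
            problems.append(f"queue name matches host name: {name}")
    return problems


# Hypothetical names, chosen only to trigger two of the checks.
problems = check_queue_names(
    ["normal", "priority", "normal", "hostA"],
    cluster_name="cluster1",
    host_names=["hostA", "hostB"],
)
```

An empty result means none of the three conflicts was found.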
First-come, first-served (FCFS) scheduling
The default type of scheduling in LSF. Jobs are considered for dispatch based on their
order in the queue.
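The ordering rule can be sketched in a few lines of Python. This is a simplified model, assuming that each job needs exactly one slot and that nothing other than queue order affects dispatch; the fcfs_dispatch function and the job names are hypothetical and do not reflect how the LSF scheduler is implemented.

```python
from collections import deque


def fcfs_dispatch(queue, free_slots):
    """Dispatch jobs from the front of the queue while free slots remain."""
    dispatched = []
    while queue and free_slots > 0:
        # The job that has been in the queue longest is considered first.
        dispatched.append(queue.popleft())
        free_slots -= 1
    return dispatched


queue = deque(["job1", "job2", "job3"])  # jobs in their order in the queue
started = fcfs_dispatch(queue, free_slots=2)
```

With two free slots, the first two jobs in the queue are dispatched and the third remains at the front of the queue for the next scheduling pass.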