LSF Version 7.3 - Administering Platform LSF
CHAPTER 32
Running Parallel Jobs
Contents
◆ How LSF Runs Parallel Jobs
◆ Preparing Your Environment to Submit Parallel Jobs to LSF
◆ Submitting Parallel Jobs
◆ Starting Parallel Tasks with LSF Utilities
◆ Job Slot Limits For Parallel Jobs
◆ Specifying a Minimum and Maximum Number of Processors
◆ Specifying a First Execution Host
◆ Controlling Processor Allocation Across Hosts
◆ Running Parallel Processes on Homogeneous Hosts
◆ Limiting the Number of Processors Allocated
◆ Reserving Processors
◆ Reserving Memory for Pending Parallel Jobs
◆ Backfill Scheduling: Allowing Jobs to Use Reserved Job Slots
◆ Parallel Fairshare
◆ How Deadline Constraint Scheduling Works For Parallel Jobs
◆ Optimized Preemption of Parallel Jobs
How LSF Runs Parallel Jobs
When LSF runs a job, it sets the LSB_HOSTS environment variable to the names of
the hosts running the batch job. For a parallel batch job, LSB_HOSTS contains the
complete list of hosts that LSF has allocated to that job.
LSF starts one controlling process for the parallel batch job on the first host in the
host list. Your parallel application is responsible for reading the LSB_HOSTS
environment variable to get the list of hosts and for starting the parallel job
components on all the other allocated hosts.
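As an illustration of how a controlling process might read the host list, here is a minimal Python sketch. It assumes LSB_HOSTS holds host names separated by whitespace and that a host name appears once for each slot allocated on it; the function names are hypothetical and are not part of LSF.

```python
import os
from collections import Counter


def allocated_hosts(environ=os.environ):
    """Return the host names found in LSB_HOSTS, in the order listed.

    Assumption: names are separated by whitespace. An unset or empty
    variable yields an empty list.
    """
    return environ.get("LSB_HOSTS", "").split()


def slots_per_host(hosts):
    """Count how many times each host name appears in the list.

    Assumption: a host is repeated once per allocated slot.
    """
    return dict(Counter(hosts))


def first_and_remaining(hosts):
    """Split the list into the first host and all the others.

    The first host is where the controlling process runs; the others
    are where the application must start its remaining components.
    """
    if not hosts:
        return None, []
    return hosts[0], hosts[1:]


if __name__ == "__main__":
    hosts = allocated_hosts()
    first, others = first_and_remaining(hosts)
    print("first host:", first)
    print("other entries:", others)
    print("slots per host:", slots_per_host(hosts))
```

Passing the environment in as a parameter keeps the parsing testable outside a running job; inside a job, the default of `os.environ` is used.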
LSF provides a generic interface to parallel programming packages so that any
parallel package can be supported by writing shell scripts or wrapper programs.
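A wrapper program of this kind could look roughly like the following Python sketch. It assumes the parallel package accepts a file listing one host per line; the launcher name `my_launcher`, its `--machinefile` option, and the program `./my_app` are placeholders for illustration, not real LSF or package interfaces.

```python
import os
import tempfile


def write_machine_file(hosts, directory=None):
    """Write one host name per line to a new temporary file.

    Returns the path of the file. The caller is responsible for
    deleting it when the job finishes.
    """
    fd, path = tempfile.mkstemp(prefix="machines.", dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        for host in hosts:
            handle.write(host + "\n")
    return path


def build_launch_command(launcher, machine_file, program_args):
    """Assemble the argument list for a hypothetical launcher.

    Assumption: the launcher takes the host file through a
    --machinefile option. Real packages use their own options.
    """
    return [launcher, "--machinefile", machine_file, *program_args]


if __name__ == "__main__":
    # Assumption: LSB_HOSTS holds whitespace-separated host names.
    hosts = os.environ.get("LSB_HOSTS", "").split()
    if hosts:
        path = write_machine_file(hosts)
        # Print the command instead of running it, because the
        # launcher named here is only a placeholder.
        print(build_launch_command("my_launcher", path, ["./my_app"]))
```

Adapting the sketch to a specific package would mean replacing the placeholder launcher and option with whatever that package documents.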