Platform LSF Administration Guide Version 6.2
Administering Platform LSF
Controlling Processor Allocation Across Hosts
Sometimes you need to control how the selected processors for a parallel job are
distributed across the hosts in the cluster.
You can control this at the job level or at the queue level. The queue specification is
ignored if your job specifies its own locality.
Specifying parallel job locality at the job level
By default, LSF will allocate the required processors for the job from the available set of
processors.
A parallel job may span multiple hosts, with a specifiable number of processes allocated
to each host. A job may be scheduled onto a single multiprocessor host to take
advantage of its efficient shared memory, or spread out onto multiple hosts to take
advantage of their aggregate memory and swap space. Flexible spanning may also be
used to achieve parallel I/O.
You can specify “select all the processors for this parallel batch job on the same
host” or “do not choose more than n processors on one host” by using the span
section in the resource requirement string (bsub -R, or RES_REQ in the queue
definition in lsb.queues).
If PARALLEL_SCHED_BY_SLOT=Y in lsb.params, the span string is used to control
the number of job slots instead of processors.
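As an illustration of the job-level form described above, the following minimal sketch builds (but does not run) bsub command lines that carry a span section in the -R resource requirement string. The helper names bsub_with_span and min_hosts and the job name ./my_parallel_job are hypothetical and used only for this example. The host count is plain arithmetic that follows from "no more than n processors on one host"; it is not a statement of how the scheduler places a particular job. The sketch assumes Python 3.8 or later for shlex.join.

```python
import shlex


def bsub_with_span(nprocs, span, job):
    """Return a bsub command line as a shell-quoted string (nothing is submitted).

    nprocs -> value for bsub -n (number of processors requested)
    span   -> span section for bsub -R, e.g. "span[hosts=1]"
    job    -> command to run; "./my_parallel_job" below is a placeholder
    """
    return shlex.join(["bsub", "-n", str(nprocs), "-R", span, job])


def min_hosts(nprocs, ptile):
    """Smallest number of hosts that can hold nprocs processors
    when no host contributes more than ptile of them (ceiling division)."""
    return (nprocs + ptile - 1) // ptile


# All 8 processors on the same host.
same_host = bsub_with_span(8, "span[hosts=1]", "./my_parallel_job")

# 16 processors, 4 on each host.
four_per_host = bsub_with_span(16, "span[ptile=4]", "./my_parallel_job")

print(same_host)          # bsub -n 8 -R 'span[hosts=1]' ./my_parallel_job
print(four_per_host)      # bsub -n 16 -R 'span[ptile=4]' ./my_parallel_job
print(min_hosts(16, 4))   # 4
```

The span string is quoted in the generated command because the square brackets would otherwise be interpreted by the shell.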
Syntax
Two kinds of span string are supported:
◆ span[hosts=1]
  Indicates that all the processors allocated to this job must be on the same host.
◆ span[ptile=value]
  Indicates the number of processors (value) on each host that should be allocated to
  the job, where value is:
  ❖ Default ptile value, specified by n processors. For example:
      span[ptile=4]
    LSF allocates 4 processors on each available host, regardless of how many
    processors the host has.
  ❖ Predefined ptile value, specified by '!'. For example:
      span[ptile='!']
    uses the predefined maximum job slot limit in lsb.hosts (MXJ per host
    type/model) as its value.
    If the host or host type/model does not define MXJ, the default predefined
    ptile value is 1.
  ❖ Predefined ptile value with optional multiple ptile values, per host type or
    host model.