HP-MPI User's Guide (11th Edition)

Understanding HP-MPI
CPU binding
ll — least loaded (ll) Bind each rank to the CPU it is currently running
on.
For NUMA-based systems, the following options are also available:
ldom — Schedule ranks on ldoms according to packed rank id.
cyclic — Cyclic distribution on each ldom according to packed rank id.
block — Block distribution on each ldom according to packed rank id.
rr — round robin (rr) Same as cyclic, but consider ldom load average.
fill — Same as block, but consider ldom load average.
packed — Bind all ranks to same ldom as lowest rank.
slurm — SLURM binding.
ll — least loaded (ll) Bind each rank to the ldom it is currently running on.
map_ldom — Schedule ranks on ldoms in cyclic distribution through MAP
variable.
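The difference between the cyclic and block distributions can be illustrated with a small model. The following Python sketch is not HP-MPI code. It assumes a node with a fixed number of ldoms and the same number of rank slots on each ldom, and the names cyclic_ldom and block_ldom are hypothetical. The placement that HP-MPI actually computes may differ.

```python
def cyclic_ldom(rank: int, num_ldoms: int) -> int:
    """Cyclic: consecutive packed rank ids go to consecutive ldoms."""
    return rank % num_ldoms


def block_ldom(rank: int, num_ldoms: int, ranks_per_ldom: int) -> int:
    """Block: fill one ldom with consecutive ranks before moving to the next."""
    return (rank // ranks_per_ldom) % num_ldoms


if __name__ == "__main__":
    # Assumed topology, for illustration only: 2 ldoms, 4 ranks per ldom.
    num_ldoms, ranks_per_ldom = 2, 4
    ranks = range(num_ldoms * ranks_per_ldom)
    print("cyclic:", [cyclic_ldom(r, num_ldoms) for r in ranks])
    print("block: ", [block_ldom(r, num_ldoms, ranks_per_ldom) for r in ranks])
```

In this model, cyclic placement alternates neighboring ranks between ldoms, while block placement keeps neighboring ranks together on the same ldom.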
To list the currently supported options:
% mpirun -cpu_bind=help ./a.out
Environment variables for CPU binding:
MPI_BIND_MAP specifies the integer CPU numbers, ldom numbers, or
CPU masks to bind to. The value is a list of integers separated by
commas (,).
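The list format can be illustrated with a short Python sketch. This is not part of HP-MPI. The function names parse_bind_map and entry_for_rank are hypothetical, the sketch accepts decimal integers only, and the wrap-around lookup is an assumption modeled on the cyclic distribution described for map_ldom.

```python
from typing import List


def parse_bind_map(value: str) -> List[int]:
    """Split a comma-separated list of integers such as "0,2,1,3"."""
    return [int(tok) for tok in value.split(",") if tok.strip()]


def entry_for_rank(bind_map: List[int], rank: int) -> int:
    """Pick the map entry for a rank, wrapping around (assumed behavior)."""
    return bind_map[rank % len(bind_map)]


if __name__ == "__main__":
    # Example value only; a real job would take it from MPI_BIND_MAP.
    bind_map = parse_bind_map("0,2,1,3")
    for rank in range(6):
        print(f"rank {rank} -> map entry {entry_for_rank(bind_map, rank)}")
```

Whether an entry is interpreted as a CPU number, an ldom number, or a CPU mask depends on the binding strategy in effect.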
MPI_CPU_AFFINITY is an alternative to using -cpu_bind on the
command line for specifying the binding strategy. The possible
settings are LL, RANK, MAP_CPU, MASK_CPU, LDOM, CYCLIC,
BLOCK, RR, FILL, PACKED, SLURM, and MAP_LDOM.
MPI_CPU_SPIN selects the spin value. The default is 2 seconds.
Processes busy spin for this period so that the operating system
can schedule them onto processors. The processes then bind
themselves to the appropriate processor, core, or ldom.
For example, the following selects a 4-second spin period to allow 32
MPI ranks (processes) to settle into place and then bind to the
appropriate processor/core/ldom.
% mpirun -e MPI_CPU_SPIN=4 -cpu_bind -np 32 ./linpack