HP-MPI User's Guide (11th Edition)

Understanding HP-MPI
CPU binding
The mpirun option -cpu_bind binds a rank to a locality domain (ldom) so
that the process cannot move to a different ldom after startup. The
binding occurs before the MPI application is executed.
To accomplish this, a shared library is loaded at startup. For each rank,
the library does the following:
• Spins for a short time in a tight loop to let the operating system
  distribute processes to CPUs evenly. The duration can be changed by
  setting the MPI_CPU_SPIN environment variable, which controls the
  number of spins in the initial loop. The default is 3 seconds.
• Determines the current CPU and ldom.
• Checks with the other ranks of the MPI job on the host for
  oversubscription, using a "shm" segment created by mpirun and a
  lock to communicate with them. If the current CPU is not
  oversubscribed, the process is locked to the ldom of that CPU. If
  another rank has already reserved the current CPU, a new CPU is
  chosen from the least loaded free CPUs and the process is locked to
  the ldom of that CPU.
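The reservation step above can be illustrated with a small simulation. This is only a sketch of the logic as described, not the code of the HP-MPI shared library. The names choose_cpu and place_ranks are hypothetical, a plain dictionary stands in for the "shm" segment and its lock, ranks are handled one at a time, and CPU numbers are assumed to be contiguous within an ldom.

```python
from typing import Dict, List


def choose_cpu(current_cpu: int, reserved: Dict[int, int],
               load: Dict[int, float]) -> int:
    """Pick the CPU a rank should reserve.

    reserved maps CPU -> rank that already reserved it (a stand-in for
    the shared segment); load maps CPU -> a load estimate.
    """
    # No oversubscription: keep the CPU the rank is already running on.
    if current_cpu not in reserved:
        return current_cpu
    # Otherwise choose the least loaded CPU that nobody has reserved yet.
    free = [cpu for cpu in load if cpu not in reserved]
    if not free:
        raise RuntimeError("no free CPU left on this host")
    return min(free, key=lambda cpu: (load[cpu], cpu))


def place_ranks(start_cpus: List[int], load: Dict[int, float],
                cpus_per_ldom: int) -> Dict[int, int]:
    """Return rank -> ldom, processing ranks one at a time.

    Sequential processing models each rank holding the lock while it
    updates the shared reservation table.
    """
    reserved: Dict[int, int] = {}
    placement: Dict[int, int] = {}
    for rank, cpu in enumerate(start_cpus):
        chosen = choose_cpu(cpu, reserved, load)
        reserved[chosen] = rank
        # Assumption: CPUs are numbered contiguously within each ldom.
        placement[rank] = chosen // cpus_per_ldom
    return placement


if __name__ == "__main__":
    # Ranks 0 and 1 both start on CPU 0, so rank 1 has to move.
    loads = {0: 0.5, 1: 0.1, 2: 0.0, 3: 0.2}
    print(place_ranks([0, 0, 1, 3], loads, cpus_per_ldom=2))
```

In this example rank 1 finds CPU 0 already reserved and moves to CPU 2, the least loaded free CPU, so it is bound to the second ldom.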
Similar results can be achieved with "mpsched", but the procedure
outlined above has the advantage of distributing ranks based on load,
and it works well in psets and across multiple machines.
HP-MPI supports CPU binding with a variety of binding strategies (see
below). The option -cpu_bind is supported in appfile, command line, and
srun modes.
% mpirun -cpu_bind[_mt]=[v,][option][,v] -np 4 a.out
Where _mt requests thread-aware CPU binding; v, and ,v request
verbose information about how threads are bound to CPUs; and [option]
is one of:
rank — Schedule ranks on CPUs according to packed rank id.
map_cpu — Schedule ranks on CPUs in cyclic distribution through MAP
variable.
mask_cpu — Schedule ranks on CPU masks in cyclic distribution
through MAP variable.
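As an illustration of what a cyclic distribution means for map_cpu and mask_cpu, the sketch below assigns ranks to the entries of a list in round-robin order. It is an interpretation of the descriptions above, not HP-MPI code. The function names are hypothetical, and the way the MAP variable is written on the command line is not shown here, so the entries are passed as ordinary Python lists.

```python
from typing import List


def cyclic_map(num_ranks: int, entries: List[int]) -> List[int]:
    """Assign rank i to entries[i mod len(entries)].

    For a map_cpu-style list the entries are CPU numbers; for a
    mask_cpu-style list they are CPU bit masks.
    """
    if not entries:
        raise ValueError("the map must contain at least one entry")
    return [entries[rank % len(entries)] for rank in range(num_ranks)]


def cpus_in_mask(mask: int) -> List[int]:
    """List the CPU numbers whose bits are set in a mask.

    Assumption: bit 0 of the mask corresponds to CPU 0.
    """
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Six ranks over a three-entry CPU map wrap around after rank 2.
    print(cyclic_map(6, [0, 2, 4]))
    # Four ranks over two masks alternate between the two CPU sets.
    for rank, mask in enumerate(cyclic_map(4, [0x0F, 0x30])):
        print(rank, cpus_in_mask(mask))
```

With more ranks than map entries, the assignment wraps around to the start of the list, which is what makes the distribution cyclic.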