HP-MPI V2.2 for Linux Release Note

What’s in This Version
New mpirun options
The default protocol setting varies depending on the interconnect and rank count. See “New
Environment Variables” on page 20 for more details on the default selections.
-srq specifies use of the shared receive queue protocol when the Mellanox VAPI or uDAPL
V1.1 or V1.2 interfaces are used. This protocol uses less pre-pinned memory for short-message
transfers.
-rdma specifies use of envelope pairs for short-message transfers. The amount of pre-pinned
memory increases with the job size.
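For example, either protocol can be selected explicitly on the mpirun command line; the rank
count and executable name below are placeholders:

% mpirun -srq -np 64 a.out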
MPI-2 supported ROMIO
HP-MPI 2.2 includes a new version of ROMIO which implements true MPI-2 functionality
with regard to asynchronous writing and reading of files. If existing applications use the
asynchronous completion routine MPIO_Wait, users may need to recompile those applications
and require their customers to upgrade to the current version of HP-MPI.
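The following is a minimal sketch of the MPI-2 style of asynchronous file I/O, in which a
nonblocking write started with MPI_File_iwrite_at is completed with the standard MPI_Wait
routine rather than MPIO_Wait; the file name, buffer size, and offsets are illustrative only.

    /* Minimal sketch: MPI-2 nonblocking file write completed with MPI_Wait.
     * The file name and buffer contents are illustrative only. */
    #include <mpi.h>
    #include <string.h>

    int main(int argc, char *argv[])
    {
        MPI_File    fh;
        MPI_Request req;
        MPI_Status  status;
        char        buf[64];
        int         rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Each rank fills a buffer and writes it to its own block of the file. */
        memset(buf, 'a' + (rank % 26), sizeof(buf));
        MPI_File_open(MPI_COMM_WORLD, "iodata.out",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Start the nonblocking write at this rank's offset, then complete it
         * with MPI_Wait (true MPI-2 behavior) instead of the older MPIO_Wait. */
        MPI_File_iwrite_at(fh, (MPI_Offset)rank * (MPI_Offset)sizeof(buf),
                           buf, (int)sizeof(buf), MPI_CHAR, &req);
        MPI_Wait(&req, &status);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }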
NOTE ROMIO is supported only when using the default libmpi library. ROMIO cannot
be used with the multithreaded or diagnostic libraries.
CPU bind support
HP-MPI 2.2 supports CPU binding with a variety of binding strategies (see below). The option
-cpu_bind is supported in appfile, command line, and srun modes.
% mpirun -cpu_bind[_mt]=[v,][option][,v] -np 4 a.out
Where _mt implies thread-aware CPU binding; v, and ,v request verbose output on the binding
of threads to CPUs; and [option] is one of the following (an example invocation appears after
the lists):
rank      Schedule ranks on CPUs according to packed rank id.
map_cpu   Schedule ranks on CPUs, cycling through the MAP variable.
mask_cpu  Schedule ranks on CPU masks, cycling through the MAP variable.
ll        Bind each rank to the CPU it is currently running on.
For NUMA-based systems, the following options are also available:
ldom      Schedule ranks on ldoms according to packed rank id.
cyclic    Cyclic distribution on each ldom according to packed rank id.
block     Block distribution on each ldom according to packed rank id.
rr        Same as cyclic, but considering the ldom load average.
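For example, the following command binds four ranks to CPUs by packed rank id and prints
verbose binding information; the executable name is a placeholder:

% mpirun -cpu_bind=v,rank -np 4 a.out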