HP-MPI Version 2.3.1 for Linux Release Note
Table of Contents
- HP-MPI V2.3.1 for Linux Release Note
- Table of Contents
- 1 Information About This Release
- 2 New or Changed Features in V2.3.1
- 3 New or Changed Features in V2.3
- 3.1 Options Supported Only on HP Hardware
- 3.2 System Check
- 3.3 Default Message Size Changed For -ndd
- 3.4 MPICH2 Compatibility
- 3.5 Support for Large Messages
- 3.6 Redundant License Servers
- 3.7 License Release/Regain on Suspend/Resume
- 3.8 Expanded Functionality for -ha
- 3.8.1 Support for High Availability on InfiniBand Verbs
- 3.8.2 Highly Available Infrastructure (-ha:infra)
- 3.8.3 Using MPI_Comm_connect and MPI_Comm_accept
- 3.8.4 Using MPI_Comm_disconnect
- 3.8.5 Instrumentation and High Availability Mode
- 3.8.6 Failure Recovery (-ha:recover)
- 3.8.7 Network High Availability (-ha:net)
- 3.8.8 Failure Detection (-ha:detect)
- 3.8.9 Clarification of the Functionality of Completion Routines in High Availability Mode
- 3.9 Enhanced InfiniBand Support for Dynamic Processes
- 3.10 Singleton Launching
- 3.11 Using the -stdio=files Option
- 3.12 Using the -stdio=none Option
- 3.13 Expanded Lightweight Instrumentation
- 3.14 The api option to MPI_INSTR
- 3.15 New mpirun option -xrc
- 4 Known Issues and Workarounds
- 4.1 Running on iWarp Hardware
- 4.2 Running with Chelsio uDAPL
- 4.3 Mapping Ranks to a CPU
- 4.4 OFED Firmware
- 4.5 Spawn on Remote Nodes
- 4.6 Default Interconnect for -ha Option
- 4.7 Linking Without Compiler Wrappers
- 4.8 Locating the Instrumentation Output File
- 4.9 Using the ScaLAPACK Library
- 4.10 Increasing Shared Memory Segment Size
- 4.11 Using MPI_FLUSH_FCACHE
- 4.12 Using MPI_REMSH
- 4.13 Increasing Pinned Memory
- 4.14 Disabling Fork Safety
- 4.15 Using Fork with OFED
- 4.16 Memory Pinning with OFED 1.2
- 4.17 Upgrading to OFED 1.2
- 4.18 Increasing the nofile Limit
- 4.19 Using appfiles on HP XC Quadrics
- 4.20 Using MPI_Bcast on Quadrics
- 4.21 MPI_Issend Call Limitation on Myrinet MX
- 4.22 Terminating Shells
- 4.23 Disabling Interval Timer Conflicts
- 4.24 libpthread Dependency
- 4.25 Fortran Calls Wrappers
- 4.26 Bindings for C++ and Fortran 90
- 4.27 Using HP Caliper
- 4.28 Using -tv
- 4.29 Extended Collectives with Lightweight Instrumentation
- 4.30 Using -ha with Diagnostic Library
- 4.31 Using MPICH with Diagnostic Library
- 4.32 Using -ha with MPICH
- 4.33 Using MPI-2 with Diagnostic Library
- 4.34 Quadrics Memory Leak
- 5 Installation Information
- 6 Licensing Information
- 7 Additional Product Information
If this happens, add the setting -e MPI_UDAPL_READ=0 to the mpirun command
line. With this setting, HP-MPI uses uDAPL only for regular (non-one-sided) data
transfers, and uses a slower TCP-based communication method for one-sided
operations.
4.3 Mapping Ranks to a CPU
When mapping ranks to CPUs, the ordering of the CPUs relative to the locality domain
(ldom) or socket can vary depending on the architecture and operating system. Because
this ordering is not consistent, a MAP_CPU order that is correct on one system may not
be correct on a different hardware platform or operating system.
Use the appropriate block (block, block_cpu, fill) or cyclic (cyclic,
cyclic_cpu, rr) binding order to correctly bind the ranks to the same ldom/socket
across all architectures and operating systems.
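For instance, a cyclic binding order might be requested as in the following sketch, where
the rank count and executable name are placeholders and ,v enables verbose output:
% mpirun -cpu_bind=cyclic,v -np 8 hello_world.exe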
If MAP_CPU is used, cpu_bind maps ranks to the CPU numbers reported in the
system information, such as the /proc/cpuinfo file. Append the ,v (verbose) option to
verify that the selected CPU ordering has the desired effect.
For example, to run on a two-socket, quad-core machine running RHEL 5 Linux:
% mpirun -cpu_bind=block_cpu,v -np 8 hello_world.exe
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 0 pid 15374
on host mpixbl01 to ldom 0 (0) (0)
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 1 pid 15375
on host mpixbl01 to ldom 0 (0) (2)
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 2 pid 15376
on host mpixbl01 to ldom 0 (0) (4)
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 3 pid 15377
on host mpixbl01 to ldom 0 (0) (6)
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 4 pid 15378
on host mpixbl01 to ldom 1 (1) (1)
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 5 pid 15379
on host mpixbl01 to ldom 1 (1) (3)
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 6 pid 15380
on host mpixbl01 to ldom 1 (1) (5)
MPI_CPU_AFFINITY set to BLOCK, setting affinity of rank 7 pid 15381
on host mpixbl01 to ldom 1 (1) (7)
Hello world! I'm 5 of 8 on mpixbl01
...
The preceding example shows how the ranks are ordered in relation to the socket (the
first number in parentheses) and the CPU ID (the second number in parentheses).
The rank placement can instead be ordered by CPU ID by using the MAP_CPU option,
as follows:
% mpirun -cpu_bind=map_cpu=7,5,3,1,6,4,2,0,v -np 8 hello_world.exe
MPI_CPU_AFFINITY set to MAP_CPU, setting affinity of rank 0 pid 15801
on host mpixbl01 to cpu 7
MPI_CPU_AFFINITY set to MAP_CPU, setting affinity of rank 1 pid 15802