HP-MPI Version 2.3.1 for Linux Release Note
Table of Contents
- HP-MPI V2.3.1 for Linux Release Note
- Table of Contents
- 1 Information About This Release
- 2 New or Changed Features in V2.3.1
- 3 New or Changed Features in V2.3
- 3.1 Options Supported Only on HP Hardware
- 3.2 System Check
- 3.3 Default Message Size Changed For -ndd
- 3.4 MPICH2 Compatibility
- 3.5 Support for Large Messages
- 3.6 Redundant License Servers
- 3.7 License Release/Regain on Suspend/Resume
- 3.8 Expanded Functionality for -ha
- 3.8.1 Support for High Availability on InfiniBand Verbs
- 3.8.2 Highly Available Infrastructure (-ha:infra)
- 3.8.3 Using MPI_Comm_connect and MPI_Comm_accept
- 3.8.4 Using MPI_Comm_disconnect
- 3.8.5 Instrumentation and High Availability Mode
- 3.8.6 Failure Recovery (-ha:recover)
- 3.8.7 Network High Availability (-ha:net)
- 3.8.8 Failure Detection (-ha:detect)
- 3.8.9 Clarification of the Functionality of Completion Routines in High Availability Mode
- 3.9 Enhanced InfiniBand Support for Dynamic Processes
- 3.10 Singleton Launching
- 3.11 Using the -stdio=files Option
- 3.12 Using the -stdio=none Option
- 3.13 Expanded Lightweight Instrumentation
- 3.14 The api option to MPI_INSTR
- 3.15 New mpirun option -xrc
- 4 Known Issues and Workarounds
- 4.1 Running on iWARP Hardware
- 4.2 Running with Chelsio uDAPL
- 4.3 Mapping Ranks to a CPU
- 4.4 OFED Firmware
- 4.5 Spawn on Remote Nodes
- 4.6 Default Interconnect for -ha Option
- 4.7 Linking Without Compiler Wrappers
- 4.8 Locating the Instrumentation Output File
- 4.9 Using the ScaLAPACK Library
- 4.10 Increasing Shared Memory Segment Size
- 4.11 Using MPI_FLUSH_FCACHE
- 4.12 Using MPI_REMSH
- 4.13 Increasing Pinned Memory
- 4.14 Disabling Fork Safety
- 4.15 Using Fork with OFED
- 4.16 Memory Pinning with OFED 1.2
- 4.17 Upgrading to OFED 1.2
- 4.18 Increasing the nofile Limit
- 4.19 Using appfiles on HP XC Quadrics
- 4.20 Using MPI_Bcast on Quadrics
- 4.21 MPI_Issend Call Limitation on Myrinet MX
- 4.22 Terminating Shells
- 4.23 Disabling Interval Timer Conflicts
- 4.24 libpthread Dependency
- 4.25 Fortran Calls Wrappers
- 4.26 Bindings for C++ and Fortran 90
- 4.27 Using HP Caliper
- 4.28 Using -tv
- 4.29 Extended Collectives with Lightweight Instrumentation
- 4.30 Using -ha with Diagnostic Library
- 4.31 Using MPICH with Diagnostic Library
- 4.32 Using -ha with MPICH
- 4.33 Using MPI-2 with Diagnostic Library
- 4.34 Quadrics Memory Leak
- 5 Installation Information
- 6 Licensing Information
- 7 Additional Product Information

3.8.4 Using MPI_Comm_disconnect
In high availability mode, MPI_Comm_disconnect is collective only across the local
group of the calling process. This enables a process group to independently break its
connection to the remote group of an intercommunicator without synchronizing with
the remote processes. Unreceived messages on the remote side are buffered and can
still be received until the remote side also calls MPI_Comm_disconnect.
Receive calls that cannot be satisfied by a buffered message fail on the remote processes
after the local processes have called MPI_Comm_disconnect. Send calls on either side
of the intercommunicator fail after either side has called MPI_Comm_disconnect.
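The following is a minimal sketch of the client side of this behavior, written against
the standard MPI-2 dynamic process API. It assumes a server that has already called
MPI_Open_port and MPI_Comm_accept, with the port name passed to the client as its
first command-line argument; the tag and payload values are illustrative only.

    #include <mpi.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        char port[MPI_MAX_PORT_NAME];
        MPI_Comm server;
        int data = 42;

        MPI_Init(&argc, &argv);
        if (argc < 2) {            /* expects the server's port name */
            MPI_Finalize();
            return 1;
        }
        strncpy(port, argv[1], MPI_MAX_PORT_NAME - 1);
        port[MPI_MAX_PORT_NAME - 1] = '\0';

        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server);
        MPI_Send(&data, 1, MPI_INT, 0, 0, server);

        /* In -ha mode this call is collective over the local group only,
         * so the client group drops the connection without synchronizing
         * with the server group.  The message sent above remains buffered
         * and can still be received on the server side until the server
         * also calls MPI_Comm_disconnect. */
        MPI_Comm_disconnect(&server);

        MPI_Finalize();
        return 0;
    }

Because the disconnect is collective only over the client group, the server group can
continue to drain buffered messages before issuing its own MPI_Comm_disconnect.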
3.8.5 Instrumentation and High Availability Mode
HP-MPI lightweight instrumentation is now supported when using -ha and singletons.
If some ranks exit before or during MPI_Finalize(), the lowest remaining rank in
MPI_COMM_WORLD produces the instrumentation output file on behalf of the
application; instrumentation data for the exited ranks is not included. For other
enhancements to instrumentation in this release, see “Expanded Lightweight
Instrumentation” (Section 3.13).
Using -ha together with -i is supported only on HP hardware; on third-party hardware
it results in an error message.
3.8.6 Failure Recovery (-ha:recover)
Fault-Tolerant MPI_Comm_dup() That Excludes Failed Ranks
When -ha:recover is used, MPI_Comm_dup() enables an application to recover from
communication errors.
IMPORTANT: In this mode, MPI_Comm_dup() is not standard compliant, because a
call to MPI_Comm_dup() always terminates all outstanding communications on the
communicator with failures, regardless of the presence or absence of errors.
When one or more pairs of ranks within a communicator are unable to communicate
because a rank has exited or the communication layers have returned errors, a call to
MPI_Comm_dup attempts to return the largest communicator containing ranks that
were fully interconnected at some point during the MPI_Comm_dup call. Because new
errors can occur at any time, the returned communicator might not be completely error
free. However, two ranks in the original communicator that were unable to
communicate with each other before the call are never both included in a
communicator generated by MPI_Comm_dup.
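As an illustration of the recovery pattern this enables, the following sketch assumes
an application launched with -ha:recover that selects MPI_ERRORS_RETURN so
communication failures surface as error codes; MPI_Barrier stands in for any
communication that might fail, and for simplicity the failure is assumed to be
observed by all surviving ranks.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Comm comm, recovered;
        int rank, rc;

        MPI_Init(&argc, &argv);
        /* Report failures as error codes instead of aborting. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
        MPI_Comm_dup(MPI_COMM_WORLD, &comm);  /* inherits the error handler */
        MPI_Comm_rank(comm, &rank);

        rc = MPI_Barrier(comm);               /* any communication call */
        if (rc != MPI_SUCCESS) {
            /* Under -ha:recover, MPI_Comm_dup terminates outstanding
             * traffic on comm and attempts to return the largest
             * communicator whose ranks can still intercommunicate.
             * All surviving ranks must make this call. */
            MPI_Comm_dup(comm, &recovered);
            MPI_Comm_free(&comm);
            comm = recovered;
            MPI_Comm_rank(comm, &rank);
            printf("rank %d continuing in recovered communicator\n", rank);
        }

        MPI_Comm_free(&comm);
        MPI_Finalize();
        return 0;
    }

The duplicate-then-free pattern lets the surviving ranks resume collective operations
on the recovered communicator instead of aborting the whole application.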
Communication failures can partition ranks into two groups, A and B, such that no
rank in group A can communicate with any rank in group B, and vice versa. In that
case, a call to MPI_Comm_dup() can behave similarly to a call to MPI_Comm_split(),
returning different legal communicators to different callers. When a larger
communicator exists