HP-MPI Version 2.3.1 for Linux Release Note
Table Of Contents
- HP-MPI V2.3.1 for Linux Release Note
- Table of Contents
- 1 Information About This Release
- 2 New or Changed Features in V2.3.1
- 3 New or Changed Features in V2.3
- 3.1 Options Supported Only on HP Hardware
- 3.2 System Check
- 3.3 Default Message Size Changed For -ndd
- 3.4 MPICH2 Compatibility
- 3.5 Support for Large Messages
- 3.6 Redundant License Servers
- 3.7 License Release/Regain on Suspend/Resume
- 3.8 Expanded Functionality for -ha
- 3.8.1 Support for High Availability on InfiniBand Verbs
- 3.8.2 Highly Available Infrastructure (-ha:infra)
- 3.8.3 Using MPI_Comm_connect and MPI_Comm_accept
- 3.8.4 Using MPI_Comm_disconnect
- 3.8.5 Instrumentation and High Availability Mode
- 3.8.6 Failure Recovery (-ha:recover)
- 3.8.7 Network High Availability (-ha:net)
- 3.8.8 Failure Detection (-ha:detect)
- 3.8.9 Clarification of the Functionality of Completion Routines in High Availability Mode
- 3.9 Enhanced InfiniBand Support for Dynamic Processes
- 3.10 Singleton Launching
- 3.11 Using the -stdio=files Option
- 3.12 Using the -stdio=none Option
- 3.13 Expanded Lightweight Instrumentation
- 3.14 The api option to MPI_INSTR
- 3.15 New mpirun option -xrc
- 4 Known Issues and Workarounds
- 4.1 Running on iWarp Hardware
- 4.2 Running with Chelsio uDAPL
- 4.3 Mapping Ranks to a CPU
- 4.4 OFED Firmware
- 4.5 Spawn on Remote Nodes
- 4.6 Default Interconnect for -ha Option
- 4.7 Linking Without Compiler Wrappers
- 4.8 Locating the Instrumentation Output File
- 4.9 Using the ScaLAPACK Library
- 4.10 Increasing Shared Memory Segment Size
- 4.11 Using MPI_FLUSH_FCACHE
- 4.12 Using MPI_REMSH
- 4.13 Increasing Pinned Memory
- 4.14 Disabling Fork Safety
- 4.15 Using Fork with OFED
- 4.16 Memory Pinning with OFED 1.2
- 4.17 Upgrading to OFED 1.2
- 4.18 Increasing the nofile Limit
- 4.19 Using appfiles on HP XC Quadrics
- 4.20 Using MPI_Bcast on Quadrics
- 4.21 MPI_Issend Call Limitation on Myrinet MX
- 4.22 Terminating Shells
- 4.23 Disabling Interval Timer Conflicts
- 4.24 libpthread Dependency
- 4.25 Fortran Calls Wrappers
- 4.26 Bindings for C++ and Fortran 90
- 4.27 Using HP Caliper
- 4.28 Using -tv
- 4.29 Extended Collectives with Lightweight Instrumentation
- 4.30 Using -ha with Diagnostic Library
- 4.31 Using MPICH with Diagnostic Library
- 4.32 Using -ha with MPICH
- 4.33 Using MPI-2 with Diagnostic Library
- 4.34 Quadrics Memory Leak
- 5 Installation Information
- 6 Licensing Information
- 7 Additional Product Information
return from the MPI_Sendrecv_replace() call on commB if their partners are also
members of commA and are in the MPI_Comm_dup() call on commA. This
demonstrates the importance of exercising care when working with multiple
communicators. In this example, if the intersection of commA and commB is
MPI_COMM_SELF, it is simpler to write an application that does not deadlock
during failure.
The -ha:recover option is available only on HP hardware; usage on third-party
hardware results in an error message. On third-party systems, a failed
communicator can continue to be used for point-to-point communication, but no
recovery mechanism is available.
3.8.7 Network High Availability (-ha:net)
The net option to -ha enables network high availability, which attempts to
insulate an application from errors in the network. In this release, -ha:net
is significant only on IBV with OFED 1.2 or later, where Automatic Path
Migration is used. This option has no effect on TCP connections.
The -ha:net option is available only on HP hardware; usage on third-party
hardware results in an error message.
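As a quick usage sketch, the option is passed to mpirun at launch; the process
count and application name below are placeholders:

```shell
# Enable network high availability (Automatic Path Migration on IBV,
# OFED 1.2 or later); ./my_app stands in for the real application.
mpirun -ha:net -np 8 ./my_app
```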
3.8.8 Failure Detection (-ha:detect)
When using the -ha:detect option, a communication failure is detected and
prevented from interfering with the application's ability to communicate with
other processes that have not been affected by the failure. In addition to
specifying -ha:detect, the error handler must be set to MPI_ERRORS_RETURN
using the MPI_Comm_set_errhandler function. When an error is detected in a
communication, the error class MPI_ERR_EXITED is returned for the affected
communication. Shared memory is not used for communication between processes
in this mode.
Only IBV and TCP are supported. This mode cannot be used with the diagnostic library.
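The following sketch shows the error-handler setup described above. It assumes
an HP-MPI installation launched with -ha:detect; the ranks, tag, and message
contents are illustrative, and MPI_ERR_EXITED is the HP-MPI error class named
in this release note:

```c
/* Sketch: surfacing communication failures with -ha:detect.
 * Launch with, e.g.:  mpirun -ha:detect -np 2 ./a.out  */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, err;
    char buf[64] = "hello";
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* -ha:detect requires MPI_ERRORS_RETURN so that failures are
     * reported as return codes instead of aborting the job. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    if (rank == 0) {
        err = MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0,
                       MPI_COMM_WORLD, &status);
        if (err != MPI_SUCCESS) {
            int eclass;
            MPI_Error_class(err, &eclass);
            if (eclass == MPI_ERR_EXITED)   /* peer or path failed */
                fprintf(stderr, "rank 0: partner failed\n");
        }
    } else if (rank == 1) {
        MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```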
3.8.9 Clarification of the Functionality of Completion Routines in High Availability Mode
Requests that cannot be completed because of network or process failures result
in the creation or completion functions returning the error class
MPI_ERR_EXITED.
When waiting or testing multiple requests using MPI_Testany(), MPI_Testsome(),
MPI_Waitany() or MPI_Waitsome(), a request that cannot be completed because
of network or process failures is considered a completed request and these routines
return with the flag or outcount argument set to non-zero. If some requests completed
successfully and some requests completed because of network or process failure, the
return value of the routine is MPI_ERR_IN_STATUS. The status array elements contain
MPI_ERR_EXITED for those requests that completed because of network or process
failure.
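The multiple-completion semantics above can be sketched as follows. The
requests array is assumed to have been filled by earlier nonblocking calls in
a job launched with -ha; NREQ and the error handling shown are illustrative:

```c
/* Sketch: distinguishing failed from successful completions when
 * waiting on multiple requests in high availability mode. */
#include <mpi.h>
#include <stdio.h>

#define NREQ 4

void drain_requests(MPI_Request requests[NREQ])
{
    int indices[NREQ];
    MPI_Status statuses[NREQ];
    int outcount, err, i;

    err = MPI_Waitsome(NREQ, requests, &outcount, indices, statuses);

    if (err == MPI_ERR_IN_STATUS) {
        /* Mixed outcome: some requests succeeded, others "completed"
         * only because of a network or process failure. */
        for (i = 0; i < outcount; i++) {
            if (statuses[i].MPI_ERROR == MPI_ERR_EXITED)
                fprintf(stderr, "request %d failed: peer exited "
                        "or network path lost\n", indices[i]);
        }
    }
}
```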