HP-MPI Version 2.3.1 for Linux Release Note

return from the MPI_Sendrecv_replace() call on commB if their partners are also
members of commA and are in the MPI_Comm_dup() call on commA. This
demonstrates the importance of using care when dealing with multiple communicators.
In this example, if the intersection of commA and commB is MPI_COMM_SELF, it is
simpler to write an application that does not deadlock during failure.
The -ha:recover option is available only on HP hardware. Using it on third-party
hardware results in an error message. On third-party systems, a failed
communicator can continue to be used for point-to-point communication, but no
recovery mechanism is available.
3.8.7 Network High Availability (-ha:net)
The net option to -ha enables network high availability, which attempts to insulate
an application from errors in the network. In this release, -ha:net has an effect
only on IBV with OFED 1.2 or later, where Automatic Path Migration is used. This
option has no effect on TCP connections.
The -ha:net option is available only on HP hardware. Using it on third-party
hardware results in an error message.
3.8.8 Failure Detection (-ha:detect)
With the -ha:detect option, a communication failure is detected and is prevented
from interfering with the application's ability to communicate with other processes
that have not been affected by the failure. In addition to specifying -ha:detect,
the communicator's error handler must be set to MPI_ERRORS_RETURN using the
MPI_Comm_set_errhandler function. When an error is detected in a communication,
the error class MPI_ERR_EXITED is returned for the affected communication. In this
mode, shared memory is not used for communication between processes.
Only IBV and TCP are supported. This mode cannot be used with the diagnostic library.
3.8.9 Clarification of the Functionality of Completion Routines in High Availability Mode
Requests that cannot be completed because of network or process failures result in the
creation or completion functions returning with the error code MPI_ERR_EXITED.
When waiting or testing multiple requests using MPI_Testany(), MPI_Testsome(),
MPI_Waitany() or MPI_Waitsome(), a request that cannot be completed because
of network or process failures is considered a completed request and these routines
return with the flag or outcount argument set to non-zero. If some requests completed
successfully and some requests completed because of network or process failure, the
return value of the routine is MPI_ERR_IN_STATUS. The status array elements contain
MPI_ERR_EXITED for those requests that completed because of network or process
failure.