When using -ha:detect, the error handler must be set to MPI_ERRORS_RETURN using
the MPI_Comm_set_errhandler function. When an error is detected in a communication,
the error class MPI_ERR_EXITED is returned for the affected communication. Shared
memory is not used for communication between processes.
Only IBV and TCP are supported. This mode cannot be used with the diagnostic library.
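For illustration, the following is a minimal sketch of the required setup,
assuming two ranks and using the HP-MPI-specific MPI_ERR_EXITED error class
described above:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, buf = 42, err = MPI_SUCCESS;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Required in -ha:detect mode: report errors instead of aborting. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        if (rank == 0)
            err = MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            err = MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);

        if (err != MPI_SUCCESS) {
            int eclass;
            MPI_Error_class(err, &eclass);
            if (eclass == MPI_ERR_EXITED)  /* peer failed or unreachable */
                fprintf(stderr, "rank %d: communication peer exited\n", rank);
        }

        MPI_Finalize();
        return 0;
    }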
1.2.7.7.9 Clarification of the Functionality of Completion Routines in High-Availability Mode
Requests that cannot be completed because of network or process failures cause
the creation or completion functions to return the error code MPI_ERR_EXITED.
When waiting on or testing multiple requests using MPI_Testany(), MPI_Testsome(),
MPI_Waitany(), or MPI_Waitsome(), a request that cannot be completed because
of network or process failures is considered a completed request and causes these
routines to return with the flag or outcount argument set to non-zero. If some requests
completed successfully and others completed because of network or process
failure, the routine returns MPI_ERR_IN_STATUS, and the status array
elements contain MPI_ERR_EXITED for those requests that completed because of
network or process failure.
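As a sketch of how an application might drain such a request array under these
semantics, assuming n previously posted nonblocking operations in reqs (the
helper name drain_requests is illustrative, not part of the product):

    #include <mpi.h>
    #include <stdio.h>

    /* Wait for all n requests, treating failed ones as completed,
     * per the semantics above. Uses C99 variable-length arrays. */
    void drain_requests(int n, MPI_Request reqs[])
    {
        MPI_Status stats[n];
        int indices[n];
        int outcount;

        for (;;) {
            int err = MPI_Waitsome(n, reqs, &outcount, indices, stats);
            if (outcount == MPI_UNDEFINED)
                break;                   /* no active requests remain */
            for (int i = 0; i < outcount; i++)
                if (err == MPI_ERR_IN_STATUS &&
                    stats[i].MPI_ERROR == MPI_ERR_EXITED)
                    fprintf(stderr, "request %d completed by failure\n",
                            indices[i]);
            /* completed entries in reqs are set to MPI_REQUEST_NULL */
        }
    }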
IMPORTANT: When waiting on a receive request that uses MPI_ANY_SOURCE on
an intracommunicator, the request is never considered complete, because the rank
that created the receive request can itself still legally match it. For
intercommunicators, after all processes in the remote group become unavailable,
the request is considered complete and the MPI_ERROR field of the MPI_Status
structure indicates MPI_ERR_EXITED.
MPI_Waitall() waits until all requests are complete, even if an error occurs with
some of them. If some requests fail, MPI_ERR_IN_STATUS is returned. Otherwise,
MPI_SUCCESS is returned. In the case of an error, the error code is returned in the
corresponding element of the status array.
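A short sketch of the corresponding error check, assuming n requests and their
statuses in reqs and stats (the helper name waitall_checked is illustrative):

    #include <mpi.h>
    #include <stdio.h>

    /* Check per-request outcomes after MPI_Waitall,
     * per the semantics above. */
    void waitall_checked(int n, MPI_Request reqs[], MPI_Status stats[])
    {
        int err = MPI_Waitall(n, reqs, stats);

        if (err == MPI_ERR_IN_STATUS) {  /* at least one request failed */
            for (int i = 0; i < n; i++)
                if (stats[i].MPI_ERROR == MPI_ERR_EXITED)
                    fprintf(stderr, "request %d failed: peer exited\n", i);
        }
    }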
1.2.7.8 Enhanced InfiniBand Support for Dynamic Processes
This release supports the use of InfiniBand between processes in different MPI worlds.
Processes that are not part of the same MPI world, but are introduced through calls to
MPI_Comm_connect(), MPI_Comm_accept(), MPI_Comm_spawn(), or
MPI_Comm_spawn_multiple(), attempt to use InfiniBand for communication. Both
sides must have InfiniBand support enabled and use the same InfiniBand parameter
settings; otherwise, TCP is used for the connection. Only the OFED IBV protocol is
supported for these connections. When a connection is established through one of
these MPI calls, a TCP connection is first established between the root processes of
the two sides. TCP connections are then set up among all the processes. Finally, IBV
InfiniBand connections are established among all process pairs and the TCP
connections are closed.
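The transport selection is transparent to the application; the standard MPI-2
connect/accept pattern applies unchanged. A minimal sketch, assuming the port
name is passed to the client on the command line (any out-of-band channel works):

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        MPI_Comm intercomm;
        char port[MPI_MAX_PORT_NAME];

        MPI_Init(&argc, &argv);

        if (argc > 1 && strcmp(argv[1], "server") == 0) {
            MPI_Open_port(MPI_INFO_NULL, port);
            printf("port: %s\n", port);  /* give this string to the client */
            MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                            &intercomm);
            MPI_Close_port(port);
        } else {
            strncpy(port, argv[2], MPI_MAX_PORT_NAME - 1);
            port[MPI_MAX_PORT_NAME - 1] = '\0';
            MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                             &intercomm);
        }

        /* Traffic over intercomm uses IBV when both sides have matching
         * InfiniBand settings, and falls back to TCP otherwise. */
        MPI_Comm_disconnect(&intercomm);
        MPI_Finalize();
        return 0;
    }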