HP-MPI V2.3 for Linux Release Note

2 Known Problems and Workarounds
When running on iWARP hardware, users might see messages similar to the following
when applications exit:
disconnect: ID 0x2b65962b2b10 ret 22
This debugging message is printed erroneously by the uDAPL library and can be
safely ignored. It can be suppressed entirely by passing the
-e DAPL_DBG_TYPE=0 option to mpirun. Alternatively, you can set
DAPL_DBG_TYPE=0 in the $MPI_ROOT/etc/hpmpi.conf file to avoid having
to pass the option on every mpirun command line.
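For example, either of the following suppresses the message. The process count
and program name shown here (-np 4, ./a.out) are placeholders for your own
values:
% $MPI_ROOT/bin/mpirun -e DAPL_DBG_TYPE=0 -np 4 ./a.out
or, as a one-time entry in $MPI_ROOT/etc/hpmpi.conf, following the format of
that file's existing entries:
DAPL_DBG_TYPE = 0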
Users might see the following error during launch of HP-MPI applications on
Chelsio iWARP hardware:
Rank 0:0: MPI_Init: dat_evd_wait()1 unexpected event number 16392
Rank 0:0: MPI_Init: MPI BUG: Processes cannot connect to rdma device
MPI Application rank 0 exited before MPI_Finalize() with status 1
To prevent these errors, Chelsio recommends passing the peer2peer=1 parameter
to the iw_cxgb3 kernel module. This is accomplished by running the following
commands as root on all nodes:
# echo "1" > /sys/module/iw_cxgb3/parameters/peer2peer
# echo "options iw_cxgb3 peer2peer=1" >> /etc/modprobe.conf
The second command is optional and makes the setting persist across a system
reboot.
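To verify that the setting took effect on a node, read the parameter back; it
should report 1:
# cat /sys/module/iw_cxgb3/parameters/peer2peer
1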
Users of iWARP hardware might see errors similar to the following:
dapl async_event QP (0x2b27fdc10d30) ERR 1 dapl_evd_qp_async_error_callback() IB async QP err
- ctx=0x2b27fdc10d30
Previous versions of HP-MPI documented that passing -e MPI_UDAPL_MSG1=1
was necessary on some iWARP hardware. As of HP-MPI V2.3, no iWARP
implementation is known to require this setting, and it should be removed from
all scripts unless otherwise instructed.
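To locate scripts that still carry the obsolete setting, a recursive search
such as the following can help; the directory shown is a placeholder for
wherever your job scripts live:
# grep -rl MPI_UDAPL_MSG1 /path/to/job/scripts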
At the time of this release, Chelsio uDAPL has a limitation: one-sided operations
are implemented only for off-host transfers, not for data transfers between ranks
on the same host. On a Chelsio system, the symptom of this problem resembles the
following:
dapl_cma_connect: rdma_connect ERR -1 Function not implemented
If this happens, add the setting -e MPI_UDAPL_READ=0 to the mpirun command
line. HP-MPI will then use uDAPL only for regular (non-one-sided) data
transfers and will fall back to a slower TCP-based communication method for
one-sided operations.
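A full invocation with this workaround applied might look like the following,
where -np 4 and ./a.out are placeholders for your own process count and
executable:
% $MPI_ROOT/bin/mpirun -e MPI_UDAPL_READ=0 -np 4 ./a.out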