HP-MPI V2.3 for Linux Release Note
2 Known Problems and Workarounds
• When running on iWARP hardware, users might see messages similar to the
following when applications exit:
disconnect: ID 0x2b65962b2b10 ret 22
This debugging message is printed erroneously by the uDAPL library and can
be safely ignored. To suppress it completely, pass the -e DAPL_DBG_TYPE=0
option to mpirun. Alternatively, set DAPL_DBG_TYPE=0 in the
$MPI_ROOT/etc/hpmpi.conf file to avoid passing the option on every mpirun
command line.
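Either form can be sketched as follows; the application name and rank count
below are placeholders, and the hpmpi.conf line assumes the usual
name = value format of that file:

```shell
# One-off suppression on the command line
# ("-np 4 ./a.out" is a placeholder application and rank count):
mpirun -e DAPL_DBG_TYPE=0 -np 4 ./a.out

# Or suppress it for every run by appending to hpmpi.conf
# (assumes $MPI_ROOT is set and the file uses name = value entries):
echo "DAPL_DBG_TYPE = 0" >> $MPI_ROOT/etc/hpmpi.conf
```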
• Users might see the following error during launch of HP-MPI applications on
Chelsio iWARP hardware:
Rank 0:0: MPI_Init: dat_evd_wait() unexpected event number 16392
Rank 0:0: MPI_Init: MPI BUG: Processes cannot connect to rdma device
MPI Application rank 0 exited before MPI_Finalize() with status 1
To prevent these errors, Chelsio recommends passing the peer2peer=1 parameter
to the iw_cxgb3 kernel module. This is accomplished by running the following
commands as root on all nodes:
# echo "1" > /sys/module/iw_cxgb3/parameters/peer2peer
# echo "options iw_cxgb3 peer2peer=1" >> /etc/modprobe.conf
The second command is optional and makes the setting persist across a system
reboot.
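After loading the module, the setting can be verified on each node with a
read of the same sysfs parameter used above (a value of 1 indicates
peer2peer is enabled):

```shell
# Confirm that peer2peer is active for the iw_cxgb3 module;
# prints 1 when the workaround is in effect:
cat /sys/module/iw_cxgb3/parameters/peer2peer
```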
• Users of iWARP hardware might see errors similar to the following:
dapl async_event QP (0x2b27fdc10d30) ERR 1 dapl_evd_qp_async_error_callback() IB async QP err
- ctx=0x2b27fdc10d30
Previous versions of HP-MPI documented that passing -e MPI_UDAPL_MSG1=1
was necessary on some iWARP hardware. As of HP-MPI V2.3, no iWARP
implementation is known to require this setting, and it should be removed
from all scripts unless otherwise instructed.
• At the time of this release, Chelsio uDAPL has a limitation: one-sided
operations are implemented only for off-host transfers, not for transfers
between ranks on the same host. On a Chelsio system, the symptom resembles
the following:
following:
dapl_cma_connect: rdma_connect ERR -1 Function not implemented
If this happens, add the setting -e MPI_UDAPL_READ=0 to the mpirun command
line. With this setting, HP-MPI uses uDAPL only for regular (non-one-sided)
data transfers and falls back to a slower TCP-based communication method
for one-sided operations.
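The workaround above can be sketched as a single mpirun invocation; the
application name and rank count are placeholders:

```shell
# Disable uDAPL for one-sided operations on Chelsio hardware
# ("-np 4 ./a.out" is a placeholder application and rank count):
mpirun -e MPI_UDAPL_READ=0 -np 4 ./a.out
```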