HP-MPI Version 2.2.5.1 for Linux Release Note
Known Problems and Workarounds
• InfiniBand requires memory to be pinned (locked in memory) for message passing. This
can become a problem when a child process is forked and a pinned page exists in both the
parent's and child's address spaces. Normally a copy-on-write would occur when one of the
processes touches memory on a shared page, and the virtual-to-physical mapping would
change for that process. In the context of InfiniBand, such a change in the mapping
results in data corruption when an RDMA operation transfers data to the original
physical address.
Both OFED 1.2 (with its fork safety mode enabled) and Mellanox avoid this problem by not
using copy-on-write behavior during a fork for pinned pages. Instead, any access to these
pages by the child process results in a segmentation violation in the child, while the
parent's mapping remains unchanged, so the parent continues running normally with no
data corruption.
More specifically, Mellanox always behaves as described above, whereas under OFED 1.2
this behavior is optional. HP-MPI turns the option on by default when the IBV protocol is
used, but does not turn it on automatically for other protocols such as uDAPL. To request
a specific behavior, a user can explicitly enable fork safety under OFED 1.2 by setting
the environment variable IBV_FORK_SAFE=1 or, if fork safety is not desired, turn it off
by setting MPI_IBV_NO_FORK_SAFE=1.
If an application is experiencing problems with the child touching pinned pages, HP-MPI
provides a feature that may prevent this scenario. When the environment variable
MPI_PAGE_ALIGN_MEM is set, HP-MPI page-aligns and page-pads libc memory
allocation requests that are large enough to be pinned during MPI message transfer.
This can result in slightly more memory being allocated, but reduces the likelihood that a
forked process will write to a page of memory that was in use for message transfer
when the fork call occurred.
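The idea of page-aligning and page-padding an allocation can be sketched in C as follows.
This is only an illustration of the technique, not HP-MPI's implementation: the function
names round_up_to_page and page_aligned_alloc are hypothetical, the sketch pads every
request instead of applying a size threshold (the note does not state the threshold), and
it assumes a POSIX system that provides posix_memalign.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Round len up to a whole number of pages; page must be a power of two. */
size_t round_up_to_page(size_t len, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (len + page - 1) & ~(page - 1);
}

/*
 * Hypothetical helper: allocate at least len bytes so that the block starts
 * on a page boundary and covers whole pages. Because the block fills its
 * pages completely, no unrelated allocation can share a page with it.
 * Returns NULL on failure; release the block with free().
 */
void *page_aligned_alloc(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return NULL;

    size_t padded = round_up_to_page(len, (size_t)page);
    if (padded < len)               /* size_t overflow while rounding up */
        return NULL;

    void *p = NULL;
    if (posix_memalign(&p, (size_t)page, padded) != 0)
        return NULL;
    return p;
}
```

With a 4096-byte page, a 100000-byte request is padded to 102400 bytes (25 pages). The
extra 2400 bytes are the "slightly more memory" cost, and in exchange small unrelated
allocations that a forked child might write to cannot land on the same pages as the
message buffer.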
• Applications running on Linux systems with kernels older than 2.6.12 may encounter the
following warning message:
libibverbs: Warning: fork()-safety requested but init failed
This warning appears because the HP-MPI library enables the OFED 1.2 fork safety
feature, which is not supported by Linux kernels older than 2.6.12. It does not affect the
application run. To disable HP-MPI fork safety, set the environment variable
MPI_IBV_NO_FORK_SAFE, as in the following example:
% /opt/hpmpi/bin/mpirun -np 4 -prot -e MPI_IBV_NO_FORK_SAFE=1 \
-hostlist nodea,nodeb,nodec,noded /my/dir/hello_world
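The fork-safety behavior described in this section (the child faults when it touches a
protected page while the parent continues unaffected) can be observed without any
InfiniBand hardware. The sketch below assumes that Linux's madvise(MADV_DONTFORK) is
a fair stand-in for what a fork-safe stack does to pinned pages. This note does not state
which mechanism OFED or Mellanox actually use, and the function name
child_faults_on_page is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one private page, optionally exclude it from fork() inheritance,
 * then fork a child that reads the page.
 * Returns 1 if the child was killed by SIGSEGV, 0 if it read the page
 * normally, and -1 on any setup error.
 */
int child_faults_on_page(int exclude_from_child)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    char *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memset(buf, 'x', (size_t)page);

    /* Assumption: stands in for the protection applied to pinned pages. */
    if (exclude_from_child &&
        madvise(buf, (size_t)page, MADV_DONTFORK) != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        /* Child: touch the page; this faults if the page was not inherited. */
        volatile char c = *(volatile char *)buf;
        _exit(c == 'x' ? 0 : 2);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);

    /* Parent: its own mapping must be intact in both cases. */
    assert(buf[0] == 'x');
    munmap(buf, (size_t)page);

    if (waited != pid)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

Calling child_faults_on_page(0) exercises the ordinary copy-on-write path, where the
child reads the page normally. Calling child_faults_on_page(1) exercises the protected
path, where the child is terminated by a segmentation violation and the parent's data is
unchanged.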