HP-MPI Version 2.2.5.1 for Linux Release Note
Known Problems and Workarounds
• The initial release of OFED 1.2 contains a bug that causes the memory pinning function
to fail after certain patterns of malloc and free. From HP-MPI, the symptom can be any
of several error messages, such as:
> prog.x: Rank 0:1: MPI_Get: Unable to pin memory for put/get
This bug has already been fixed in OFED, but if you are running with the initial release of
OFED 1.2, the only workaround is to set MPI_IBV_NO_FORK_SAFE=1.
• When running a 32-bit executable with InfiniBand or Myrinet on a machine with more
than 2GB of memory, HP-MPI incorrectly calculates the amount of memory which can be
pinned (locked in memory) for use in RDMA message passing. This can lead to errors of
the form:
x: Rank 0:0: MPI_Get: Unable to pin memory for put/get
A workaround is to use an extra mpirun command line option:
% mpirun -e MPI_PHYSICAL_MEMORY=4096
This option tells HP-MPI how much memory the system has instead of allowing
HP-MPI to calculate it. The value is in MB, so the example above tells HP-MPI that the
system has 4GB of memory.
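As a rough sketch, the value could be derived from the memory total reported by the kernel
rather than typed by hand. The helper name physical_memory_mb and the use of /proc/meminfo
are assumptions for illustration and are not part of HP-MPI. MemTotal is usually slightly
lower than the installed RAM, so the result may need rounding up.

```shell
#!/bin/sh
# Hypothetical helper (not part of HP-MPI): print total memory in MB,
# read from a meminfo-style file (defaults to /proc/meminfo on Linux).
# MemTotal is reported in kB, so divide by 1024 and truncate to an integer.
physical_memory_mb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Illustrative usage (./prog.x stands in for your own executable):
#   mpirun -e MPI_PHYSICAL_MEMORY="$(physical_memory_mb)" ./prog.x
```

Passing a file name as the first argument makes the helper easy to check against a saved
copy of /proc/meminfo from another node.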
• When using the command line option -e VAR=VAL with mpirun to set environment
variables for all the ranks in the HP-MPI job, it is inconvenient to set up values that
contain spaces.
In HP-MPI 2.2.5.1, the parser stops when it sees a space in VAL unless the value is
enclosed in single or double quotes.
A value containing a space might be required when, for instance, setting MPI_REMSH to
ssh -x. In HP-MPI 2.2.5.1, use:
% mpirun -e 'MPI_REMSH="ssh -x"' ...
The command above ensures that the interior double quotes are seen by HP-MPI's parser,
thus allowing it to assign the correct value to MPI_REMSH.
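To see why the outer single quotes matter, the following sketch prints the arguments as a
POSIX shell would hand them to mpirun. The show_args helper is hypothetical and exists only
for illustration. The sketch assumes a POSIX sh, so other shells may differ in detail.

```shell
#!/bin/sh
# Hypothetical helper: print each argument on its own line, in brackets,
# to show what a program such as mpirun would receive from the shell.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Outer single quotes keep the inner double quotes and the space intact.
# Prints [-e] and then [MPI_REMSH="ssh -x"]
show_args -e 'MPI_REMSH="ssh -x"'

# Without the outer single quotes, the shell removes the double quotes.
# Prints [-e] and then [MPI_REMSH=ssh -x]
show_args -e MPI_REMSH="ssh -x"
```

In the second call the value reaches the program with a bare space and no quotes, which is
the case where the parser described above stops at the space.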
• When upgrading to OFED V1.2 from older versions, the installation script may not stop
the previous OFED version before uninstalling it. Stop the old OFED stack manually
before upgrading to OFED V1.2.
• uDAPL may experience issues on multi-card systems. uDAPL functions on the second
card if that card is on a separate subnet, but the second card does not work if it is on
the same subnet as the first card. For multi-card systems, set:
/sbin/sysctl -w net.ipv4.conf.all.arp_ignore=2
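A minimal sketch for confirming that the setting took effect on a node follows. The helper
name arp_ignore_is_set is hypothetical. The sketch assumes the standard Linux procfs path
that backs net.ipv4.conf.all.arp_ignore.

```shell
#!/bin/sh
# Hypothetical check (not part of HP-MPI): succeed only if the arp_ignore
# value read from the given file equals 2. Defaults to the Linux procfs
# file behind the net.ipv4.conf.all.arp_ignore sysctl key.
arp_ignore_is_set() {
    value=$(cat "${1:-/proc/sys/net/ipv4/conf/all/arp_ignore}" 2>/dev/null)
    [ "$value" = "2" ]
}

if arp_ignore_is_set; then
    echo "arp_ignore is 2"
else
    echo "arp_ignore is not 2; run the sysctl command above as root"
fi
```

A value set with sysctl -w does not survive a reboot, so a check like this can be run on
each node before starting a job.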