HP-MPI Version 2.2.7 for Linux Release Note
• uDAPL may experience issues on multi-card systems. uDAPL functions on the second
card if that card is on a separate subnet, but not if it is on the same subnet as the
first card. For multi-card systems, set:
/sbin/sysctl -w net.ipv4.conf.all.arp_ignore=2
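To make this setting persist across reboots, an equivalent entry can also be added to the
standard /etc/sysctl.conf file, for example:
net.ipv4.conf.all.arp_ignore = 2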
• The nofile limit on large Linux clusters needs to be increased in
/etc/security/limits.conf:
* soft nofile 1024
For larger clusters, HP recommends a setting of at least the following (see the example
entry after this list):
— 2048 for clusters of 1900 cores or fewer
— 4096 for clusters of 3800 cores or fewer
— 8192 for clusters of 7600 cores or fewer
— And so on, doubling the limit as the core count doubles
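For example, to follow the recommendation for a cluster of up to 3800 cores, the
limits.conf entry would be raised to:
* soft nofile 4096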
• To use appfiles on HP XC Quadrics clusters, set MPI_USESRUN=1. Entries in the appfile
can differ only in host name and rank count, as in the example below.
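A hypothetical appfile meeting this restriction might look like the following; the host
names, rank counts, and executable name are illustrative:
-h node1 -np 8 ./a.out
-h node2 -np 8 ./a.out
-h node3 -np 4 ./a.out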
• On Quadrics interconnected clusters, the repeated use of MPI_Bcast within a tight loop
can cause an application to fail with the following Elan trap message:
ELAN TRAP - 0 0 CPROC - Bad Trap
Status=lbb40005 CommandProcSendTransExpected Command=200000201
Setting the environment variable LIBELAN_GROUP_SANDF=0 disables the latest “Store and
Forward” broadcast optimization from Quadrics while preserving all the other optimized
collectives.
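For example, the variable can be set in the shell before launching the job:
% export LIBELAN_GROUP_SANDF=0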
• The SilverStorm™ uDAPL driver has an issue that accumulates over time. If the system has
been running for more than 24 hours and a sufficiently large number of applications have
been run, new applications might have problems establishing uDAPL connections. Whether
the error occurs depends on how heavily the system has been used. If this error occurs,
reboot the system.
• Some older versions of Myrinet MX have a known resource limitation involving outstanding
MPI_Issend() calls. If more than 128 MPI_Issend() calls have been issued and not yet
matched, further MX communication can hang. The only known workaround is to have the
application keep fewer than 128 unmatched MPI_Issend() calls outstanding at a time. This
limitation is fixed in MX versions 1.1.8 and later.
• When a foreground HP-MPI job is run from a shell window and the shell is terminated,
the shell sends SIGHUP to the mpirun process and its underlying ssh processes, killing
the entire job.
When a background HP-MPI job is run and the shell is terminated, whether the job
continues depends on the shell used: under /bin/bash the job is killed, while under
/bin/sh and /bin/ksh the job continues. If nohup is used when launching the job, only
background ksh jobs continue. This behavior might vary depending on your system.
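For example, a job can be launched in the background under nohup from ksh as follows;
the rank count and executable name are illustrative:
% nohup mpirun -np 16 ./a.out &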
• Interval timer functionality used by HP-MPI on HP XC can conflict with the requirements
of the gprof data collection phase. Set the following two environment variables to work
around this issue:
% export MPI_FLAGS=s0
% export GMON_OUT_PREFIX=/tmp/app_name
In the above example, setting MPI_FLAGS disables HP-MPI's conflicting use of interval
timers. Refer to the mpienv(1) manpage for descriptions of MPI_FLAGS settings. Because this
setting also disables message progression monitoring, use it with well-behaved programs
only.
The second setting causes gprof data collection files to be named /tmp/app_name.PID
(where PID is the process ID number). The prefix can be chosen arbitrarily; it keeps the
files distinct in cases where the same PID occurs on different nodes.
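After the run completes, each per-process profile can be examined with gprof; the
executable name and PID in this example are illustrative:
% gprof ./app_name /tmp/app_name.12345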