HP XC System Software Release Notes for Version 2.1
12.3.4 ELAN TRAP Queue Error Seen on Some Quadrics MPI Applications
Some QsNetII MPI applications that generate many concurrent DMA operations
might encounter the following error:
ELAN TRAP -0- Unknown - Queue Error
This error, which is believed to be caused by high rates of ELAN PutGet
operations, terminates the program.
It is possible to work around this problem by setting the
LIBELAN_PUTGET_THROTTLE environment variable to a value lower than its default
value of 32.
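As a minimal sketch of this workaround, the variable can be exported in the shell from which the application is launched. The value 16 is only an illustrative assumption (any value below the default of 32 fits the description above), and whether the setting reaches every MPI process depends on how your launcher propagates the environment.

```shell
# Illustrative value only: the release note says "lower than 32",
# it does not recommend a specific number.
export LIBELAN_PUTGET_THROTTLE=16

# Confirm the value that launched processes would inherit.
echo "LIBELAN_PUTGET_THROTTLE=$LIBELAN_PUTGET_THROTTLE"
```

If the error persists, trying a still lower value is a reasonable next step.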
12.3.5 The qsnet Database May Contain Entries for Nonexistent Switch Modules
Depending on your system topology, the qsnet diagnostics database may contain
entries for nonexistent switch modules.
This issue manifests as errors from the /usr/bin/qsctrl utility similar to
the following:
# qsctrl
qsctrl: failed to initialise module QR0N03: no such module (-7)
...
In the previous example, the switch_modules table in the qsnet database is
populated with QR0N03 even though the QR0N03 module is not physically present.
This problem has been reported to Quadrics, Ltd.
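When several modules are affected, it may help to collect the names from the qsctrl error lines before editing the database. The following is a sketch only: failed_modules is a hypothetical helper, and it assumes the messages match the sample format shown earlier exactly.

```shell
# Hypothetical helper: read qsctrl output on stdin and print the module name
# from each "failed to initialise module NAME: ..." line.
failed_modules() {
    sed -n 's/^qsctrl: failed to initialise module \([^:]*\):.*$/\1/p'
}

# Demonstration on the sample message from this section.
printf '%s\n' 'qsctrl: failed to initialise module QR0N03: no such module (-7)' \
    | failed_modules

# On a live system the helper might be used as follows (assumes the errors
# may be written to stderr, hence the redirection):
#   qsctrl 2>&1 | failed_modules
```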
To work around this problem, delete the QR0N03 entry (and any other nonexistent
switch entries) from the switch_modules table, and restart the swmlogger
service:
# mysql -u root -p qsnet
mysql> delete from switch_modules where name="QR0N03";
mysql> quit
# service swm restart
In addition to the previous problem, the IP address of a switch module may be
incorrectly stored as a module name in the switch_modules table, and you might
see the following message:
# qsctrl
qsctrl: failed to parse module name 172.20.66.2
...
Resolve this issue by deleting the entry named with the IP address from the
switch_modules table and restarting the swmlogger service:
# mysql -u root -p qsnet
mysql> delete from switch_modules where name="172.20.66.2";
mysql> quit
# service swm restart
_________________________ Note _________________________
You must repeat the previous procedure if you rerun the
cluster_config utility because the qsnet database is recreated
during a cluster_config operation.
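Because the cleanup has to be repeated after each cluster_config run, generating the DELETE statements from a list of names may be convenient. The following is a sketch under stated assumptions: build_delete_sql is a hypothetical helper, not part of HP XC or the Quadrics tools, and the names are the examples used in this section.

```shell
# Hypothetical helper: print one DELETE statement per stale entry name.
# Names are inserted verbatim, so pass only names you have verified.
build_delete_sql() {
    for name in "$@"; do
        printf 'DELETE FROM switch_modules WHERE name="%s";\n' "$name"
    done
}

# Preview the statements for the two example entries.
build_delete_sql QR0N03 172.20.66.2

# To apply them, pipe the output into the client and restart the service,
# as in the manual procedure:
#   build_delete_sql QR0N03 172.20.66.2 | mysql -u root -p qsnet
#   service swm restart
```

Previewing the statements before piping them into mysql keeps the destructive step explicit.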