Troubleshooting guide
• Is there a monitoring line card installed in each Myrinet-2000 switch? If yes,
do you see a high number of bad CRCs reported in the switch counters?
If you're using a Myrinet-2000 M3-E* switch, this information can be
obtained with the following command:
lynx -dump <switch_ip_address>/all | grep badCrcs
If you're using a Myrinet-2000 M3-CLOS-ENCL or M3-SPINE-ENCL
switch, this information can be obtained with the following command:
lynx -dump <switch_ip_address>/cgi/web.cgi\?all | grep badCrcs
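On a large switch the grep above can return many lines, most of them with a zero count. The following is a minimal sketch of a filter that keeps only the non-zero counters. The function name nonzero_bad_crcs is hypothetical, and the sketch assumes that each reported line contains the text badCrcs followed (after any separator) by an integer count; check a sample of your switch's output before relying on it.

```shell
#!/bin/sh
# nonzero_bad_crcs: hypothetical helper, not part of any Myricom tool.
# Reads a switch counter dump on stdin and prints only the lines whose
# badCrcs counter is greater than zero.
# ASSUMPTION: the counter appears as "badCrcs", then optional non-digit
# separators, then an integer (for example "badCrcs 17" or "badCrcs=17").
nonzero_bad_crcs() {
    awk 'match($0, /badCrcs[^0-9]*[0-9]+/) {
        s = substr($0, RSTART, RLENGTH)   # e.g. "badCrcs 17"
        sub(/^badCrcs[^0-9]*/, "", s)     # keep only the digits
        if (s + 0 > 0) print              # print the whole original line
    }'
}

# Usage (SWITCH is a placeholder for your switch IP address):
#   lynx -dump "$SWITCH/all" | nonzero_bad_crcs
#   lynx -dump "$SWITCH/cgi/web.cgi?all" | nonzero_bad_crcs
```

An empty result means that no line matched with a non-zero count. It does not by itself confirm that a monitoring line card is present.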
• Are there non-zero values of switch traps related to overheating, etc.? For the
Myrinet-2000 M3-E* switches, refer to "What is the meaning of each of the trap
counts reported by the Myrinet-2000 M3-E* switch?"
(http://www.myri.com/cgi-bin/fom?file=206); for the Myrinet-2000
M3-CLOS-ENCL/M3-SPINE-ENCL switches, refer to the Switch Tutorial
(http://www.myri.com/scs/14U_switches/).
• If you installed FMS, does the output of fm_status list all nodes, and does it say
that the network is fully configured?
• If you are unable to install FMS, does the output of mx_info or gm_board_info
list all nodes in the routing table, and say that the Myrinet network is fully
configured? If one of the nodes is missing from the routing/mapping table, refer
to the diagnostic procedures in "How can I tell if the MX Mapper has correctly
detected all of the hosts in my Myrinet network?"
(http://www.myri.com/cgi-bin/fom?file=427), or "How can I tell if the GM-2 Mapper
has correctly detected all of the hosts in my Myrinet network?"
(http://www.myri.com/cgi-bin/fom?file=273), or "How can I tell if the GM-1 Mapper
has correctly detected all of the hosts in my Myrinet network?"
(http://www.myri.com/cgi-bin/fom?file=127).
• If you are using a Myrinet-2000 M3-CLOS-ENCL or M3-SPINE-ENCL switch, and a
particular switch port is unable to communicate, is the switch port reported as
out-of-sync (http://www.myri.com/scs/14U_switches/index-overview-web.html#sync)?
Refer to "One of the connected switch ports is not illuminated in green."
(http://www.myri.com/scs/14U_switches/#tft-green) for full details.
• Do all nodes report similar performance for mx_dmabench or gm_debug -L?
Refer to the subsection entitled “3. Run mx_dmabench or gm_debug to test the
PCI bandwidth” (page 31) in Section VIII Testing/Validation for a discussion of
diagnostic procedures to isolate the cause of an inconsistency.
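When there are many nodes, comparing the results by eye is error-prone. The following is a minimal sketch of one way to flag outliers. It assumes that you have already collected one bandwidth figure per node into lines of the form "<hostname> <value>"; that input format, the function name flag_slow_nodes, and the default 90% threshold are all illustrative assumptions and are not produced or defined by mx_dmabench or gm_debug.

```shell
#!/bin/sh
# flag_slow_nodes: hypothetical helper, not part of MX or GM.
# Reads "<hostname> <value>" lines on stdin (ASSUMED format, collected by
# you from the per-node test output) and prints every node whose value is
# below a given percentage of the best value seen.
# $1 = threshold in percent (default 90, an arbitrary illustrative choice).
flag_slow_nodes() {
    awk -v pct="${1:-90}" '
        NF >= 2 {
            name[NR] = $1
            val[NR] = $2 + 0
            if (val[NR] > max) max = val[NR]   # track the best node
        }
        END {
            for (i = 1; i <= NR; i++)
                if ((i in name) && val[i] * 100 < max * pct)
                    printf "%s %s\n", name[i], val[i]
        }'
}

# Usage (results.txt is a placeholder file you create yourself):
#   flag_slow_nodes 90 < results.txt
```

Any node printed by this filter is a candidate for the diagnostic procedures referenced above. The threshold only narrows the list and is not a pass/fail criterion.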
© 2007 Myricom, Inc. DRAFT