Troubleshooting guide
Table of Contents
I. Introduction
II. What Hardware Is Required?
III. Hardware Installation
IV. What Software Do I Need To Install?
V. MX-2G Software Installation
    1. Configuring and compiling MX-2G
    2. Installing the MX-2G mcp and driver
    3. Enabling IP over Myrinet (Ethernet emulation) (OPTIONAL)
VI. GM-2 Software Installation
    1. Configuring and compiling GM-2
    2. Installing the GM-2 driver
    3. Enabling IP over Myrinet (Ethernet emulation) (OPTIONAL)
VII. GM-1 Software Installation
    1. Configuring and compiling GM-1
    2. Installing the GM-1 driver
    3. Run the GM-1 mapper
    4. Enabling IP over Myrinet (Ethernet emulation) (OPTIONAL)
VIII. Testing/Validation
    2. Run fm_switch to ensure that the FMS database includes all switches
    3. Run fm_db2wirelist to look for any missing hosts
    4. Check the LEDs on each switch port and NIC port
    5. Test performance between each host and NIC
    7. Run mpi_stress or gm_stress to stress all of the connections in the Myrinet fabric
    8. Run fm_show_alerts for diagnostic information on any damaged/failing hardware component
Appendix A: Determining if a Problem is Hardware or Software Related
Appendix B: Isolating the Cause of a Hardware Problem
    B.1. How do I determine if a cable has failed?
    B.2. How do I determine if a port on a switch line card has failed?
    B.3. How do I determine if a Myrinet NIC has failed?
Appendix C: Troubleshooting Performance
© 2007 Myricom, Inc. DRAFT