Table Of Contents
- 1 Introduction
- 2 Feature Overview
- 3 Step-by-Step Cluster Setup and MPI Usage Checklists
- 4 InfiniPath Cluster Setup and Administration
- Introduction
- Installed Layout
- Memory Footprint
- BIOS Settings
- InfiniPath and OpenFabrics Driver Overview
- OpenFabrics Drivers and Services Configuration and Startup
- Other Configuration: Changing the MTU Size
- Managing the InfiniPath Driver
- More Information on Configuring and Loading Drivers
- Performance Settings and Management Tips
- Host Environment Setup for MPI
- Checking Cluster and Software Status
- 5 Using QLogic MPI
- Introduction
- Getting Started with MPI
- QLogic MPI Details
- Use Wrapper Scripts for Compiling and Linking
- Configuring MPI Programs for QLogic MPI
- To Use Another Compiler
- Process Allocation
- mpihosts File Details
- Using mpirun
- Console I/O in MPI Programs
- Environment for Node Programs
- Environment Variables
- Running Multiple Versions of InfiniPath or MPI
- Job Blocking in Case of Temporary InfiniBand Link Failures
- Performance Tuning
- MPD
- QLogic MPI and Hybrid MPI/OpenMP Applications
- Debugging MPI Programs
- QLogic MPI Limitations
- 6 Using Other MPIs
- A mpirun Options Summary
- B Benchmark Programs
- C Integration with a Batch Queuing System
- D Troubleshooting
- Using LEDs to Check the State of the Adapter
- BIOS Settings
- Kernel and Initialization Issues
- OpenFabrics and InfiniPath Issues
- Stop OpenSM Before Stopping/Restarting InfiniPath
- Manual Shutdown or Restart May Hang if NFS in Use
- Load and Configure IPoIB Before Loading SDP
- Set $IBPATH for OpenFabrics Scripts
- ifconfig Does Not Display Hardware Address Properly on RHEL4
- SDP Module Not Loading
- ibsrpdm Command Hangs when Two Host Channel Adapters are Installed but Only Unit 1 is Connected to the Switch
- Outdated ipath_ether Configuration Setup Generates Error
- System Administration Troubleshooting
- Performance Issues
- QLogic MPI Troubleshooting
- Mixed Releases of MPI RPMs
- Missing mpirun Executable
- Resolving Hostname with Multi-Homed Head Node
- Cross-Compilation Issues
- Compiler/Linker Mismatch
- Compiler Cannot Find Include, Module, or Library Files
- Problem with Shell Special Characters and Wrapper Scripts
- Run Time Errors with Different MPI Implementations
- Process Limitation with ssh
- Number of Processes Exceeds ulimit for Number of Open Files
- Using MPI.mod Files
- Extending MPI Modules
- Lock Enough Memory on Nodes When Using a Batch Queuing System
- Error Creating Shared Memory Object
- gdb Gets SIG32 Signal Under mpirun -debug with the PSM Receive Progress Thread Enabled
- General Error Messages
- Error Messages Generated by mpirun
- MPI Stats
- E Write Combining
- F Useful Programs and Files
- G Recommended Reading
- Glossary
- Index

D–Troubleshooting
Performance Issues
D-10 IB6054601-00 H
The exact symptoms can vary with BIOS, amount of memory, etc. When the driver
starts, you may see these errors:
ib_ipath 0000:04:01.0: infinipath0: Performance problem: bandwidth to PIO buffers is only 273 MiB/sec
infinipath: mtrr_add(feb00000,0x100000,WC,0) failed (-22)
infinipath: probe of 0000:04:01.0 failed with error -22
If you do not see any of these messages on your console, but suspect this
problem, check the /var/log/messages file. Some systems suppress driver
load messages but still output them to the log file.
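If the messages have scrolled off the console, or were never shown there, a search of the log file can find them. The following is a minimal sketch, assuming the log is at /var/log/messages as described above and that the messages contain the same strings as the examples shown. The function name find_ipath_errors is hypothetical and is not part of the InfiniPath software.

```shell
# Hypothetical helper (not part of the InfiniPath software).
# Prints log lines that resemble the example driver errors shown above.
# Usage: find_ipath_errors [logfile]    (default: /var/log/messages)
find_ipath_errors() {
    logfile="${1:-/var/log/messages}"
    # Match only lines that mention infinipath and one of the three
    # error patterns from the examples.
    grep -E 'infinipath.*(Performance problem|mtrr_add\(.*failed|probe of .* failed)' "$logfile"
}
```

For example, find_ipath_errors with no argument searches /var/log/messages. On many systems that file is readable only by root. grep exits with a nonzero status when nothing matches, so the function can also be used in a shell conditional.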
To check the bandwidth, type:
$ ipath_pkt_test -B
When configured correctly, the QLE7140 and QLE7240 report 1150–1500 MBps,
and the QLE7280 reports 1950–3000 MBps. The QHT7040/7140 adapters normally
report 2300–2650 MBps.
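The comparison against these ranges can be scripted. The following is a minimal sketch, not a QLogic tool: the function name check_pio_bandwidth is hypothetical, the measured value is passed in by hand as a whole number of MBps (the output format of ipath_pkt_test is not assumed here), and "QHT7040/7140" is assumed to mean the QHT7040 and QHT7140 models.

```shell
# Hypothetical helper: compares a measured bandwidth (whole MBps) with the
# range this section gives for each adapter model.
# Usage: check_pio_bandwidth MODEL MBPS
# Returns 0 if in range, 1 if out of range, 2 if the model is not listed.
check_pio_bandwidth() {
    model="$1"
    mbps="$2"
    case "$model" in
        QLE7140|QLE7240) low=1150; high=1500 ;;
        QLE7280)         low=1950; high=3000 ;;
        # Assumption: "QHT7040/7140" covers both QHT7040 and QHT7140.
        QHT7040|QHT7140) low=2300; high=2650 ;;
        *) echo "unknown adapter model: $model" >&2; return 2 ;;
    esac
    if [ "$mbps" -ge "$low" ] && [ "$mbps" -le "$high" ]; then
        echo "$model: $mbps MBps is within the expected range ($low-$high MBps)"
    else
        echo "$model: $mbps MBps is outside the expected range ($low-$high MBps)"
        return 1
    fi
}
```

For example, check_pio_bandwidth QLE7140 273 reports a value outside the expected range and returns 1. A value far below the range, like the 273 MiB/sec in the driver message above, suggests the problem described in this section.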
You can also use ipath_checkout to check for MTRR problems (see
“ipath_checkout” on page F-7).
The dmesg program (“dmesg” on page F-3) can also be used for diagnostics.
Details on both the PAT and MTRR mechanisms, and how the options should be
set, can be found in “Write Combining” on page E-1.
Large Message Receive Side Bandwidth Varies with Socket Affinity on Opteron Systems
On Opteron systems, when using the QLE7240 or QLE7280 in DDR mode, there
is a receive-side bandwidth bottleneck for CPUs that are not adjacent to the PCI
Express root complex. As a result, large-message receive bandwidth varies with
the socket on which the receiving process runs. The bottleneck is most obvious
when using SendDMA with large messages on the sockets farthest from the root
complex. The best case for SendDMA is when both sender and receiver are on the
closest sockets. Overall performance for PIO (and smaller messages) is better
than with SendDMA.
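One way to observe this effect is to run the same local test pinned to different CPUs and compare the results. The following is a minimal sketch, assuming the taskset utility from util-linux is installed. The function name run_pinned is hypothetical, and which CPU numbers belong to the socket closest to the PCI Express root complex depends on the system and is not specified here. The sketch pins only a locally started command; how affinity is applied to node programs started by mpirun is not covered by it.

```shell
# Hypothetical helper: run a command pinned to a list of CPUs with taskset.
# Usage: run_pinned CPU_LIST COMMAND [ARGS...]
run_pinned() {
    cpus="$1"
    shift
    if ! command -v taskset >/dev/null 2>&1; then
        # Without taskset the command still runs, but unpinned.
        echo "taskset not found; running unpinned" >&2
        "$@"
        return
    fi
    # -c takes a CPU list such as "0" or "0-3".
    taskset -c "$cpus" "$@"
}
```

For example, run_pinned 0 followed by a test command runs that command on CPU 0. Repeating the run with a CPU on a different socket shows how much bandwidth depends on placement.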
MVAPICH Performance Issues
At the time of publication, MVAPICH running over OpenFabrics on InfiniPath has
not been tuned for performance. However, if MVAPICH on InfiniPath is configured
to use PSM, it can achieve performance comparable to QLogic MPI.