Product specifications
Table of Contents
- Table of Contents
- 1 Introduction
- 2 Feature Overview
- 3 Step-by-Step Cluster Setup and MPI Usage Checklists
- 4 InfiniPath Cluster Setup and Administration
- Introduction
- Installed Layout
- Memory Footprint
- BIOS Settings
- InfiniPath and OpenFabrics Driver Overview
- OpenFabrics Drivers and Services Configuration and Startup
- Other Configuration: Changing the MTU Size
- Managing the InfiniPath Driver
- More Information on Configuring and Loading Drivers
- Performance Settings and Management Tips
- Host Environment Setup for MPI
- Checking Cluster and Software Status
- 5 Using QLogic MPI
- Introduction
- Getting Started with MPI
- QLogic MPI Details
- Use Wrapper Scripts for Compiling and Linking
- Configuring MPI Programs for QLogic MPI
- To Use Another Compiler
- Process Allocation
- mpihosts File Details
- Using mpirun
- Console I/O in MPI Programs
- Environment for Node Programs
- Environment Variables
- Running Multiple Versions of InfiniPath or MPI
- Job Blocking in Case of Temporary InfiniBand Link Failures
- Performance Tuning
- MPD
- QLogic MPI and Hybrid MPI/OpenMP Applications
- Debugging MPI Programs
- QLogic MPI Limitations
- 6 Using Other MPIs
- A mpirun Options Summary
- B Benchmark Programs
- C Integration with a Batch Queuing System
- D Troubleshooting
- Using LEDs to Check the State of the Adapter
- BIOS Settings
- Kernel and Initialization Issues
- OpenFabrics and InfiniPath Issues
- Stop OpenSM Before Stopping/Restarting InfiniPath
- Manual Shutdown or Restart May Hang if NFS in Use
- Load and Configure IPoIB Before Loading SDP
- Set $IBPATH for OpenFabrics Scripts
- ifconfig Does Not Display Hardware Address Properly on RHEL4
- SDP Module Not Loading
- ibsrpdm Command Hangs when Two Host Channel Adapters are Installed but Only Unit 1 is Connected to the Switch
- Outdated ipath_ether Configuration Setup Generates Error
- System Administration Troubleshooting
- Performance Issues
- QLogic MPI Troubleshooting
- Mixed Releases of MPI RPMs
- Missing mpirun Executable
- Resolving Hostname with Multi-Homed Head Node
- Cross-Compilation Issues
- Compiler/Linker Mismatch
- Compiler Cannot Find Include, Module, or Library Files
- Problem with Shell Special Characters and Wrapper Scripts
- Run Time Errors with Different MPI Implementations
- Process Limitation with ssh
- Number of Processes Exceeds ulimit for Number of Open Files
- Using MPI.mod Files
- Extending MPI Modules
- Lock Enough Memory on Nodes When Using a Batch Queuing System
- Error Creating Shared Memory Object
- gdb Gets SIG32 Signal Under mpirun -debug with the PSM Receive Progress Thread Enabled
- General Error Messages
- Error Messages Generated by mpirun
- MPI Stats
- E Write Combining
- F Useful Programs and Files
- G Recommended Reading
- Glossary
- Index

D–Troubleshooting
Performance Issues
IB6054601-00 H D-11
Erratic Performance
Applications that use interrupts sometimes show erratic performance; for
example, SDP latency may be inconsistent when running a program such as
netperf. This has been observed on AMD-based systems using the QLE7240 or
QLE7280 adapters. If this happens, check whether the program irqbalance is
running. irqbalance is a Linux daemon that distributes interrupts across
processors; however, it can override prior interrupt request (IRQ) affinity
settings and introduce timing anomalies. For more consistent performance,
stop this daemon (as the root user) and bind the IRQ to a single CPU.
First, stop irqbalance:
# /sbin/chkconfig irqbalance off
# /etc/init.d/irqbalance stop
Next, find the IRQ number and bind it to a CPU. The IRQ number can be found in
one of two ways, depending on the system used. Both methods are described in
the following paragraphs.
Method 1
As the root user, check whether the IRQ number reported in
/sys/class/infiniband/ipath*/device/irq appears as a directory under
/proc/irq/. For example:
# my_irq=`cat /sys/class/infiniband/ipath*/device/irq`
# ls /proc/irq
If $my_irq can be found under /proc/irq/, then type:
# echo 01 > /proc/irq/$my_irq/smp_affinity
Method 2
If the command from Method 1 (ls /proc/irq) does not show $my_irq, use the
following commands instead:
# my_irq=`cat /proc/interrupts|grep ib_ipath|awk \
'{print $1}'|sed -e 's/://'`
# echo 01 > /proc/irq/$my_irq/smp_affinity
This method is not the first choice because, on some systems, there may be two
rows of ib_ipath output, and you will not know which of the two numbers to
choose. However, if you cannot find $my_irq listed under /proc/irq
(Method 1), this type of system most likely has only one line for ib_ipath listed
in /proc/interrupts, so you can use Method 2.
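The two methods can be combined into a single script that tries Method 1 first and falls back to Method 2 when the sysfs entry is absent. This is a minimal sketch, not part of the InfiniPath distribution: the function names are illustrative, it must run as root, and it assumes a single ib_ipath line in /proc/interrupts (per the caveat above).

```shell
#!/bin/sh
# Hedged sketch: locate the ipath IRQ and bind it to CPU 0.
# Assumes a single InfiniPath adapter; run as root.

# Parsing step from Method 2: take an /proc/interrupts line such as
# " 98:  123456  IO-APIC-level  ib_ipath" and print the IRQ number.
extract_irq() {
    awk '{print $1}' | sed -e 's/://'
}

# Try Method 1 (sysfs); fall back to Method 2 (/proc/interrupts).
# If two ib_ipath lines exist, this prints two numbers -- see the
# caveat in the text about choosing between them.
find_ipath_irq() {
    irq=$(cat /sys/class/infiniband/ipath*/device/irq 2>/dev/null)
    if [ -z "$irq" ] || [ ! -d "/proc/irq/$irq" ]; then
        irq=$(grep ib_ipath /proc/interrupts 2>/dev/null | extract_irq)
    fi
    printf '%s\n' "$irq"
}

# Bind the IRQ to CPU 0. The value 01 is a CPU bitmask: bit 0 set
# means interrupts are delivered to CPU 0 only.
bind_ipath_irq() {
    irq=$(find_ipath_irq)
    [ -n "$irq" ] && echo 01 > "/proc/irq/$irq/smp_affinity"
}
```

To favor a different CPU, change the bitmask; for example, echo 02 selects CPU 1 only.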
NOTE:
Take care when cutting and pasting commands from PDF documents, as
quotes are special characters and may not be translated correctly.