D–Troubleshooting
OpenFabrics and InfiniPath Issues
IB6054601-00 H D-7
Manual Shutdown or Restart May Hang if NFS in Use
If you are using NFS over IPoIB and use the manual /etc/init.d/openibd
stop (or restart) command, the shutdown process may silently hang on the
fuser command contained within the script. This is because fuser cannot
traverse down the tree from the mount point once the mount point has
disappeared. To remedy this problem, the fuser process itself needs to be killed.
Run the following command either as a root user or as the user who is running the
fuser process:
# killall -9 fuser
The shutdown will continue.
This problem is not seen if the system is rebooted or if the filesystem has already
been unmounted before stopping infinipath.
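Because the hang does not occur once the filesystem is unmounted, it can help to check for NFS mounts before stopping the drivers. The following is a minimal sketch, not part of the InfiniPath release: the helper name list_nfs_mounts is hypothetical, and it lists every NFS mount in /proc/mounts without checking whether the mount actually runs over IPoIB.

```shell
#!/bin/sh
# list_nfs_mounts [FILE]
# Hypothetical helper: print the mount point of every NFS filesystem
# listed in FILE (default: /proc/mounts). Each line of FILE has the
# form: device mountpoint fstype options ...
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Possible use as root, before a manual stop (left commented out here):
#   for m in $(list_nfs_mounts); do umount "$m"; done
#   /etc/init.d/openibd stop
```

If the helper prints nothing, no NFS filesystems are mounted and the manual stop should not block on fuser for this reason.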
Load and Configure IPoIB Before Loading SDP
SDP generates Connection Refused errors if it is loaded before IPoIB has been
loaded and configured. To solve the problem, load and configure IPoIB first.
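The load order can be checked before SDP is loaded. The sketch below assumes the IPoIB and SDP kernel modules are named ib_ipoib and ib_sdp; these names, and the helper module_loaded, are assumptions and not taken from this guide. It only checks that the IPoIB module is loaded, not that an interface such as ib0 has been configured.

```shell
#!/bin/sh
# module_loaded NAME [FILE]
# Hypothetical helper: succeed if kernel module NAME appears in FILE
# (default: /proc/modules), where the first field of each line is the
# module name.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Possible use as root (module names are assumptions; left commented out):
#   if module_loaded ib_ipoib; then
#       modprobe ib_sdp
#   else
#       echo "Load and configure IPoIB before loading SDP" >&2
#   fi
```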
Set $IBPATH for OpenFabrics Scripts
The environment variable $IBPATH must be set to /usr/bin. If this has not been
set, or if you have it set to a location other than the installed location, you may see
error messages similar to the following when running some OpenFabrics scripts:
/usr/bin/ibhosts: line 30: /usr/local/bin/ibnetdiscover: No such file or directory
For the OpenFabrics commands supplied with this InfiniPath release, set the
variable (if it has not been set already) to /usr/bin, as follows:
$ export IBPATH=/usr/bin
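To confirm that the variable points at the installed commands, a quick check such as the following may be useful. This is a sketch only: the helper name check_ibpath is hypothetical, and it tests for ibnetdiscover simply because that is the command named in the error message above.

```shell
#!/bin/sh
# check_ibpath [DIR]
# Hypothetical helper: succeed if DIR (default: the value of $IBPATH)
# is non-empty and contains an executable ibnetdiscover.
check_ibpath() {
    dir=${1:-${IBPATH:-}}
    [ -n "$dir" ] && [ -x "$dir/ibnetdiscover" ]
}

# Possible use (left commented out):
#   check_ibpath || echo "IBPATH is unset or does not contain ibnetdiscover" >&2
```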
ifconfig Does Not Display Hardware Address Properly on RHEL4
The ifconfig command can verify IPoIB network interface configuration.
However, ifconfig does not report the hardware address (HWaddr) properly on
RHEL4 U4 machines. In the following example, all zeroes are returned:
# ifconfig ib0
ib0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
.
.
.
As a workaround, use this command to display the hardware address:
# ip addr
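If only the hardware address is needed, the output of the workaround can be filtered. The sketch below assumes that ip prints the address of an IPoIB interface on a line beginning with link/infiniband; the helper name ib_hwaddr is hypothetical.

```shell
#!/bin/sh
# ib_hwaddr
# Hypothetical helper: read "ip addr" output on standard input and print
# the address that follows "link/infiniband" (assumed output format).
ib_hwaddr() {
    awk '$1 == "link/infiniband" { print $2 }'
}

# Possible use (left commented out):
#   ip addr show ib0 | ib_hwaddr
```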