5–Using QLogic MPI
Performance Tuning
5-22 IB6054601-00 H
Use the taskset utility with mpirun to specify the mapping of MPI processes to
logical processors. Pinning processes this way can make better use of available
memory bandwidth or cache locality when running on dual-core Symmetric
MultiProcessing (SMP) cluster nodes.
The following example uses the NASA Advanced Supercomputing (NAS) Parallel
Benchmark’s Multi-Grid (MG) benchmark and the -c option to taskset.
$ mpirun -np 4 -ppn 2 -m $hosts taskset -c 0,2 bin/mg.B.4
$ mpirun -np 4 -ppn 2 -m $hosts taskset -c 1,3 bin/mg.B.4
The first command forces the programs to run on CPUs (or cores) 0 and 2. The
second command forces the programs to run on CPUs 1 and 3. See the taskset
man page for more information on usage.
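One way to check that a pinning like the one above took effect is to have each launched process report its own allowed-CPU list. The sketch below is not taken from this guide. It assumes Linux nodes where /proc/self/status contains a Cpus_allowed_list line, and it substitutes grep for the benchmark binary. The hosts file name mpihosts and the DRY_RUN switch are hypothetical. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch: ask each pinned process which CPUs the kernel allows it to use.
# Assumptions (not from the guide): Linux /proc layout, a hosts file
# named "mpihosts", and a DRY_RUN switch that defaults to printing only.
hosts=${hosts:-mpihosts}

# Same launch line as the example above, with grep in place of bin/mg.B.4.
cmd="mpirun -np 4 -ppn 2 -m $hosts taskset -c 0,2 grep Cpus_allowed_list /proc/self/status"

if [ "${DRY_RUN:-1}" = 0 ]; then
    $cmd                      # unquoted on purpose: split into words and run
else
    echo "would run: $cmd"
fi
```

Run it with DRY_RUN=0 on a cluster where mpirun and taskset are installed. If the pinning worked, each process should report only the CPUs passed to taskset -c.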
To turn off CPU affinity, set the environment variable IPATH_NO_CPUAFFINITY.
This environment variable is propagated to node programs by mpirun.
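As a minimal sketch, the variable can be exported in the shell that launches mpirun. The guide says only that the variable must be set, so the value 1 used here is an assumption. The hosts file name mpihosts and the DRY_RUN switch are also hypothetical.

```shell
#!/bin/sh
# Sketch: launch a job with CPU affinity turned off.
# Assumption: the value 1 is illustrative; the guide only says "set".
export IPATH_NO_CPUAFFINITY=1

hosts=${hosts:-mpihosts}      # hypothetical mpihosts file name
cmd="mpirun -np 4 -ppn 2 -m $hosts bin/mg.B.4"

if [ "${DRY_RUN:-1}" = 0 ]; then
    $cmd                      # mpirun propagates the variable to node programs
else
    echo "would run: $cmd (IPATH_NO_CPUAFFINITY=$IPATH_NO_CPUAFFINITY)"
fi
```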
mpirun Tunable Options
Several mpirun options can be adjusted to optimize communication.
The most important is:
-long-len, -L [default: 64000]
This option sets the message length threshold above which the rendezvous protocol
is used instead of the eager protocol. The default value for -L was chosen for
optimal unidirectional communication, and applications with that kind of traffic
pattern benefit from this higher default. Other values for -L may suit different
communication patterns and data sizes. For example, applications that
have bidirectional traffic patterns may benefit from using a lower value.
Experimentation is recommended.
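The experimentation suggested above can be organized as a simple sweep over candidate -L values. The following is a sketch under stated assumptions: the candidate values other than the 64000 default are arbitrary, and the hosts file mpihosts, the application ./my_mpi_app, the helper name sweep_long_len, and the DRY_RUN switch are all hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: run the same job once per candidate -L value.
# 64000 is the documented default; the other values are arbitrary picks.
long_lens="16000 32000 64000 128000"
hosts=${hosts:-mpihosts}      # hypothetical mpihosts file name
app=${app:-./my_mpi_app}      # hypothetical application binary

sweep_long_len() {
    for len in $long_lens; do
        cmd="mpirun -np 2 -m $hosts -L $len $app"
        if [ "${DRY_RUN:-1}" = 0 ]; then
            echo "== -L $len =="
            $cmd              # unquoted on purpose: split into words and run
        else
            echo "would run: $cmd"
        fi
    done
}

sweep_long_len
```

Set DRY_RUN=0 to run the jobs, then compare the application's own timing or bandwidth figures across the runs.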
Two other useful options are:
-long-len-shmem, -s [default: 16000]
This option sets the message length threshold above which the rendezvous protocol
is used instead of the eager protocol for intra-node communications, that is,
for messages going through shared memory. The InfiniPath rendezvous
messaging protocol uses a two-way handshake (with MPI synchronous send
semantics) and receive-side DMA.
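To illustrate how the two thresholds interact, the sketch below picks a protocol for a given message length using the documented defaults for -L and -s. It makes several assumptions that the guide does not confirm: that lengths are in bytes, that -L governs inter-node traffic, and that the comparison is strict (a message exactly at the threshold stays on the eager protocol). The function name pick_protocol is hypothetical.

```shell
#!/bin/sh
# pick_protocol LENGTH THRESHOLD
# Prints "rendezvous" when LENGTH exceeds THRESHOLD, otherwise "eager".
# Assumption: strict comparison; the boundary case is not documented here.
pick_protocol() {
    if [ "$1" -gt "$2" ]; then
        echo rendezvous
    else
        echo eager
    fi
}

LONG_LEN=64000         # documented default for -L
LONG_LEN_SHMEM=16000   # documented default for -s (shared memory)

msg=32000              # example message length
echo "inter-node: $(pick_protocol "$msg" "$LONG_LEN")"
echo "intra-node: $(pick_protocol "$msg" "$LONG_LEN_SHMEM")"
```

Under these assumptions, a 32000-byte message falls below the -L default but above the -s default, so the same message would be sent eagerly between nodes and by rendezvous within a node.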
-rndv-window-size, -W [default: 262144]
When sending a large message using the rendezvous protocol, QLogic MPI splits
it into a number of fragments at the source and recombines them at the
destination. Each fragment is sent as a single rendezvous stage. This option
specifies the maximum length of each fragment. The default is 262144 bytes.
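Because -W caps the length of each fragment, the smallest number of fragments for a message is its length divided by the window size, rounded up. The sketch below computes that lower bound. The function name fragments is hypothetical, and the calculation assumes every fragment except the last is filled to the maximum, which the guide does not state.

```shell
#!/bin/sh
# fragments LENGTH [WINDOW]
# Prints the smallest number of fragments needed to carry LENGTH bytes
# when no fragment may exceed WINDOW bytes (ceiling division).
fragments() {
    len=$1
    win=${2:-262144}   # documented default for -W
    echo $(( (len + win - 1) / win ))
}

fragments 1000000          # 1000000-byte message, default window
fragments 1000000 131072   # same message, hypothetical smaller window
```

With the default window, a 1000000-byte message needs at least 4 fragments. Halving the window to 131072 bytes raises that to 8.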
For more information on tunable options, type:
$ mpirun -h