Table Of Contents
- 1 Introduction
- 2 Feature Overview
- 3 Step-by-Step Cluster Setup and MPI Usage Checklists
- 4 InfiniPath Cluster Setup and Administration
- Introduction
- Installed Layout
- Memory Footprint
- BIOS Settings
- InfiniPath and OpenFabrics Driver Overview
- OpenFabrics Drivers and Services Configuration and Startup
- Other Configuration: Changing the MTU Size
- Managing the InfiniPath Driver
- More Information on Configuring and Loading Drivers
- Performance Settings and Management Tips
- Host Environment Setup for MPI
- Checking Cluster and Software Status
- 5 Using QLogic MPI
- Introduction
- Getting Started with MPI
- QLogic MPI Details
- Use Wrapper Scripts for Compiling and Linking
- Configuring MPI Programs for QLogic MPI
- To Use Another Compiler
- Process Allocation
- mpihosts File Details
- Using mpirun
- Console I/O in MPI Programs
- Environment for Node Programs
- Environment Variables
- Running Multiple Versions of InfiniPath or MPI
- Job Blocking in Case of Temporary InfiniBand Link Failures
- Performance Tuning
- MPD
- QLogic MPI and Hybrid MPI/OpenMP Applications
- Debugging MPI Programs
- QLogic MPI Limitations
- 6 Using Other MPIs
- A mpirun Options Summary
- B Benchmark Programs
- C Integration with a Batch Queuing System
- D Troubleshooting
- Using LEDs to Check the State of the Adapter
- BIOS Settings
- Kernel and Initialization Issues
- OpenFabrics and InfiniPath Issues
- Stop OpenSM Before Stopping/Restarting InfiniPath
- Manual Shutdown or Restart May Hang if NFS in Use
- Load and Configure IPoIB Before Loading SDP
- Set $IBPATH for OpenFabrics Scripts
- ifconfig Does Not Display Hardware Address Properly on RHEL4
- SDP Module Not Loading
- ibsrpdm Command Hangs when Two Host Channel Adapters are Installed but Only Unit 1 is Connected to the Switch
- Outdated ipath_ether Configuration Setup Generates Error
- System Administration Troubleshooting
- Performance Issues
- QLogic MPI Troubleshooting
- Mixed Releases of MPI RPMs
- Missing mpirun Executable
- Resolving Hostname with Multi-Homed Head Node
- Cross-Compilation Issues
- Compiler/Linker Mismatch
- Compiler Cannot Find Include, Module, or Library Files
- Problem with Shell Special Characters and Wrapper Scripts
- Run Time Errors with Different MPI Implementations
- Process Limitation with ssh
- Number of Processes Exceeds ulimit for Number of Open Files
- Using MPI.mod Files
- Extending MPI Modules
- Lock Enough Memory on Nodes When Using a Batch Queuing System
- Error Creating Shared Memory Object
- gdb Gets SIG32 Signal Under mpirun -debug with the PSM Receive Progress Thread Enabled
- General Error Messages
- Error Messages Generated by mpirun
- MPI Stats
- E Write Combining
- F Useful Programs and Files
- G Recommended Reading
- Glossary
- Index

IB6054601-00 H Glossary-5
Glossary
RDMA
Stands for Remote Direct Memory Access.
A communications protocol that enables
data transmission from the memory of one
computer to the memory of another
without involving the CPU. The most
common form of RDMA is over InfiniBand.
RPM
Stands for Red Hat Package Manager. A
tool for packaging, installing, and
managing software for Linux distributions.
SDP
Stands for Sockets Direct Protocol. An
InfiniBand-specific upper layer protocol. It
defines a standard wire protocol to support
stream sockets networking over InfiniBand.
SRP
Stands for SCSI RDMA Protocol. The
implementation of this protocol is under
development for utilizing block storage
devices over an InfiniBand fabric.
SM
Stands for Subnet Manager. A subnet
contains a master subnet manager that is
responsible for network initialization
(topology discovery), configuration, and
maintenance. The SM discovers and
configures all the reachable nodes in the
InfiniBand fabric. It discovers them at
switch startup, and continues monitoring
changes in physical network connectivity
and topology. It is responsible for
assigning Local IDentifiers, called LIDs, to
the visible nodes. It also handles multicast
group setup. When the network contains
multiple managed switches, they negotiate
among themselves as to which one
controls the SM. The SM communicates with
the Subnet Management Agents (SMAs)
that exist on all nodes in a cluster.
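As a rough illustration of the discovery and LID assignment described above, the following Python sketch walks a toy topology from the node hosting the SM and hands out consecutive LIDs to every reachable node. It is a simplified model under stated assumptions, not the algorithm used by any real subnet manager: the function name assign_lids, the links dictionary, and the node names are all hypothetical.

```python
from collections import deque

def assign_lids(links, sm_node, first_lid=1):
    """Breadth-first sweep from the SM's node; returns {node: lid}.

    Toy model only: real subnet managers exchange management packets
    with the SMA on each node rather than reading a dictionary.
    """
    lids = {}
    queue = deque([sm_node])
    next_lid = first_lid
    while queue:
        node = queue.popleft()
        if node in lids:
            continue  # already discovered through another link
        lids[node] = next_lid
        next_lid += 1
        for neighbor in links.get(node, ()):
            if neighbor not in lids:
                queue.append(neighbor)
    return lids

# Hypothetical fabric: one switch, two cabled hosts, one uncabled host.
links = {
    "switch0": ["host1", "host2"],
    "host1": ["switch0"],
    "host2": ["switch0"],
    "host3": [],  # not reachable, so the sweep never assigns it a LID
}
lids = assign_lids(links, "switch0")
```

Only nodes reachable from the starting point receive a LID, which mirrors the statement that the SM configures the reachable nodes in the fabric.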
SMA
Stands for Subnet Management Agent.
SMAs exist on all nodes, and are responsible
for interacting with the subnet
manager to configure an individual node
and report node parameters and statistics.
subnet
A single InfiniBand network.
switch
Connects host channel adapters and
target channel adapters. Packets are
forwarded from one port to another within
the switch, based on the LID of the packet.
The fabric is the connected group of
switches.
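The LID-based forwarding described in this entry can be pictured as a table lookup. The Python sketch below is a minimal model, assuming a hypothetical ToySwitch class whose table maps a destination LID to an output port; real switches hold this table in hardware and have it programmed by the subnet manager.

```python
class ToySwitch:
    """Hypothetical model of a switch that forwards by destination LID."""

    def __init__(self, forwarding_table):
        # Maps destination LID -> output port number.
        self.forwarding_table = dict(forwarding_table)

    def forward(self, dest_lid):
        """Return the output port for a packet, or None if the LID is unknown."""
        return self.forwarding_table.get(dest_lid)

# Assumed example: LID 2 is reached through port 1, LID 3 through port 2.
switch = ToySwitch({2: 1, 3: 2})
port = switch.forward(3)
```

A packet addressed to a LID that is missing from the table has no output port in this model, so forward returns None.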
target channel adapter
Target channel adapters are for I/O nodes,
such as shared storage devices.
TCP
Stands for Transmission Control Protocol.
One of the core protocols of the Internet
protocol suite. TCP is a transport mechanism
that ensures that data arrives
complete and in order.
TID
Stands for Token ID. A method of identifying
a memory region. Part of the QLogic
hardware.
UD
Stands for Unreliable Datagram. A transport
protocol used by InfiniBand.
uDAPL
Stands for user Direct Access Programming
Library. uDAPL is the user space implementation
of the DAPL protocol.