
4–InfiniPath Cluster Setup and Administration
Performance Settings and Management Tips
IB6054601-00 H 4-23
- Use a PCIe Max Read Request size of at least 512 bytes with the
  QLE7240 and QLE7280. These adapters support sizes from 128 bytes to
  4096 bytes in powers of two. This value is typically set by the BIOS.
- Use a PCIe Max Payload size of 256 bytes, where available, with the
  QLE7240 and QLE7280. These adapters support 128, 256, or 512 bytes.
  This value is typically set by the BIOS as the minimum value supported
  by both the PCIe card and the PCIe root complex.
- Make sure that write combining is enabled. The x86 Page Attribute Table
  (PAT) mechanism, which allocates Write Combining (WC) mappings for the
  PIO buffers, has been added and is now the default. If PAT is unavailable
  or PAT initialization fails, the code writes a message to the log and
  falls back to the MTRR mechanism. See “Write Combining” on page E-1 for
  more information.
- Check the PCIe bus width. If a slot has a smaller electrical width than
  mechanical width, performance may be lower than expected. Use this
  command to check the PCIe bus width:
  $ ipath_control -iv
  This command also shows the link speed.
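The PCIe values discussed above can also be cross-checked from the operating system. The following is a minimal sketch, not part of the InfiniPath software: it assumes the lspci utility (from pciutils, not mentioned in this guide) is installed, and check_pcie_settings is a hypothetical helper name. It reads lspci -vv output for one device on standard input and compares the programmed Max Read Request size, Max Payload size, and negotiated link width with the recommendations above. Treating a negotiated width (LnkSta) below the adapter's capability (LnkCap) as a sign of an electrically narrower slot is also an assumption.

```shell
#!/bin/sh
# Sketch only: check_pcie_settings is a hypothetical helper, not an InfiniPath tool.
# Input (stdin): the `lspci -vv` output for a single PCIe device.
check_pcie_settings() {
    awk '
        # Extract N from a "Width xN" field; return 0 if the field is absent.
        function link_width(line) {
            if (match(line, /Width x[0-9]+/))
                return substr(line, RSTART + 7, RLENGTH - 7) + 0
            return 0
        }
        # DevCtl line, e.g. "MaxPayload 256 bytes, MaxReadReq 512 bytes".
        # The DevCap line has MaxPayload only, so it is skipped here.
        /MaxPayload/ && /MaxReadReq/ {
            for (i = 1; i < NF; i++) {
                if ($i == "MaxPayload") payload = $(i + 1) + 0
                if ($i == "MaxReadReq") readreq = $(i + 1) + 0
            }
        }
        /LnkCap:/ { cap = link_width($0) }   # width the adapter supports
        /LnkSta:/ { sta = link_width($0) }   # width actually negotiated
        END {
            printf "MaxReadReq %d bytes: %s\n", readreq, (readreq >= 512 ? "ok" : "below 512")
            printf "MaxPayload %d bytes: %s\n", payload, (payload >= 256 ? "ok" : "below 256")
            printf "Link width x%d of x%d: %s\n", sta, cap, (sta >= cap ? "ok" : "reduced")
        }
    '
}
```

To use it, pipe the lspci -vv output for the adapter's PCI address into check_pcie_settings. Full capability details usually require root privileges. The ipath_control -iv command remains the documented way to check the bus width.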
Remove Unneeded Services
The cluster administrator can enhance application performance by minimizing the
set of system services running on the compute nodes. Since these are presumed
to be specialized computing appliances, they do not need many of the service
daemons normally running on a general Linux computer.
The following groups make up a minimal necessary set of services.
All of these services are controlled by chkconfig. To see the list of services that
are enabled, use the command:
$ /sbin/chkconfig --list | grep -w on
Basic network services are:
network
ntpd
syslog
xinetd
sshd
For system housekeeping, use:
anacron
atd
crond
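
One way to find candidates for removal is to compare the enabled services against the groups listed here. The following is a minimal sketch under stated assumptions: list_extra_services and the KEEP variable are hypothetical names, and KEEP contains only the basic network and system housekeeping services named above, so it should be extended with any other services the nodes need. The function reads the output of the chkconfig command shown earlier on standard input and prints each enabled service that is not in KEEP.

```shell
#!/bin/sh
# Sketch only: KEEP and list_extra_services are hypothetical names.
# KEEP holds the basic network and system housekeeping services listed above;
# add any further services that the compute nodes require.
KEEP="network ntpd syslog xinetd sshd anacron atd crond"

# Input (stdin): output of `/sbin/chkconfig --list | grep -w on`,
# one service per line with the service name in the first field.
list_extra_services() {
    awk -v keep="$KEEP" '
        BEGIN {
            # Build a lookup table from the space-separated KEEP list.
            n = split(keep, names, " ")
            for (i = 1; i <= n; i++) wanted[names[i]] = 1
        }
        # Print the name of every enabled service that is not in the table.
        NF && !($1 in wanted) { print $1 }
    '
}
```

For example, piping the chkconfig command above into list_extra_services prints the services to review. Each one can then be disabled with /sbin/chkconfig <service> off once it is confirmed to be unnecessary.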