Table Of Contents
- 1 Introduction
- 2 Feature Overview
- 3 Step-by-Step Cluster Setup and MPI Usage Checklists
- 4 InfiniPath Cluster Setup and Administration
- Introduction
- Installed Layout
- Memory Footprint
- BIOS Settings
- InfiniPath and OpenFabrics Driver Overview
- OpenFabrics Drivers and Services Configuration and Startup
- Other Configuration: Changing the MTU Size
- Managing the InfiniPath Driver
- More Information on Configuring and Loading Drivers
- Performance Settings and Management Tips
- Host Environment Setup for MPI
- Checking Cluster and Software Status
- 5 Using QLogic MPI
- Introduction
- Getting Started with MPI
- QLogic MPI Details
- Use Wrapper Scripts for Compiling and Linking
- Configuring MPI Programs for QLogic MPI
- To Use Another Compiler
- Process Allocation
- mpihosts File Details
- Using mpirun
- Console I/O in MPI Programs
- Environment for Node Programs
- Environment Variables
- Running Multiple Versions of InfiniPath or MPI
- Job Blocking in Case of Temporary InfiniBand Link Failures
- Performance Tuning
- MPD
- QLogic MPI and Hybrid MPI/OpenMP Applications
- Debugging MPI Programs
- QLogic MPI Limitations
- 6 Using Other MPIs
- A mpirun Options Summary
- B Benchmark Programs
- C Integration with a Batch Queuing System
- D Troubleshooting
- Using LEDs to Check the State of the Adapter
- BIOS Settings
- Kernel and Initialization Issues
- OpenFabrics and InfiniPath Issues
- Stop OpenSM Before Stopping/Restarting InfiniPath
- Manual Shutdown or Restart May Hang if NFS in Use
- Load and Configure IPoIB Before Loading SDP
- Set $IBPATH for OpenFabrics Scripts
- ifconfig Does Not Display Hardware Address Properly on RHEL4
- SDP Module Not Loading
- ibsrpdm Command Hangs when Two Host Channel Adapters are Installed but Only Unit 1 is Connected to the Switch
- Outdated ipath_ether Configuration Setup Generates Error
- System Administration Troubleshooting
- Performance Issues
- QLogic MPI Troubleshooting
- Mixed Releases of MPI RPMs
- Missing mpirun Executable
- Resolving Hostname with Multi-Homed Head Node
- Cross-Compilation Issues
- Compiler/Linker Mismatch
- Compiler Cannot Find Include, Module, or Library Files
- Problem with Shell Special Characters and Wrapper Scripts
- Run Time Errors with Different MPI Implementations
- Process Limitation with ssh
- Number of Processes Exceeds ulimit for Number of Open Files
- Using MPI.mod Files
- Extending MPI Modules
- Lock Enough Memory on Nodes When Using a Batch Queuing System
- Error Creating Shared Memory Object
- gdb Gets SIG32 Signal Under mpirun -debug with the PSM Receive Progress Thread Enabled
- General Error Messages
- Error Messages Generated by mpirun
- MPI Stats
- E Write Combining
- F Useful Programs and Files
- G Recommended Reading
- Glossary
- Index

C–Integration with a Batch Queuing System
Using SLURM for Batch Queuing
C-4 IB6054601-00 H
The sort | uniq -c component determines the number of times each unique
line was printed. The awk command converts the result into the mpihosts file
format used by mpirun. Each line consists of a node name, a colon, and the
number of processes to start on that node.
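As a minimal sketch of the conversion described above, the pipeline can be tried without a cluster. The hostnames function is a hypothetical stand-in for the per-task host names reported by the batch system, and the node names are invented for illustration; only the sort, uniq -c, and awk stages come from the text.

```shell
# Hypothetical stand-in: one host name per task, as a batch system might report.
hostnames() {
    printf '%s\n' node01 node02 node01 node02 node03
}

# Count how often each host appears, then emit "host:count" lines.
# uniq -c prints "<count> <host>", so awk swaps the two fields.
to_mpihosts() {
    sort | uniq -c | awk '{printf "%s:%s\n", $2, $1}'
}

hostnames | to_mpihosts
```

With the sample input, node01 and node02 each appear twice and node03 once, so the result has three lines in the node:count form.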
Simple Process Management
At this point, the script has enough information to be able to run an MPI program.
The next step is to start the program when the batch system is ready, and notify
the batch system when the job completes. This is done in the final part of
batch_mpirun, for example:
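# Quote expansions so arguments containing spaces reach the program intact.
mpirun -np "$np" -m "$mpihosts_file" "$mpi_prog" "$@"
# Save the status of mpirun before any other command overwrites $?.
exit_code=$?
scancel "${SLURM_JOBID}"
rm -f "$mpihosts_file"
exit $exit_code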
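The following sketch illustrates why the exit status is saved immediately after mpirun returns: the cleanup commands that follow would otherwise overwrite it. The mpirun and scancel functions here are hypothetical stubs, not the real commands, so that the pattern can be tried on any machine. The status value 3, the run_job name, and ./a.out are likewise assumptions made for illustration.

```shell
# Hypothetical stubs standing in for the real commands.
mpirun()  { return 3; }   # pretend the MPI job failed with status 3
scancel() { return 0; }   # pretend the batch job was cancelled cleanly

run_job() {
    mpihosts_file=$(mktemp) || return 1
    mpirun -np 2 -m "$mpihosts_file" ./a.out
    exit_code=$?                      # capture before cleanup changes $?
    scancel "${SLURM_JOBID:-0}"
    rm -f "$mpihosts_file"
    return "$exit_code"
}

status=0
run_job || status=$?
echo "job status: $status"
```

Although scancel and rm both succeed, the script still reports the status of the stubbed mpirun, which is the value the batch system should see.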
Clean Termination of MPI Processes
The InfiniPath software normally ensures clean termination of all MPI programs
when a job ends, but in some rare circumstances an MPI process may remain
alive, and potentially interfere with future MPI jobs. To avoid this problem, run a
script before and after each batch job that kills all unwanted processes. QLogic
does not provide such a script, but it is useful to know how to find out which
processes on a node are using the QLogic interconnect. The easiest way to do
this is with the fuser command, which is normally installed in /sbin.
Run these commands as a root user to ensure that all processes are reported.
# /sbin/fuser -v /dev/ipath
/dev/ipath: 22648m 22651m
In this example, processes 22648 and 22651 are using the QLogic interconnect. It
is also possible to use this command (as a root user):
# lsof /dev/ipath
This command lists the processes using InfiniPath. To include all
processes that use any InfiniPath device (stats programs, ipath_sma, diags,
and others), run fuser with a wildcard:
# /sbin/fuser -v /dev/ipath*
lsof can also take the same form:
# lsof /dev/ipath*
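Since no cleanup script is provided, a site would need to write its own. The sketch below is only a starting point, and it reports processes rather than killing them. It assumes the compact output format shown in the fuser example above; real fuser output can differ between versions, so check it on the target system first. The function names extract_pids and report_stale are hypothetical.

```shell
# Turn fuser-style text ("/dev/ipath: 22648m 22651m") into bare PIDs.
# Tokens that are not "digits plus optional access letters" are dropped.
extract_pids() {
    tr -s ' ' '\n' | sed -n 's/^\([0-9][0-9]*\)[A-Za-z]*$/\1/p'
}

# Dry run: show what a cleanup script might signal, without signalling anything.
report_stale() {
    extract_pids | while read -r pid; do
        echo "would signal PID $pid"
    done
}

# Sample line copied from the fuser example in the text.
sample='/dev/ipath: 22648m 22651m'
printf '%s\n' "$sample" | report_stale
```

On a live node, lsof -t /dev/ipath prints only process IDs and could replace the parsing step. An actual cleanup script would also need to decide which of the reported processes are truly unwanted before sending any signal.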
NOTE:
The mpihosts file format described above (node name, colon, process count)
is one of two formats that the file can use. See “Console I/O in MPI
Programs” on page 5-17 for more information.