User guide
C–Integration with a Batch Queuing System
C-2 IB0054606-02 A
This command displays a list of processes using InfiniPath. To list all
processes, including stats programs, ipath_sma, diags, and others, run
fuser in this way:
# /sbin/fuser -v /dev/ipath*
lsof can also take the same form:
# lsof /dev/ipath*
The following command terminates all processes using the QLogic interconnect:
# /sbin/fuser -k /dev/ipath
For more information, see the man pages for fuser(1) and lsof(8).
Clean-up PSM Shared Memory Files
If a PSM job terminates abnormally, such as with a segmentation fault,
POSIX shared memory files may be left over in the /dev/shm directory.
Each file is owned by the user and has permissions -rwx------, so it can be
removed either by that user or by root.
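Before removing anything, it can help to list the candidate files. The following is a minimal sketch, not part of the original guide: the function name list_candidate_shm_files is hypothetical, and because the file naming pattern is not stated here, it matches only on owner and on the exact -rwx------ mode. Other applications may create files with the same owner and mode, so review the output before deleting.

```shell
#!/bin/sh
# Sketch only: list regular files directly under a directory that are
# owned by the given user and have exactly mode 0700 (-rwx------).
# The function name is hypothetical. Nothing is deleted here.
list_candidate_shm_files() {
    dir=$1     # directory to scan, normally /dev/shm
    owner=$2   # user name that owns the leftover files
    find "$dir" -maxdepth 1 -type f -user "$owner" -perm 0700
}
```

For example, list_candidate_shm_files /dev/shm "$USER" prints one path per line. Running /sbin/fuser -v on a listed file shows whether any process still has it open.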
PSM relies on the MPI implementation to clean up after abnormal job termination.
In cases where this does not occur, there may be leftover shared memory files. To
clean up the system, create, save, and run the following PSM SHM cleanup script
as root on each node. Either log on to the node, or run the script remotely using pdsh/ssh.
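Once the cleanup script has been saved on every node, remote invocation over ssh could look like the sketch below. This is an illustration under assumptions, not a prescribed procedure: the script path /root/psm_shm_cleanup.sh, the function name run_cleanup_on_nodes, the DRY_RUN switch, and the node names are all hypothetical, and root login over ssh is assumed to be permitted.

```shell
#!/bin/sh
# Sketch only. CLEANUP_SCRIPT is a hypothetical path; the guide does not
# name the file in which the cleanup script is saved.
CLEANUP_SCRIPT=${CLEANUP_SCRIPT:-/root/psm_shm_cleanup.sh}

# Run the cleanup script as root on each node given as an argument.
# Set DRY_RUN=1 to print the ssh commands instead of executing them.
run_cleanup_on_nodes() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh root@$node $CLEANUP_SCRIPT"
        else
            ssh "root@$node" "$CLEANUP_SCRIPT"
        fi
    done
}
```

For example, run_cleanup_on_nodes node001 node002 visits the two nodes in turn. With pdsh, the equivalent would be a single command of the form pdsh -w node[001-002] /root/psm_shm_cleanup.sh, again with hypothetical node names and path.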
NOTE
Hard and explicit program termination, such as kill -9 on the mpirun
process ID (PID), may leave Open MPI unable to guarantee that
the /dev/shm shared memory file is properly removed. As stale files
accumulate on each node, an error message can appear at startup:
node023:6.Error creating shared memory object in
shm_open(/dev/shm may have stale shm files that need
to be removed):
If this occurs, refer to Clean-up PSM Shared Memory Files for information.