HP-MPI User's Guide (11th Edition)
Debugging and troubleshooting
Debugging HP-MPI applications
NOTE When attaching to a running MPI application that was started using
an appfile, attach to the MPI daemon process so that you can debug
all of the MPI ranks in the application. The daemon is the process
at the top of the hierarchy of MPI processes, and it usually has the
lowest PID among them.
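For example, on the host where the job was launched, you might list the
MPI processes, pick out the daemon (the parent of the rank processes,
usually the one with the lowest PID), and attach TotalView to it. This
is a sketch: the process name mpid and the PID 1234 are illustrative,
and totalview filename -pid pid is the standard TotalView attach form.
   % ps -ef | grep mpi
   % totalview mpid -pid 1234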
Limitations
The following limitations apply to using TotalView with HP-MPI
applications:
1. All the executable files in your multihost MPI application must
reside on your local machine, that is, the machine on which you start
TotalView. Refer to “TotalView multihost example” on page 200 for
details about the required directory structure and file locations.
2. TotalView sometimes displays extra HP-UX threads that have no
useful debugging information. These are kernel threads that are
created to deal with page and protection faults associated with
one-copy operations that HP-MPI uses to improve performance. You
can ignore these kernel threads during your debugging session.
To improve performance, HP-MPI supports a process-to-process,
one-copy messaging approach. This means that one process can
directly copy a message into the address space of another process.
Because of this process-to-process bcopy (p2p_bcopy)
implementation, a kernel thread is created for each process that has
p2p_bcopy enabled. This thread deals with page and protection
faults associated with the one-copy operation.
TotalView multihost example
The following example demonstrates how to debug a typical HP-MPI
multihost application using TotalView, including requirements for
directory structure and file locations.
The MPI application is represented by an appfile, named my_appfile,
which contains the following two lines:
-h local_host -np 2 /path/to/program1
-h remote_host -np 2 /path/to/program2
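Because of limitation 1 above, a copy of program2 must also reside at
the same path on local_host, the machine where TotalView runs. The
following is a minimal sketch of the setup and launch, assuming mpirun
accepts -f to name the appfile and using TotalView's standard -a option
to pass the remaining arguments to mpirun; the rcp command and paths
are illustrative.
   % rcp remote_host:/path/to/program2 /path/to/program2
   % totalview mpirun -a -f my_appfile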