Technical data

Chapter 6: Compiling and Debugging Parallel Fortran
In addition to the loops, the profile shows the special routines that actually
do the multiprocessing. The mp_simple_sched routine is the synchronizer
and controller. Slave threads wait for work in the routine
mp_slave_wait_for_work. The less time they wait, the more time they work,
so the share of run time spent in this routine gives a rough estimate of
how parallel the program is.
"Parallel Programming Exercise" on page 119 contains several examples of
profiling output and how to use the information it provides.
Debugging Parallel Fortran
This section presents some standard techniques to assist in debugging a
parallel program.
General Debugging Hints
Debugging a multiprocessed program is much harder than debugging
a single-processor program. For this reason, do as much debugging as
possible on the single-processor version.
Try to isolate the problem as much as possible. Ideally, try to reduce the
problem to a single C$DOACROSS loop.
Before debugging a multiprocessed program, change the order of the
iterations on the parallel DO loop on a single-processor version. If the
loop can be multiprocessed, then the iterations can execute in any order
and produce the same answer. If the loop cannot be multiprocessed,
changing the order frequently causes the single-processor version to
fail, and standard single-process debugging techniques can be used to
find the problem.
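The iteration-order check described above can be sketched as follows. This is a minimal illustration and is not taken from the original text: the program name ORDCHK, the array names, and the two sample loops are all hypothetical. Each loop is run once in forward order and once in reversed order on a single processor, and the results are compared.

```fortran
      PROGRAM ORDCHK
C     Hypothetical example: run two loops in forward and reversed
C     iteration order on a single processor and compare results.
      INTEGER N
      PARAMETER (N = 8)
      REAL AF(N), AR(N), BF(N), BR(N), C(N)
      INTEGER I, NDIFFA, NDIFFB

C     Identical starting data for the forward and reversed runs.
      DO 10 I = 1, N
         AF(I) = 1.0
         AR(I) = 1.0
         C(I) = REAL(I)
   10 CONTINUE

C     Loop 1, forward: each iteration reads the previous element,
C     so it carries a dependence and is not safe to multiprocess.
      DO 20 I = 2, N
         AF(I) = AF(I-1) + 1.0
   20 CONTINUE

C     Loop 1, reversed iteration order.
      DO 30 I = N, 2, -1
         AR(I) = AR(I-1) + 1.0
   30 CONTINUE

C     Loop 2, forward: iterations are independent.
      DO 40 I = 1, N
         BF(I) = 2.0 * C(I)
   40 CONTINUE

C     Loop 2, reversed iteration order.
      DO 50 I = N, 1, -1
         BR(I) = 2.0 * C(I)
   50 CONTINUE

C     Count the elements that differ between the two orders.
      NDIFFA = 0
      NDIFFB = 0
      DO 60 I = 1, N
         IF (AF(I) .NE. AR(I)) NDIFFA = NDIFFA + 1
         IF (BF(I) .NE. BR(I)) NDIFFB = NDIFFB + 1
   60 CONTINUE

      PRINT *, 'Loop 1 mismatches: ', NDIFFA
      PRINT *, 'Loop 2 mismatches: ', NDIFFB
      END
```

With N = 8, the first loop reports six mismatches and the second reports none. A nonzero count indicates that the loop's result depends on iteration order, so the loop is not a candidate for C$DOACROSS as written. A zero count is encouraging but is not proof that the loop is safe, because a single reordering may fail to expose every dependence.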
Once you have narrowed the bug to a single file, use -g -mp_keep to
save debugging information and to save the file containing the
multiprocessed DO loop Fortran code that has been moved to a
subroutine. -mp_keep will store the compiler-generated subroutines in
the following file name:
$TMPDIR/P<user_subroutine_name>_<machine_name><pid>
If you do not set $TMPDIR, /tmp is used.