Technical data


Chapter 6: Compiling and Debugging Parallel Fortran

In addition to the loops, the profile shows the special routines that
actually do the multiprocessing. The mp_simple_sched routine is the
synchronizer and controller. Slave threads wait for work in the routine
mp_slave_wait_for_work. The less time they wait, the more time they work,
so the share of time spent in this routine gives a rough estimate of how
parallel the program is.

“Parallel Programming Exercise” on page 119 contains several examples of
profiling output and shows how to use the information it provides.

Debugging Parallel Fortran

This section presents some standard techniques to assist in debugging a
parallel program.

General Debugging Hints

• Debugging a multiprocessed program is much harder than debugging
a single-processor program. For this reason, do as much debugging as
possible on the single-processor version.

• Try to isolate the problem as much as possible. Ideally, try to reduce the
problem to a single C$DOACROSS loop.

• Before debugging a multiprocessed program, change the order of the
iterations of the parallel DO loop in the single-processor version (for
example, run the loop backward). If the loop can be multiprocessed, then
the iterations can execute in any order and produce the same answer. If
the loop cannot be multiprocessed, changing the order frequently causes
the single-processor version to fail, and standard single-process
debugging techniques can be used to find the problem.

• Once you have narrowed the bug to a single file, compile it with
-g -mp_keep. The -g option saves debugging information, and -mp_keep
saves the file containing the Fortran code of the multiprocessed DO loop,
which the compiler has moved into a subroutine. -mp_keep stores the
compiler-generated subroutines in a file with the following name:

$TMPDIR/P<user_subroutine_name>_<machine_name><pid>

If you do not set $TMPDIR, /tmp is used.
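
The iteration-order hint above can be tried on a small case before touching the real program. The following is a minimal sketch, not taken from this guide: the program name ORDCHK, the arrays, and the loop bodies are all hypothetical. It runs one independent loop and one loop with a dependence forward and backward on a single processor, then reports whether each produced the same answer in both orders.

```fortran
      PROGRAM ORDCHK
C     Hypothetical example: run two loops forward and backward on a
C     single processor and compare the results.
      INTEGER N
      PARAMETER (N = 8)
      INTEGER I
      REAL A(N), AREV(N), B(N), BREV(N)
      LOGICAL SAMEA, SAMEB
C
C     Independent loop: each iteration writes only its own element.
      DO 10 I = 1, N
         B(I) = 2.0 * I
   10 CONTINUE
      DO 20 I = N, 1, -1
         BREV(I) = 2.0 * I
   20 CONTINUE
C
C     Dependent loop: iteration I reads the value written by I-1.
      DO 30 I = 1, N
         A(I) = 1.0
         AREV(I) = 1.0
   30 CONTINUE
      DO 40 I = 2, N
         A(I) = A(I-1) + 1.0
   40 CONTINUE
      DO 50 I = N, 2, -1
         AREV(I) = AREV(I-1) + 1.0
   50 CONTINUE
C
C     Compare the forward and backward results element by element.
      SAMEA = .TRUE.
      SAMEB = .TRUE.
      DO 60 I = 1, N
         IF (A(I) .NE. AREV(I)) SAMEA = .FALSE.
         IF (B(I) .NE. BREV(I)) SAMEB = .FALSE.
   60 CONTINUE
      PRINT *, 'independent loop order-insensitive: ', SAMEB
      PRINT *, 'dependent loop order-insensitive:   ', SAMEA
      END
```

The first loop gives identical results in either order, so it is a candidate for C$DOACROSS. The second does not, because each iteration depends on the one before it. A loop that passes this check is not guaranteed to be safe, since reversal is only one of many possible orders, but a loop that fails it cannot be multiprocessed as written.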