Chapter 6: Compiling and Debugging Parallel Fortran
               FORCE(I,1) = FORCE(I,1) + WEIGHT(I)
               FORCE(I,2) = FORCE(I,2) + WEIGHT(I)
               FORCE(I,3) = FORCE(I,3) + WEIGHT(I)
C
C              ... AND THE FORCE OF THIS ATOM ACTING ON THE
C              NEARBY ATOM
C
               FORCE(J,1) = FORCE(J,1) + WEIGHT(J)
               FORCE(J,2) = FORCE(J,2) + WEIGHT(J)
               FORCE(J,3) = FORCE(J,3) + WEIGHT(J)
            END IF
         END DO
      END DO
      RETURN
      END
Step 3: Analyze
Where possible, parallelize the outer loop, because it encloses the most work.
To find out whether this is possible, analyze how each variable is used. The
simplest and best way is to use the Silicon Graphics POWER Fortran
Accelerator™ (PFA). If you do not have access to this tool, you must examine
each variable by hand.
A data dependence occurs when the same location is both written and read.
Therefore, any variable that is not modified inside the loop can be dismissed.
Because such variables are read only, they can be made SHARE variables and
do not prevent parallelization. In the example, NUM_ATOMS, ATOMS,
THRESHOLD_SQ, and WEIGHT are only read, so they can be declared
SHARE.
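One informal way to see the difference between read-only data and a true dependence is to run a loop in two different iteration orders and compare the results. The program below is only a sketch of that idea. Its names (DEPCHK, B, A1, A2, S1, S2) are hypothetical and do not come from the example program. An order-independent loop gives identical results either way, while a loop that reads a location written by another iteration does not.

```fortran
      PROGRAM DEPCHK
C     HYPOTHETICAL ILLUSTRATION; NONE OF THESE NAMES COME FROM
C     THE EXAMPLE PROGRAM.
      INTEGER N
      PARAMETER (N = 8)
      REAL B(N), A1(N), A2(N), S1(N), S2(N)
      INTEGER I, NDIFFA, NDIFFS
C
      DO I = 1, N
         B(I) = REAL(I)
         S1(I) = B(I)
         S2(I) = B(I)
      END DO
C
C     B IS ONLY READ AND EACH ITERATION WRITES ITS OWN ELEMENT,
C     SO THE ITERATION ORDER DOES NOT MATTER.
      DO I = 1, N
         A1(I) = 2.0 * B(I)
      END DO
      DO I = N, 1, -1
         A2(I) = 2.0 * B(I)
      END DO
C
C     ELEMENT I-1 IS WRITTEN BY ONE ITERATION AND READ BY THE
C     NEXT, SO THE RESULT DEPENDS ON THE ITERATION ORDER.
      DO I = 2, N
         S1(I) = S1(I-1) + B(I)
      END DO
      DO I = N, 2, -1
         S2(I) = S2(I-1) + B(I)
      END DO
C
C     COUNT THE ELEMENTS THAT DIFFER BETWEEN THE TWO ORDERS.
      NDIFFA = 0
      NDIFFS = 0
      DO I = 1, N
         IF (A1(I) .NE. A2(I)) NDIFFA = NDIFFA + 1
         IF (S1(I) .NE. S2(I)) NDIFFS = NDIFFS + 1
      END DO
      PRINT *, 'INDEPENDENT LOOP, ELEMENTS THAT DIFFER:', NDIFFA
      PRINT *, 'DEPENDENT LOOP, ELEMENTS THAT DIFFER:', NDIFFS
      END
```

The first count is zero and the second is not. Matching results in two orders do not prove that a loop is safe to parallelize, so this check supplements the variable-by-variable analysis and does not replace it.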
Next, I and J can be LOCAL variables. Less obviously, DIST_SQ can also be
a LOCAL variable. Even though it is an array, the values stored in it do not
carry over from one iteration to the next; it is simply a vector of
temporaries.
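As a sketch of how such a classification might be written down, the program below applies the same reasoning to a smaller, hypothetical loop. The names LOCDEM, X, Y, TMP, and SCALE are invented for illustration and are not part of the example program. It assumes the C$DOACROSS directive with its LOCAL and SHARE clauses. A compiler that does not recognize the directive treats that line as a comment and runs the loop serially.

```fortran
      PROGRAM LOCDEM
C     HYPOTHETICAL ILLUSTRATION; THESE NAMES ARE NOT FROM THE
C     EXAMPLE PROGRAM.
      INTEGER N
      PARAMETER (N = 4)
      REAL X(N,3), Y(N), TMP(3), SCALE
      INTEGER I, K
C
      SCALE = 2.0
      DO I = 1, N
         DO K = 1, 3
            X(I,K) = REAL(I + K)
         END DO
      END DO
C
C     X AND SCALE ARE ONLY READ, AND EACH ITERATION WRITES ONLY
C     ITS OWN Y(I), SO THEY ARE SHARE.  I, K, AND TMP ARE
C     REFILLED BEFORE USE IN EVERY ITERATION, SO THEY ARE LOCAL.
C$DOACROSS LOCAL(I, K, TMP), SHARE(X, Y, SCALE)
      DO I = 1, N
         DO K = 1, 3
            TMP(K) = SCALE * X(I,K)
         END DO
         Y(I) = TMP(1) + TMP(2) + TMP(3)
      END DO
C
      PRINT *, (Y(I), I = 1, N)
      END
```

Here TMP plays the role that DIST_SQ plays in the example: it is an array, but every element is assigned before it is read within the same iteration, so each process can safely work on its own copy.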
The variable FORCE is the crux of the problem. The references to FORCE(I,*)
are all right. Because each iteration of the outer loop gets a different value of
I, each iteration uses a different FORCE(I,*). If this were the only use of
FORCE, we could make FORCE a SHARE variable. However, FORCE(J,*)
prevents this. In each iteration of the inner loop, something may be added to










