Chapter 6: Compiling and Debugging Parallel Fortran

               FORCE(I,1) = FORCE(I,1) + WEIGHT(I)
               FORCE(I,2) = FORCE(I,2) + WEIGHT(I)
               FORCE(I,3) = FORCE(I,3) + WEIGHT(I)
C ... AND THE FORCE OF THIS ATOM ACTING ON THE
C NEARBY ATOM
               FORCE(J,1) = FORCE(J,1) + WEIGHT(J)
               FORCE(J,2) = FORCE(J,2) + WEIGHT(J)
               FORCE(J,3) = FORCE(J,3) + WEIGHT(J)
            END IF
         END DO
      RETURN
      END

Step 3: Analyze

Where possible, parallelize the outer loop, because it encloses the most
work. To do this, analyze how each variable is used inside the loop. The
simplest and best way is to use the Silicon Graphics POWER Fortran
Accelerator™ (PFA). If you do not have access to this tool, you must
examine each variable by hand.

Data dependence occurs when the same location is both written to and read.
Therefore, any variables not modified inside the loop can be dismissed.
Because they are read-only, they can be made SHARE variables and do not
prevent parallelization. In the example, NUM_ATOMS, ATOMS, THRESHOLD_SQ,
and WEIGHT are only read, so they can be declared SHARE.
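
The read-only rule can be illustrated with a minimal sketch. It is not part of the example program: the program name SHRDEM and the variables N, X, Y, and SCALE are hypothetical, and the sketch assumes the C$DOACROSS directive with LOCAL and SHARE clauses. A compiler that does not recognize the directive treats it as a comment and runs the loop serially.

```fortran
      PROGRAM SHRDEM
C     HYPOTHETICAL SKETCH, NOT PART OF THE EXAMPLE PROGRAM
      INTEGER N, I
      REAL X(8), Y(8), SCALE
      N = 8
      SCALE = 2.0
      DO I = 1, N
         X(I) = REAL(I)
      END DO
C     N, X, AND SCALE ARE ONLY READ IN THE LOOP, SO THEY CAN BE SHARE.
C     Y IS WRITTEN, BUT EACH ITERATION WRITES A DIFFERENT Y(I).
C$DOACROSS LOCAL(I), SHARE(N, X, Y, SCALE)
      DO I = 1, N
         Y(I) = SCALE * X(I)
      END DO
      PRINT *, Y(1), Y(N)
      END
```

No iteration writes a location that another iteration reads, so the iterations can run in any order.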

Next, I and J can be LOCAL variables. Less obviously, DIST_SQ can also be
a LOCAL variable. Even though it is an array, the values stored in it do
not carry over from one iteration to the next; it is simply a vector of
temporaries.
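
A temporary array of this kind might look like the following minimal sketch. The program name LOCDEM and the variables P, R2, and TMP are hypothetical stand-ins, not names from the example program, and the sketch assumes the C$DOACROSS directive accepts an array in its LOCAL clause. TMP plays the role that DIST_SQ plays in the example: it is completely rewritten before it is read in every iteration.

```fortran
      PROGRAM LOCDEM
C     HYPOTHETICAL SKETCH, NOT PART OF THE EXAMPLE PROGRAM
      INTEGER N, I, K
      REAL P(8,3), R2(8), TMP(3)
      N = 8
      DO I = 1, N
         DO K = 1, 3
            P(I,K) = REAL(I + K)
         END DO
      END DO
C     TMP HOLDS NO VALUE THAT SURVIVES AN ITERATION, SO EACH
C     PROCESS CAN SAFELY WORK ON ITS OWN PRIVATE COPY.
C$DOACROSS LOCAL(I, K, TMP), SHARE(N, P, R2)
      DO I = 1, N
         DO K = 1, 3
            TMP(K) = P(I,K) ** 2
         END DO
         R2(I) = TMP(1) + TMP(2) + TMP(3)
      END DO
      PRINT *, R2(1), R2(N)
      END
```

If TMP were declared SHARE instead, two iterations running at the same time could overwrite each other's temporaries before the sum into R2(I) is formed.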

The variable FORCE is the crux of the problem. The references to
FORCE(I,*) are all right. Because each iteration of the outer loop gets a
different value of I, each iteration uses a different FORCE(I,*). If this
were the only use of FORCE, we could make FORCE a SHARE variable. However,
FORCE(J,*) prevents this. In each iteration of the inner loop, something
may be added to