Chapter 5: Fortran Enhancements for Multiprocessors
For example,
c$doacross local(i), REDUCTION(big_sum, big_prod, big_min, big_max)
      do i = 1,N
         big_sum = big_sum + a(i)
         big_prod = big_prod * a(i)
         big_min = min(big_min, a(i))
         big_max = max(big_max, a(i))
      end do
One further reduction is noteworthy.
      DO I = 1, N
         TOTAL = 0.0
         DO J = 1, M
            TOTAL = TOTAL + A(J)
         END DO
         B(I) = C(I) * TOTAL
      END DO
Initially, it may look as if the reduction in the inner loop needs to be rewritten
in a parallel form. However, look at the outer I loop. Although TOTAL
cannot be made a LOCAL variable in the inner loop, it fulfills the criteria for
a LOCAL variable in the outer loop: the value of TOTAL in each iteration of
the outer loop does not depend on the value of TOTAL in any other iteration
of the outer loop. Thus, you do not have to rewrite the loop; you can
parallelize this reduction on the outer I loop, making TOTAL and J local
variables.
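As a minimal sketch of the outer-loop approach just described, the program below places the directive on the I loop and lists I, J, and TOTAL as local. The program name, the array sizes, and the initial data are assumptions made only for illustration, and the SHARE list is one plausible choice rather than a required form. A compiler that does not recognize the directive treats it as a comment and runs the loop serially.

```fortran
      program outer
c     Hypothetical sizes and data, chosen only for illustration.
      integer n, m
      parameter (n = 4, m = 3)
      real a(m), b(n), c(n), total
      integer i, j
      do j = 1, m
         a(j) = real(j)
      end do
      do i = 1, n
         c(i) = real(i)
      end do
c     TOTAL is reset at the top of every outer iteration, so each
c     iteration can keep its own copy of TOTAL and of J.
c$doacross local(i, j, total), share(a, b, c)
      do i = 1, n
         total = 0.0
         do j = 1, m
            total = total + a(j)
         end do
         b(i) = c(i) * total
      end do
c     With the data above, B(I) is 6.0 * I.
      print *, (b(i), i = 1, n)
      end
```

The inner J loop stays an ordinary serial reduction; only the iterations of the outer loop are divided among the processes.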
Work Quantum
A certain amount of overhead is associated with multiprocessing a loop. If
the loop does little work, the multiprocessed version can actually run slower
than the single-processor version. To avoid this, make the amount
of work inside the multiprocessed region as large as possible.
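One way to picture this advice is a loop nest in which the directive goes on the outermost loop, so the multiprocessing overhead is paid once and each iteration carries a whole inner loop of work. The sketch below is an assumption-laden illustration, not a measured result: the program name, the array X, and the sizes N and M are all hypothetical. Placing the directive on the inner J loop instead would start a parallel region N times, each with only M small assignments to share out.

```fortran
      program quantum
c     Hypothetical sizes, chosen only for illustration.
      integer n, m
      parameter (n = 100, m = 200)
      real x(m, n)
      integer i, j
c     One multiprocessed region; each outer iteration performs
c     M assignments, so the work per iteration is as large as
c     this nest allows.
c$doacross local(i, j), share(x)
      do i = 1, n
         do j = 1, m
            x(j, i) = real(i + j)
         end do
      end do
      print *, x(1, 1), x(m, n)
      end
```

The inner loop runs over the first subscript of X, which keeps each process working through contiguous storage in Fortran's column-major layout.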










