Technical data
Cray Standard C/C++ Reference Manual
contains the taskloop loop. This provides a mechanism for exploiting parallelism
in loops that contain reduction computations. The endloop directive can appear
only in a parallel region. The format of the endloop directive is as follows:
#pragma _CRI endloop
In the following example, a parallel region is defined that uses a
taskloop/endloop pair and a guarded region to implement a reduction
computation.
sum = 0;
big = -1;
#pragma _CRI parallel private(i,xsum,xbig) shared(aa,bb,cc,sum,big)
   xsum = 0;
   xbig = -1;
#pragma _CRI taskloop vector
   for (i = 0; i < 2000; i++) {
      ...
      xsum = xsum + aa[i]*(bb[i]-cc[aa[i]]);
      xbig = max(abs(aa[i]*bb[i]), xbig);
      ...
   }
#pragma _CRI guard /* protect the update of sum and big */
   sum = sum + xsum;
   big = max(xbig, big);
#pragma _CRI endguard
#pragma _CRI endloop
   ...
   /* ensure that all processors have contributed to */
   /* the sum; all processors are held here until */
   /* all contributions are in, ensuring that the */
   /* value of sum and big will be correct for their */
   /* later use within the parallel region. */
   ...
   if (sum > 1000.0) {
      ...
   }
#pragma _CRI endparallel
In this example, the guarded region protects the update of sum and big, so
that each processor does its own update without interference from the others.
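The same pattern (private partial results, a protected update of the shared
totals, then a synchronization point) can be sketched in portable C for readers
who cannot compile the Cray directives. The sketch below assumes OpenMP pragmas
as a rough analogue: parallel for the parallel region, for nowait for taskloop,
critical for the guarded region, and barrier for endloop. This mapping is an
assumption made for illustration and is not taken from this manual. The
function name reduce_sum_big, the helper max_long, and the choice of long
operands are hypothetical.

```c
#include <stdlib.h>   /* labs */

/* Hypothetical helper: the larger of two long values. */
static long max_long(long a, long b) { return a > b ? a : b; }

/*
 * Reduction using per-thread partial results.
 * Assumption: the OpenMP pragmas stand in for the Cray directives.
 * A compiler without OpenMP enabled ignores them and runs the same
 * code serially, producing the same integer results.
 * Every aa[i] must be a valid index into cc.
 */
void reduce_sum_big(const long *aa, const long *bb, const long *cc,
                    int n, long *sum_out, long *big_out)
{
    long sum = 0;
    long big = -1;

    #pragma omp parallel shared(aa, bb, cc, sum, big)
    {
        long xsum = 0;    /* private partial sum */
        long xbig = -1;   /* private partial maximum */
        int i;

        /* Work-shared loop; nowait leaves synchronization to the barrier. */
        #pragma omp for nowait
        for (i = 0; i < n; i++) {
            xsum = xsum + aa[i] * (bb[i] - cc[aa[i]]);
            xbig = max_long(labs(aa[i] * bb[i]), xbig);
        }

        /* One thread at a time folds its partial results into the totals. */
        #pragma omp critical
        {
            sum = sum + xsum;
            big = max_long(xbig, big);
        }

        /* All threads wait here, so sum and big are complete afterward. */
        #pragma omp barrier
    }

    *sum_out = sum;
    *big_out = big;
}
```

Because each thread accumulates into its own xsum and xbig, the protected
section is entered once per thread rather than once per iteration, which keeps
contention on the shared totals low.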