Cray XMT™ Programming Environment User’s Guide
After recompiling this code with automatic scalar replacement enabled, the compiler
is able to transform the foobar2 routine into something that resembles the
following:
myTwoInts foobar2(myTwoInts t, int n, int * restrict foo) {
    int __tmp_t_i = t.i;
    for (int i = 0; i < n; i++) {
        __tmp_t_i += foo[i];
    }
    t.i = __tmp_t_i;
    return t;
}
Note that the compiler does not bother creating a temporary variable for the unused
field j.
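The equivalence of the two forms can be checked by hand with a small, portable C sketch. The layout of myTwoInts (two int fields, i and j) and the body of the untransformed routine are assumptions inferred from the listings on this page. The names foobar2_original and foobar2_replaced are hypothetical and exist only for this illustration; they are not produced by the compiler.

```c
/* Assumed layout: this page mentions only the fields i and j. */
typedef struct {
    int i;
    int j;
} myTwoInts;

/* Assumed original form: the loop updates the field t.i directly. */
myTwoInts foobar2_original(myTwoInts t, int n, const int *restrict foo) {
    for (int k = 0; k < n; k++) {
        t.i += foo[k];
    }
    return t;
}

/* Hand-written scalar replacement: t.i is copied into a local
 * temporary, accumulated there, and stored back after the loop.
 * The unused field j is never copied. */
myTwoInts foobar2_replaced(myTwoInts t, int n, const int *restrict foo) {
    int tmp_t_i = t.i;
    for (int k = 0; k < n; k++) {
        tmp_t_i += foo[k];
    }
    t.i = tmp_t_i;
    return t;
}
```

Called with the same arguments, both routines return the same structure. In the second form the loop body touches only a local scalar, which is the shape a reduction analysis can recognize.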
After this transformation, the compiler is better able to analyze the dependencies in
the loop and to determine that the loop can be safely parallelized as a reduction. This
can be seen in the canal report of the recompiled code:
           | myTwoInts foobar2(myTwoInts t, int n, int * restrict foo) {
** scalar replacing t
           |   for (int i = 0; i < n; i++) {
   18 P:$  |     t.i += foo[i];
** reduction moved out of 1 loop
           |   }
           |   return t;
           | }
Scalar replacement of aggregates can enable parallelization of many additional loops.
However, it can also introduce extra memory references, which can adversely affect
performance. For this reason, the compiler performs scalar replacement only when
the programmer requests it. You can enable automatic scalar replacement of aggregates
either with a command-line flag at compile time or with pragmas in your code. If you
compile a file with the -scalar_replacement flag, the compiler attempts scalar
replacement on every aggregate that it can prove is safely replaceable, unless that
aggregate has been marked with an mta no replace pragma. (See Semantic Assertions
on page 125.) You can use the noalias pragmas and restrict type qualifiers as
needed to indicate to the compiler that certain aggregates, or pointers to
aggregates, are safe to replace.
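The need for a safety proof can be illustrated in portable C. The sketch below assumes a hypothetical pair of routines, add_n_in_place and add_n_replaced, and the same assumed two-int layout for myTwoInts; none of these names come from the Cray XMT compiler. When the pointer src may refer to the field being updated, keeping that field in a temporary changes the result, which is why an aggregate reached through a pointer is replaced only when the compiler can prove, or is told, that no such aliasing exists.

```c
/* Assumed layout: this page mentions only the fields i and j. */
typedef struct {
    int i;
    int j;
} myTwoInts;

/* Adds *src to t->i n times, reading and writing memory on every
 * iteration. If src points at t->i, each iteration sees the
 * updated value. */
void add_n_in_place(myTwoInts *t, int n, const int *src) {
    for (int k = 0; k < n; k++) {
        t->i += *src;
    }
}

/* Hand-written scalar replacement of t->i. The store back to t->i
 * is delayed until after the loop, so if src points at t->i, every
 * iteration reads the stale original value. */
void add_n_replaced(myTwoInts *t, int n, const int *src) {
    int tmp = t->i;
    for (int k = 0; k < n; k++) {
        tmp += *src;
    }
    t->i = tmp;
}
```

With a separate source value, the two routines agree. With src set to the address of t->i, a starting value of 1, and n equal to 3, the in-place routine yields 8 (1, 2, 4, 8), while the replaced routine yields 4 (1 + 1 + 1 + 1). The replacement is therefore valid only under a no-alias guarantee.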
S–2479–20