Cray XMT™ Programming Environment User’s Guide

After recompiling this code with automatic scalar replacement enabled, the compiler is able to transform the foobar2 routine into something that resembles the following:

myTwoInts foobar2(myTwoInts t, int n, int * restrict foo) {
    int __tmp_t_i = t.i;    /* compiler-generated temporary for field i */
    for (int i = 0; i < n; i++) {
        __tmp_t_i += foo[i];
    }
    t.i = __tmp_t_i;        /* copy the result back into the aggregate */
    return t;
}

Note that the compiler does not bother creating a temporary variable for the unused field j.
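The equivalence of the two forms can be checked by hand. The following is a minimal sketch, not compiler output: it assumes that myTwoInts holds two int fields named i and j, and the names foobar2_original and foobar2_replaced are hypothetical labels for the loop as written and its scalar-replaced counterpart. The loop index is renamed k only to keep it visually distinct from the field i.

```c
/* Assumed layout: myTwoInts is taken to hold two int fields, i and j. */
typedef struct {
    int i;
    int j;
} myTwoInts;

/* Loop as written: every iteration updates the field inside the aggregate. */
myTwoInts foobar2_original(myTwoInts t, int n, int * restrict foo) {
    for (int k = 0; k < n; k++) {
        t.i += foo[k];
    }
    return t;
}

/* Hand-written scalar-replaced form: field i is copied into a scalar
 * temporary, the loop accumulates into that scalar, and the result is
 * copied back once after the loop. Field j is never read or written
 * inside the loop, so it needs no temporary. */
myTwoInts foobar2_replaced(myTwoInts t, int n, int * restrict foo) {
    int tmp_t_i = t.i;
    for (int k = 0; k < n; k++) {
        tmp_t_i += foo[k];
    }
    t.i = tmp_t_i;
    return t;
}
```

Calling both routines with the same arguments should return structures whose i and j fields match, which is what makes the transformation safe for the compiler to apply.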

After this transformation, the compiler is better able to analyze the dependencies in the loop and to determine that the loop can be safely parallelized as a reduction. This can be seen in the canal report of the recompiled code:

           | myTwoInts foobar2(myTwoInts t, int n, int * restrict foo) {
** scalar replacing t
           |   for (int i = 0; i < n; i++) {
   18 P:$  |     t.i += foo[i];
** reduction moved out of 1 loop
           |   return t;

Scalar replacement of aggregates can enable parallelization of many additional loops. However, it can also add memory references, which can adversely affect performance. For this reason, the compiler performs scalar replacement only when the programmer requests it. Automatic scalar replacement of aggregates can be enabled either with a command-line flag at compile time or with pragmas in your code. If you compile a file with the -scalar_replacement flag, the compiler automatically attempts to perform scalar replacement on any aggregates that it can prove are safely replaceable, unless those aggregates have been marked with an mta no replace pragma. (See Semantic Assertions on page 125.) You can use the noalias pragmas and restrict type qualifiers as needed to indicate to the compiler that certain aggregates, or pointers to aggregates, are safe to replace.
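As an illustration of the restrict qualifier applied to a pointer to an aggregate, consider the following sketch. It uses only standard C99; the routine name accumulate_into and the two-int layout of myTwoInts are assumptions made for this example, and whether the compiler actually replaces the aggregate in such a routine is not confirmed here and would need to be checked in the canal report.

```c
/* Assumed layout: myTwoInts is taken to hold two int fields, i and j. */
typedef struct {
    int i;
    int j;
} myTwoInts;

/* Hypothetical routine. Both pointers are restrict-qualified, which is the
 * programmer's promise that the aggregate reached through t does not
 * overlap the array reached through foo. With that promise, a store to
 * t->i cannot change any foo[k]. */
void accumulate_into(myTwoInts * restrict t, int n, const int * restrict foo) {
    for (int k = 0; k < n; k++) {
        t->i += foo[k];
    }
}
```

Without the qualifiers, the compiler would have to assume that foo might point into the structure itself, so each store to t->i could alter a later foo[k].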
