Cray Standard C/C++ Reference Manual
declared in that block will be squandered. Neither of these conditions is
detected by the compiler.
3.10.10 unroll Directive
Scope: Local
The unrolling directive allows the user to control unrolling for individual loops.
Loop unrolling can improve program performance by revealing cross-iteration
memory optimization opportunities such as read-after-write and read-after-read.
The effects of loop unrolling also include:
• Improved loop scheduling by increasing basic block size
• Reduced loop overhead
• Improved chances for cache hits
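To make the read-after-read case concrete, the following is a minimal hand-written sketch of what unrolling by 2 exposes. It is an illustration only, not the code the compiler generates; the function names neighbor_sums_rolled and neighbor_sums_unrolled2 are hypothetical, and the arrays are assumed not to overlap.

```c
#include <stddef.h>

/* Rolled form: b[i] = a[i] + a[i+1]. The array a must hold n + 1
   elements. Each a[i+1] is loaded again as a[i] on the next trip. */
void neighbor_sums_rolled(double *b, const double *a, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        b[i] = a[i] + a[i + 1];
}

/* Hand-unrolled by 2 (illustrative only). With two copies of the body
   side by side, the shared operand a[i+1] is visible in one basic
   block and can be loaded once. Assumes a and b do not overlap. */
void neighbor_sums_unrolled2(double *b, const double *a, size_t n)
{
    size_t i;

    for (i = 0; i + 1 < n; i += 2) {
        double mid = a[i + 1];      /* read once, used by both copies */
        b[i]     = a[i] + mid;
        b[i + 1] = mid + a[i + 2];
    }
    for (; i < n; i++)              /* remainder when n is odd */
        b[i] = a[i] + a[i + 1];
}
```

The unrolled loop also executes about half as many increment, compare, and branch operations, which is the reduced loop overhead listed above.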
The format for this compiler directive is as follows:
#pragma _CRI unroll [n]
The n argument specifies the total number of loop body copies to be generated. n
must be in the range of 2 through 63.
If you do not specify a value for n, the compiler attempts to determine the
number of copies to generate based on the number of statements in the loop nest.
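As a sketch of how the directive might be placed, the following requests four copies of the loop body. The function scale is a hypothetical example. The guard on the _CRAYC macro is an assumption made so that other compilers never see the directive; it is not required by the directive itself.

```c
#include <stddef.h>

/* Hypothetical helper: multiply n elements of x by alpha in place.
   The trip count n is known before the loop is entered, as the
   directive requires. */
void scale(double *x, size_t n, double alpha)
{
    size_t i;

    /* Assumption: _CRAYC identifies the Cray C compiler. Without the
       guard, other compilers would ignore or warn about the pragma. */
#ifdef _CRAYC
#pragma _CRI unroll 4
#endif
    for (i = 0; i < n; i++)
        x[i] *= alpha;
}
```

Omitting the 4 would leave the choice of the number of copies to the compiler.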
Caution: If placed prior to a noninnermost loop, the unroll directive asserts
that the following loop has no dependencies across iterations of that loop. If
dependencies exist, incorrect code could be generated.
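The following sketch illustrates why a dependency across outer-loop iterations is dangerous. It unrolls the outer loop by 2 by hand and fuses the two inner loops (often called unroll and jam). This is an assumption about the kind of transformation involved, not a description of the compiler's actual output; the function names and array sizes are hypothetical.

```c
#define ROWS 4
#define COLS 4

/* Original order: row i reads row i - 1, so each outer iteration
   depends on the one before it. */
void fill_in_order(int a[ROWS][COLS])
{
    int i, j;

    for (i = 1; i < ROWS; i++)
        for (j = 0; j < COLS - 1; j++)
            a[i][j] = a[i - 1][j + 1] + 1;
}

/* Outer loop unrolled by 2 and jammed by hand (illustrative only).
   The second statement reads a[i][j + 1] before the first statement
   has written it, so the result differs from fill_in_order. */
void fill_unroll_jam2(int a[ROWS][COLS])
{
    int i, j;

    for (i = 1; i + 1 < ROWS; i += 2)
        for (j = 0; j < COLS - 1; j++) {
            a[i][j]     = a[i - 1][j + 1] + 1;
            a[i + 1][j] = a[i][j + 1] + 1;  /* stale read */
        }
    for (; i < ROWS; i++)                   /* leftover outer iteration */
        for (j = 0; j < COLS - 1; j++)
            a[i][j] = a[i - 1][j + 1] + 1;
}
```

Starting from an all-zero array, fill_in_order stores 2 in a[2][0], while fill_unroll_jam2 stores 1 there, because a[1][1] is still 0 when it is read.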
The unroll compiler directive can be used only on loops with iteration counts
that can be calculated before entering the loop. If unroll is specified on a
loop that is not the innermost loop in a loop nest, the inner loops must be
nested perfectly. That is, each enclosing loop in the nest can contain only one
loop and no other statements, and only the innermost loop can contain work.
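As an illustration of the nesting rule, the sketch below contrasts a perfectly nested loop pair with one that is not. The function names, the array sizes N and M, and the guard on the _CRAYC macro (assumed to identify the Cray compiler) are hypothetical choices made for this example.

```c
#define N 8
#define M 8

/* Perfectly nested: the outer loop contains only the inner loop, and
   all work is in the innermost loop. Trip counts are constants, and
   no outer iteration depends on another, provided that c does not
   overlap a or b. */
void add_matrices(double c[N][M], double a[N][M], double b[N][M])
{
    int i, j;

#ifdef _CRAYC
#pragma _CRI unroll 2
#endif
    for (i = 0; i < N; i++)
        for (j = 0; j < M; j++)
            c[i][j] = a[i][j] + b[i][j];
}

/* Not perfectly nested: the outer loop body holds a statement in
   addition to the inner loop, so the directive is not placed on the
   outer loop here. */
void row_sums(double s[N], double a[N][M])
{
    int i, j;

    for (i = 0; i < N; i++) {
        s[i] = 0.0;                 /* work outside the innermost loop */
        for (j = 0; j < M; j++)
            s[i] += a[i][j];
    }
}
```

In row_sums, the directive could still be placed on the inner j loop, because that loop is the innermost one.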
The compiler can be directed to attempt to unroll all loops generated for the
program with the -h unroll command line option (see Section 2.14.2, page 25).
On UNICOS/mk systems, the amount of unrolling specified on the unroll
directive overrides the amount chosen by the compiler when the -h unroll command