
Cray Standard C/C++ Reference Manual
On UNICOS/mk systems, the compiler can perform "vectorization-like"
optimizations on certain loops. Vector versions of the following functions are
used when the function appears in a vectorizable loop on UNICOS/mk systems:
alog(3m), exp(3m), sqrt(3m), ranf(3m), sin(3m), cos(3m), coss(3m), pow(3m),
and _popcnt(3i). This vectorization is performed using the following process:
1. The loop is stripmined. Stripmining is a single-processor optimization
technique in which arrays and the program loops that reference them
are split into optimally-sized blocks, termed strips. The original loop is
transformed into two nested loops. The inner loop references all data
elements within a single strip, and the outer loop selects the strip to be
addressed in the inner loop. This technique is often performed by the
compiler to maximize the usage of cache memory or as part of vector code
generation.
2. If necessary, a strip of operands is stored in a temporary array. The vector
version of the function is called, which stores the strip of results in a
temporary array.
3. The remainder of the loop is computed using the results from step 2.
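The three steps above can be sketched by hand in C. This is only an illustration of the transformation, not the code the compiler actually generates: the strip length STRIP, the routine popcount_strip (a stand-in for a vector version of a library function), and the function names distance_scalar and distance_stripmined are all hypothetical and chosen for this example.

```c
#include <stddef.h>

#define STRIP 64  /* assumed strip length, for illustration only */

/* Portable scalar population count (number of 1 bits in v). */
static unsigned popcount_scalar(unsigned long v)
{
    unsigned count = 0;
    while (v != 0) {
        v &= v - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Hypothetical stand-in for a vector library routine:
   one call processes a whole strip of operands. */
static void popcount_strip(const unsigned long *in, unsigned *out, size_t len)
{
    size_t k;
    for (k = 0; k < len; k++)
        out[k] = popcount_scalar(in[k]);
}

/* Original loop: one scalar function call per iteration. */
void distance_scalar(const unsigned long *a, const unsigned long *b,
                     unsigned *d, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        d[i] = 2 * popcount_scalar(a[i] ^ b[i]);
}

/* The same loop after stripmining, written out by hand. */
void distance_stripmined(const unsigned long *a, const unsigned long *b,
                         unsigned *d, size_t n)
{
    unsigned long operands[STRIP];   /* temporary array of operands */
    unsigned results[STRIP];         /* temporary array of results  */
    size_t start, k, len;

    /* Step 1: the outer loop selects a strip; the inner loops stay within it. */
    for (start = 0; start < n; start += STRIP) {
        len = (n - start < STRIP) ? (n - start) : STRIP;

        /* Step 2: store a strip of operands, then call the strip routine. */
        for (k = 0; k < len; k++)
            operands[k] = a[start + k] ^ b[start + k];
        popcount_strip(operands, results, len);

        /* Step 3: compute the remainder of the loop body from the results. */
        for (k = 0; k < len; k++)
            d[start + k] = 2 * results[k];
    }
}
```

Both functions produce the same values in d for any n; the stripmined form replaces n scalar calls with one call per strip, and the last strip is shorter when n is not a multiple of STRIP.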
The subsections that follow describe the compiler directives used to control
vectorization on UNICOS systems and "vectorization-like" optimizations on
UNICOS/mk systems.
3.7.1 ivdep Directive
Scope: Local
The ivdep directive tells the compiler to ignore vector dependencies for
the loop immediately following the directive. Conditions other than vector
dependencies can still inhibit vectorization; if any such condition is present,
the loop is not vectorized despite the directive. This directive is useful for
some loops that contain pointers and indirect addressing, where the compiler
cannot prove on its own that no dependency exists. The format of this directive
is as follows:
#pragma _CRI ivdep
The following example illustrates the use of the ivdep compiler directive:
p = a; q = b;
#pragma _CRI ivdep
for (i = 0; i < n; i++) { /* Vectorized */
    *p++ = *q++;
}
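The indirect-addressing case can be sketched in the same way. The function name scatter_add and its parameters are hypothetical, and the example assumes the caller guarantees that idx contains no repeated values; only under that assumption is it safe to tell the compiler to ignore the apparent dependency between iterations.

```c
#include <stddef.h>

/* Adds b[i] into a[idx[i]] for i = 0 .. n-1.
   Assumption (not checked here): idx holds no repeated values, so no two
   iterations touch the same element of a. The compiler cannot verify this
   on its own, which is why the directive is supplied. */
void scatter_add(double *a, const int *idx, const double *b, int n)
{
    int i;
#pragma _CRI ivdep
    for (i = 0; i < n; i++) {
        a[idx[i]] += b[i];
    }
}
```

If idx could contain duplicates, two iterations might update the same element, and the directive would have to be removed because the dependency would be real. A compiler that does not recognize the pragma ignores it and compiles the loop as ordinary scalar code.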