Optimizing Itanium-Based Applications (May 2011)

On Itanium, the benefit of forming these contractions can be significant. Contractions can be enabled
and disabled in different blocks of code using the FP_CONTRACT pragma. FP_CONTRACT OFF
overrides any prior pragma or +Ofltacc=strict option. FP_CONTRACT ON has no effect other
than undoing a prior FP_CONTRACT OFF, and is overridden by +Ofltacc=strict.
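
As a minimal sketch of block-level control, the two functions below evaluate the same expression with contraction permitted and with contraction disabled. The function names are hypothetical, and the sketch assumes the compiler accepts the standard C99 spelling #pragma STDC FP_CONTRACT at the start of a compound statement; a compiler that does not recognize the pragma simply ignores it.

```c
/* Contraction permitted: the compiler may evaluate a*b + c as one
   fused multiply-add, rounding only once. */
double mul_add_contractible(double a, double b, double c)
{
    #pragma STDC FP_CONTRACT ON
    return a * b + c;
}

/* Contraction disabled for this block: the product is rounded to
   double before the addition, so the result is rounded twice. */
double mul_add_separate(double a, double b, double c)
{
    #pragma STDC FP_CONTRACT OFF
    return a * b + c;
}
```

The two versions agree whenever the product is exactly representable. They can differ otherwise: with a = 1 + 2^-27, b = 1 - 2^-27, and c = -1, the separately rounded product is exactly 1.0, so the unfused result is 0.0, whereas a fused evaluation returns -2^-54.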
+Ofltacc=limited enables a small number of other value-changing optimizations in addition to
the contractions. These optimizations can prevent the propagation of Not-a-Numbers (NaNs), infinities,
and the sign of zero. For example, replacing 0.0*x with 0.0 discards the NaN that results when x is a
NaN or an infinity, and the negative zero that results when x is negative.
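
The effect of the 0.0*x example can be checked with a short sketch. The helper names are hypothetical, and the volatile qualifier is used here only to keep the compiler from folding the multiplication, so that the IEEE result is actually computed.

```c
#include <math.h>

/* Compute 0.0 * x at run time; volatile prevents the compiler from
   replacing the product with a constant. */
double zero_times(double x)
{
    volatile double zero = 0.0;
    return zero * x;
}

/* Return 1 if replacing 0.0 * x with +0.0 would change the result,
   that is, if the true product is a NaN or a negative zero. */
int folding_changes_result(double x)
{
    double r = zero_times(x);
    return isnan(r) || signbit(r) != 0;
}
```

Under IEEE arithmetic, folding_changes_result reports a difference for a NaN, for either infinity, and for any negative x, and reports none for a positive finite x.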
The most aggressive floating-point optimizations are enabled with +Ofltacc=relaxed (or its
equivalent +Onofltacc). For example, faster and more efficient floating-point divide sequences are
enabled under relaxed accuracy.
Additionally, optimizations that reassociate floating-point computation are enabled with
+Ofltacc=relaxed. For example, the sum reduction optimization, which hides floating-point add
latency by computing partial sums, can be enabled in C or C++ with +Ofltacc=relaxed. It also
enables loop optimizations such as fusion, distribution, blocking, unroll and jam, and interchange in
loops with floating-point accesses. For Fortran, these optimizations are already enabled because
reassociation that does not violate explicit parentheses is always legal.
Finally, +Ofltacc=relaxed implies the +Ocxlimitedrange option (described below).
+O[no]sumreduction
[Dis]allows the sum reduction optimization, regardless of the floating-point accuracy setting. Use
+Osumreduction to enable optimization of sum reductions via the computation of partial sums for C or C++
without having to specify the more aggressive, and less safe, +Ofltacc=relaxed. Conversely,
+Onosumreduction can be used to disallow the sum reduction optimization under a floating-point
accuracy setting where it is normally allowed (for example, by default for Fortran, where the language
standard allows this type of reassociation).
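
To see why sum reduction counts as a value-changing optimization, the sketch below writes the partial-sum transformation by hand. The function names and the choice of two partial sums are assumptions made for illustration; the compiler's own transformation may use a different number of accumulators.

```c
#include <stddef.h>

/* Strict left-to-right summation: each add depends on the previous
   one, so the floating-point add latency is fully exposed. */
double sum_sequential(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-written sum reduction with two independent partial sums.
   The two adds in the loop body do not depend on each other, so
   their latencies can overlap, but the additions are reassociated. */
double sum_two_partials(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    if (i < n)          /* odd element count: pick up the last element */
        s0 += a[i];
    return s0 + s1;
}
```

For the input {1e16, 1.0, -1e16, 1.0}, and assuming IEEE double arithmetic with no excess precision, the sequential sum is 1.0 because the first 1.0 is absorbed by 1e16, while the two-partial-sum version returns 2.0. Neither order is wrong in itself; the point is that reassociation can change the result.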
+O[no]cxlimitedrange
(default +Onocxlimitedrange for C, +Ocxlimitedrange for Fortran)
#pragma STDC CX_LIMITED_RANGE [ON/OFF/DEFAULT]
You can use this option to obtain faster complex arithmetic sequences when an application does not
rely on out-of-range floating-point values. The option indicates whether out-of-range floating-point
values (for example, NaNs and infinities) can occur and must be preserved. With
+Ocxlimitedrange, out-of-range floating-point values might not be preserved. The CX_LIMITED_RANGE
pragma enables limited range behavior for specific blocks of code, whereas the option is global.
CX_LIMITED_RANGE ON overrides +Onocxlimitedrange, and CX_LIMITED_RANGE OFF
has no effect except to undo a prior CX_LIMITED_RANGE ON or +Ocxlimitedrange.
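
The following sketch shows the kind of value that limited range arithmetic can lose. It implements the textbook complex product (a+bi)(c+di) = (ac-bd) + (ad+bc)i, which is the formula the C standard permits under CX_LIMITED_RANGE ON. The struct and function names are hypothetical, and the code is not a reproduction of the sequence the compiler actually emits.

```c
#include <math.h>

struct cplx { double re, im; };

/* Textbook complex multiply with no recovery step for infinite
   operands. It is short and fast, but an infinite operand can turn
   into NaNs instead of producing an infinite result. */
struct cplx mul_limited_range(struct cplx x, struct cplx y)
{
    struct cplx r;
    r.re = x.re * y.re - x.im * y.im;
    r.im = x.re * y.im + x.im * y.re;
    return r;
}
```

Multiplying (inf + inf i) by (1 + 0i) with this formula gives NaN + NaN i, because both parts involve inf*0. The full-range rules of C99 Annex G call for an infinite result here, which requires extra checking code after the multiply.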
+O[no]fenvaccess (default +Onofenvaccess)
#pragma STDC FENV_ACCESS [ON/OFF/DEFAULT]
#pragma FLOAT_TRAPS_ON
+Ofenvaccess disables any optimizations that might affect behavior under non-default
floating-point modes (for example, alternate rounding directions or trap enables) or where floating-point
exception flags are queried. It can also be enabled locally using either the FENV_ACCESS or
FLOAT_TRAPS_ON pragmas. FENV_ACCESS ON and FLOAT_TRAPS_ON override
+Onofenvaccess. FENV_ACCESS OFF has no effect other than to undo a prior FENV_ACCESS
ON, FLOAT_TRAPS_ON, or +Ofenvaccess. Enabling fenvaccess, for example, prevents dead
code elimination of instructions that can raise exceptions, results in longer floating-point-to-integer
conversion sequences that explicitly check for out-of-range results, and results in longer floating-point
division sequences.