Optimizing Itanium-Based Applications (May 2011)


On Itanium, the benefit of forming these contractions can be significant. Contractions can be enabled and disabled in different blocks of code using the FP_CONTRACT pragma. FP_CONTRACT OFF overrides any prior pragma or +Ofltacc=strict option. FP_CONTRACT ON has no effect other than undoing a prior FP_CONTRACT OFF, and is overridden by +Ofltacc=strict.
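
As a minimal sketch of block-level control, the fragment below places the pragma at the start of two function bodies. It assumes the standard C99 spelling, #pragma STDC FP_CONTRACT; the function names are hypothetical and serve only as illustration.

```c
/* Hypothetical helpers; only the FP_CONTRACT pragma itself comes from the text. */

/* Contraction permitted: the compiler may evaluate a*b + c as a single
   fused multiply-add, with one rounding at the end. */
double madd_contracted(double a, double b, double c)
{
    #pragma STDC FP_CONTRACT ON
    return a * b + c;
}

/* Contraction forbidden: the product is rounded before the addition. */
double madd_separate(double a, double b, double c)
{
    #pragma STDC FP_CONTRACT OFF
    return a * b + c;
}
```

The two functions agree whenever the product a*b is exactly representable. They can differ in the last bit when it is not, which is why the setting is considered value-changing.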

+Ofltacc=limited enables a small number of other value-changing optimizations in addition to the contractions. These optimizations can prevent the propagation of Not-a-Numbers (NaNs), infinities, and the sign of zero. For example, replacing 0.0*x with 0.0 loses the NaN that IEEE arithmetic would produce when x is a NaN or an infinity, and loses the negative zero it would produce when x is negative.
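
The effect of the 0.0*x rewrite can be checked directly. The sketch below uses hypothetical helper names and assumes it is compiled without value-changing optimizations. It reports the inputs for which the IEEE product is not a positive zero, that is, the inputs for which the rewrite would change the result.

```c
#include <math.h>

/* Hypothetical helper: the multiplication as IEEE arithmetic defines it. */
double zero_times(double x)
{
    return 0.0 * x;
}

/* Returns 1 when replacing 0.0*x by 0.0 would change the value or its sign:
   the true product is a NaN (x is a NaN or an infinity) or a negative zero
   (x is negative). Returns 0 otherwise. */
int rewrite_changes_result(double x)
{
    double r = zero_times(x);
    return isnan(r) || signbit(r);
}
```

For instance, rewrite_changes_result(-1.0) is nonzero because 0.0 * -1.0 is -0.0, while rewrite_changes_result(2.0) is zero.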

The most aggressive floating-point optimizations are enabled with +Ofltacc=relaxed (or its equivalent +Onofltacc). For example, faster and more efficient floating-point divide sequences are enabled under relaxed accuracy.

Additionally, optimizations that reassociate floating-point computation are enabled with +Ofltacc=relaxed. For example, the sum reduction optimization, which hides floating-point add latency by computing partial sums, can be enabled in C or C++ with +Ofltacc=relaxed. +Ofltacc=relaxed also enables loop optimizations such as fusion, distribution, blocking, unroll-and-jam, and interchange in loops with floating-point accesses. For Fortran, these optimizations are already enabled because reassociation that does not violate explicit parentheses is always legal.
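
To illustrate what computing partial sums means, the sketch below writes the transformation out by hand. It is not the compiler's actual output. The function names and the choice of four accumulators are assumptions made for the example.

```c
#include <stddef.h>

/* Straightforward reduction: every add must wait for the previous one. */
double sum_serial(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += v[i];
    return s;
}

/* Hand-written partial sums: four independent add chains that can overlap
   in the pipeline, combined once at the end. The accumulator count of four
   is an assumption chosen for illustration. */
double sum_partial(const double *v, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; ++i)          /* leftover elements */
        s0 += v[i];
    return (s0 + s1) + (s2 + s3);
}
```

Because the second version adds the elements in a different order, it is a reassociation of the first. The two results can differ in rounding, which is why C and C++ require an explicit option before the compiler may do this.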

Finally, +Ofltacc=relaxed implies the +Ocxlimitedrange option (described below).

+O[no]sumreduction

Allows or disallows the sum reduction optimization, regardless of the floating-point accuracy setting. +Osumreduction enables optimization of sum reductions via the computation of partial sums for C or C++ without having to specify the more aggressive, and less safe, +Ofltacc=relaxed. Conversely, +Onosumreduction disallows the sum reduction optimization under a floating-point accuracy setting where it is normally allowed (for example, by default for Fortran, where the language standard allows this type of reassociation).
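
One reason to disallow the optimization is that floating-point addition is not associative, so a reordered sum can return a different value. The following sketch, with hypothetical function names and input values chosen for illustration, adds the same three numbers in two groupings.

```c
/* Sum three values strictly left to right, as written in the source. */
double sum_in_order(double a, double b, double c)
{
    return (a + b) + c;
}

/* The same three values, reassociated. */
double sum_reassociated(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20, and c = 1.0, the first grouping cancels the large terms before adding 1.0 and returns 1.0. In the second grouping the 1.0 is absorbed by -1e20 and the result is 0.0.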

+O[no]cxlimitedrange

(default +Onocxlimitedrange for C, +Ocxlimitedrange for Fortran)

#pragma STDC CX_LIMITED_RANGE [ON/OFF/DEFAULT]

You can use this option to obtain faster complex arithmetic sequences when an application does not rely on out-of-range floating-point values. This option indicates whether out-of-range floating-point values (for example, NaNs and infinities) can occur and must be preserved. With +Ocxlimitedrange, out-of-range floating-point values might not be preserved. The CX_LIMITED_RANGE pragma enables limited range behavior for specific blocks of code, whereas the option is global. CX_LIMITED_RANGE ON overrides +Onocxlimitedrange, and CX_LIMITED_RANGE OFF has no effect except to undo a prior CX_LIMITED_RANGE ON or +Ocxlimitedrange.

+O[no]fenvaccess (default +Onofenvaccess)

#pragma STDC FENV_ACCESS [ON/OFF/DEFAULT]

#pragma FLOAT_TRAPS_ON

+Ofenvaccess disables any optimizations that might affect behavior under non-default floating-point modes (for example, alternate rounding directions or trap enables) or where floating-point exception flags are queried. The same behavior can also be enabled locally using either the FENV_ACCESS or the FLOAT_TRAPS_ON pragma. FENV_ACCESS ON and FLOAT_TRAPS_ON override +Onofenvaccess. FENV_ACCESS OFF has no effect other than to undo a prior FENV_ACCESS ON, FLOAT_TRAPS_ON, or +Ofenvaccess. Enabling fenvaccess, for example, prevents dead code elimination of instructions that can raise exceptions, results in longer floating-point-to-integer conversion sequences that explicitly check for out-of-range results, and results in longer floating-point division sequences.