HP aC++/HP C Programmer's Guide (B3901-90036; A.06.26; September 2011)
+Oprefetch_latency
+Oprefetch_latency=cycles
The +Oprefetch_latency option applies to loops for which the compiler generates
data prefetch instructions. cycles represents the number of cycles for a data cache
miss. For a given loop, the compiler divides cycles by the estimated loop length to
arrive at the number of loop iterations for which to generate advanced prefetches.
cycles must be in the range of 0 to 10000. A value of 0 instructs the compiler to use
the default value, which is 480 cycles for loops containing floating-point accesses and
150 cycles for loops that do not contain any floating-point accesses.
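The division described above can be sketched in C as follows. This is only an illustration of the arithmetic, not the compiler's implementation: the function name prefetch_iterations and its parameters are hypothetical, the loop length is assumed to be expressed in cycles per iteration, and truncating integer division is an assumption, since the rounding rule is not stated here.

```c
#include <assert.h>

/* Hypothetical helper, not part of any HP API: estimates how many loop
 * iterations ahead prefetches would be issued for a given latency setting. */
static unsigned prefetch_iterations(unsigned cycles,
                                    unsigned loop_length,
                                    int has_fp_access)
{
    /* Documented range for +Oprefetch_latency is 0 to 10000. */
    assert(cycles <= 10000u);

    /* A value of 0 selects the documented defaults:
     * 480 cycles with floating-point accesses, 150 cycles without. */
    if (cycles == 0u)
        cycles = has_fp_access ? 480u : 150u;

    /* Guard against division by zero (assumption: no prefetch distance
     * can be derived for an empty loop estimate). */
    if (loop_length == 0u)
        return 0u;

    /* Assumption: truncating division; the actual rounding is unspecified. */
    return cycles / loop_length;
}
```

Under these assumptions, a floating-point loop estimated at 12 cycles per iteration would prefetch 40 iterations ahead with the default latency, and 80 iterations ahead with a setting of 960.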
For tuning purposes, measure application performance using a few different prefetch
latency settings to determine the optimal value. Some floating-point codes may benefit
from increasing the latency to 960 cycles. Parallel applications frequently benefit from
a shorter prefetch latency of 150 cycles.
+O[no]preserved_fpregs
+O[no]preserved_fpregs
The +O[no]preserved_fpregs option specifies whether the compiler is allowed [not
allowed] to make use of the preserved subset of the floating-point register file as defined
by the Itanium runtime architecture.
The default is +Opreserved_fpregs.
+O[no]rotating_fpregs
+O[no]rotating_fpregs
The +O[no]rotating_fpregs option specifies whether the compiler is allowed [not
allowed] to make use of the rotating subset of the floating-point register file.
The default is +Orotating_fpregs.
+O[no]sumreduction
+O[no]sumreduction
This option enables [disables] sum reduction optimization, which allows the compiler to
compute partial sums for faster computation. This transformation is not strictly permitted
in C or C++ because floating-point addition is not associative, so reordering the additions
can change the result. This option is useful if an application cannot use the +Onofltacc
option but wants sum reduction to be performed.
When sum reduction optimization is enabled, the compiler may evaluate intermediate
partial sums of float or double precision terms using (wider) extended precision, which
reduces variation in the result caused by different optimization strategies and generally
produces a more accurate result.
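To illustrate what a partial-sum transformation looks like, the following C sketch compares a strictly ordered sum with a hand-written version that uses two independent accumulators. The function names are hypothetical, and the code only mimics the shape of the transformation; it is not the code the compiler generates, and it uses plain double accumulators rather than the extended precision mentioned above.

```c
#include <stddef.h>

/* Strictly ordered sum: each addition depends on the previous one. */
static double sum_sequential(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-written sum reduction with two partial sums (illustrative only).
 * Even-indexed and odd-indexed elements accumulate independently, so the
 * two addition chains can overlap in the pipeline. */
static double sum_partial(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        s0 += a[i];

    return s0 + s1;     /* combine the partial sums at the end */
}
```

For inputs such as { 1e16, 1.0, -1e16, 1.0 }, the two functions can return different values, because the additions are grouped differently and each grouping rounds differently. This is why the optimization must be requested explicitly.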