HP-UX Floating-Point Guide

176 Chapter 7

Performance Tuning

Inefficient Code

use +Oaggressive +Onovectorize if you want aggressive optimization without vector calls.

Any files that were compiled with +Ovectorize must also be linked with +Ovectorize (this happens automatically when the compiler invokes the linker).

This option can be used at optimization levels 3 and 4. The default is +Onovectorize. This option is valid only when you compile for PA1.1 and PA2.0 systems.

If your PA2.0 application uses very large arrays, you may gain considerable performance benefit from using +Odataprefetch in conjunction with +Ovectorize. The math library contains special prefetching versions of the vector routines, which are called if you specify both options.
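As a rough illustration of the kind of code these options are aimed at, the C sketch below contains a simple loop over arrays. The function name scale_add and the command lines in the comment are illustrative assumptions. Whether the optimizer actually replaces this particular loop with a call to a vector routine is also an assumption, not something stated here.

```c
#include <stddef.h>

/*
 * Hypothetical example of a simple array loop.
 *
 * Assumed command lines (illustrative only; check your compiler's
 * documentation for the exact spelling and eligibility rules):
 *   cc +O3 +Ovectorize -c scale_add.c
 *   cc +O3 +DA2.0 +Ovectorize +Odataprefetch -c scale_add.c
 * Objects compiled with +Ovectorize must also be linked with it:
 *   cc +O3 +Ovectorize main.o scale_add.o -o app
 */
void scale_add(double a, const double *x, double *y, size_t n)
{
    size_t i;

    /* Compute y[i] = a * x[i] + y[i] for every element. */
    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

A caller would pass two arrays of equal length, for example scale_add(2.0, x, y, n). The result is the same with or without vectorization; only the generated code differs.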

Specifying the Architecture Type

All HP 9000 compilers support the +DA option, which specifies a particular target architecture type. Use of this option causes the compiler to produce architecture-specific instructions and calls to special architecture-specific run-time libraries. For details about +DA, see “Selecting Different Versions of the Math Libraries” on page 27.

Specifying the architecture type of the systems on which your code will run will probably improve the performance of your code if it makes substantial use of floating-point arithmetic or math library calls. See “Architecture Type of Run-Time System” on page 80 and “BLAS Library Versions” on page 180 for more information.

If your code will run on PA2.0 systems only, you will probably get the best performance if you use the default 32-bit code generation option (+DA2.0) rather than 64-bit code generation (+DA2.0W). Generate 64-bit code only if your application must do so to avoid the system constraints of 32-bit systems.
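To confirm which code generation mode a build actually used, a small check like the following C sketch may help. It infers the data model from type sizes and reports the target architecture from predefined macros. The function names are hypothetical, and the macros _PA_RISC2_0 and _PA_RISC1_1 are assumed to be predefined by the compiler for the corresponding +DA setting; on compilers that do not define them, the function falls back to "unknown".

```c
#include <stddef.h>

/*
 * Hypothetical helper: label the data model this file was compiled for.
 * 64-bit HP-UX code (+DA2.0W) uses LP64: long and pointers are 8 bytes.
 * 32-bit code (+DA2.0) uses ILP32: int, long, and pointers are 4 bytes.
 */
const char *data_model(void)
{
    if (sizeof(void *) == 8 && sizeof(long) == 8) {
        return "LP64";
    }
    if (sizeof(void *) == 4 && sizeof(long) == 4) {
        return "ILP32";
    }
    return "other";
}

/*
 * Hypothetical helper: report the PA-RISC target selected at compile time.
 * Assumes the compiler predefines _PA_RISC2_0 or _PA_RISC1_1.
 */
const char *pa_target(void)
{
#if defined(_PA_RISC2_0)
    return "PA2.0";
#elif defined(_PA_RISC1_1)
    return "PA1.1";
#else
    return "unknown";
#endif
}
```

Printing both values at program startup, for example with printf("%s %s\n", pa_target(), data_model()), gives a quick way to verify that a build intended for +DA2.0 did not accidentally pick up 64-bit settings.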