HP-UX Floating-Point Guide

Chapter 7 177
Performance Tuning
Inefficient Code
32-bit code can run faster than 64-bit code for the following reasons:
• Code and data size is smaller.
• Array indexing and pointer arithmetic are faster.
• Position-dependent (absolute) code is faster than
  position-independent code (PIC), and 64-bit code is always
  position-independent.
For more information, see the HP-UX 64-bit Porting and Transition
Guide.
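As a rough illustration of the data-size point, the following C sketch
computes the storage needed for an array of small records. The struct and
function names are hypothetical and are not taken from any HP library. It
assumes the usual data models, in which pointers and long integers occupy
4 bytes each in 32-bit code and 8 bytes each in 64-bit code.

```c
#include <stddef.h>

/* Hypothetical list node, used only for illustration. */
struct node {
    struct node *next;   /* typically 4 bytes in 32-bit code, 8 in 64-bit */
    long         value;  /* typically 4 bytes in 32-bit code, 8 in 64-bit */
};

/* Bytes needed for an array of n nodes under the current data model. */
size_t node_array_bytes(size_t n)
{
    return n * sizeof(struct node);
}
```

Under these assumptions the same array takes about twice as much memory
when the program is compiled as 64-bit code, so fewer nodes fit in each
cache line.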
Compiling with the +DA2.0 option to generate PA2.0 32-bit code improves
performance further if the source gives the compiler opportunities to
generate FMA (fused multiply-add) instructions (see “Architecture Type of
Run-Time System” on page 80 for details). For example, if two statements
like
    c = a * b
and
    e = c - d
are separated by intervening statements in your program, you may want
to place them one right after the other or to combine them into
    e = a * b - d
This kind of rearrangement will be most effective if done within loops.
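The rearrangement described above might look like the following in C. This
is a minimal sketch: the function names, parameters, and the running sum
that stands in for intervening code are all hypothetical. Whether an FMA
instruction is actually emitted depends on the compiler, the options used,
and the target architecture.

```c
#include <stddef.h>

/* Before: the multiply and the subtract are separated by an unrelated
   statement inside the loop. Returns the sum of d[0..n-1]. */
double mul_sub_separated(const double *a, const double *b, const double *d,
                         double *e, size_t n)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        double c = a[i] * b[i];
        sum += d[i];              /* intervening statement */
        e[i] = c - d[i];
    }
    return sum;
}

/* After: the multiply and the subtract are combined into one expression,
   which gives the compiler a direct candidate for a fused multiply-add. */
double mul_sub_combined(const double *a, const double *b, const double *d,
                        double *e, size_t n)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        e[i] = a[i] * b[i] - d[i];
        sum += d[i];
    }
    return sum;
}
```

Because a fused multiply-add rounds once rather than twice, the combined
form can produce results that differ slightly from the separated form when
the product is not exactly representable.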
The +DS option also has a significant effect on performance, because it
specifies an architecture-specific instruction scheduler. If your code must
be portable across all HP 9000 architectures, you must compile with
+DA1.1, but you may compile with either +DS1.1 or +DS2.0. Use
+DS2.0 if you want to achieve the best possible performance on PA2.0
systems. See the appropriate HP language reference manual for more
information about this option.
Including Debugging Information
All HP 9000 compilers allow you to include debugging information in the
object file at optimization levels 0, 1, and 2. Debugging information
increases the size of the object code, which can degrade performance. The
debugging option is extremely useful during program development, but for
the final product you should compile without it. (You may, however, prefer
to accept this performance degradation in order to make your product easier
to support.)