Optimizing Itanium-Based Applications (May 2011)

Performs Level 2 optimizations, plus optimizations across the entire application program.
Performs interprocedural optimizations (IPO) at link time, including improved range propagation
and alias analysis, cross module inlining, interprocedural data prefetching, dead variable and dead
function removal, variable privatization, short data optimization, data layout optimization, constant
propagation, and import stub inlining.
Performs indirect call promotion in whole program mode if dynamic PBO data is available
(+Oprofile=use).
Performs inlining of a larger set of math library routines into user code.
See the chapter on interprocedural optimizations below for more details.
This level of optimization limits the ability to debug the application. See Section 14.27 (Debugging
optimized code) in Debugging with GDB[2] for more information on this topic.
benefits:
Better alias information and inlining improve existing optimizations and enable additional ones beyond Level 2.
Applications containing many indirect calls or virtual function calls can benefit greatly from
indirect call promotion.
Data optimizations improve cache and TLB behavior.
Code optimizations reduce the number of instructions.
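
As an illustration of the indirect call promotion mentioned above, the following is a minimal hand-written sketch of the idea, not the compiler's actual output. The function names (add_one, negate, apply_indirect, apply_promoted) are hypothetical, and the sketch assumes that profile data showed add_one to be the dominant call target.

```c
#include <stddef.h>

/* Hypothetical call targets; the names are illustrative only. */
int add_one(int x) { return x + 1; }
int negate(int x)  { return -x; }

typedef int (*handler_fn)(int);

/* Original form: every iteration makes an indirect call through fp. */
int apply_indirect(handler_fn fp, const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += fp(v[i]);
    return sum;
}

/* Hand-written picture of the promoted form, assuming the profile
   showed that fp is almost always add_one: a pointer compare guards
   a direct call, which is then a candidate for inlining. */
int apply_promoted(handler_fn fp, const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (fp == add_one)
            sum += add_one(v[i]);   /* direct call to the hot target */
        else
            sum += fp(v[i]);        /* fallback keeps other targets correct */
    }
    return sum;
}
```

Both forms return the same result for any target; the promoted form only changes how the common case is reached.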
level three
+O3
description:
Performs Level 2 optimizations, plus optimizations across all functions in a single file.
Includes inlining and cloning of functions within the same file.
Performs high-level optimizations, such as loop transformations (interchange, fusion, unrolling, and so on).
See the section on loop optimization below.
Performs inlining of a larger set of math library routines into user code.
Recognizes simple copy loops and replaces them with calls to optimized memory copy routines.
Recognizes simple manually unrolled loops and rerolls them, enabling better unrolling decisions
for a given platform later in the loop optimizer.
This level of optimization limits the ability to debug the application. See Section 14.27 (Debugging
optimized code) in Debugging with GDB[2] for more information on this topic.
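
The copy-loop recognition and rerolling items in the description above can be pictured with the following sketch. It is hand-written for illustration: the function names are hypothetical, the standard memcpy stands in for whatever optimized copy routine the compiler actually selects, and the unrolled loop assumes that n is a multiple of 4.

```c
#include <stddef.h>
#include <string.h>

/* A simple element-by-element copy loop. */
void copy_loop(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Roughly what the replacement amounts to, assuming dst and src do
   not overlap. memcpy is a stand-in for the optimized routine. */
void copy_call(double *dst, const double *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

/* A loop unrolled by hand four times (assumes n is a multiple of 4). */
void scale_unrolled(double *a, double k, size_t n)
{
    for (size_t i = 0; i < n; i += 4) {
        a[i]     *= k;
        a[i + 1] *= k;
        a[i + 2] *= k;
        a[i + 3] *= k;
    }
}

/* The rerolled equivalent: one statement per iteration, which leaves
   the unroll factor for the loop optimizer to choose. */
void scale_rerolled(double *a, double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] *= k;
}
```

In both pairs the two functions compute the same result; the second form of each pair is the shape the optimizer works from.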
benefits:
Can produce faster code than Level 2. This is particularly true for numerical codes, which tend to
benefit more from the loop transformations, and for codes that frequently call small functions
within the same file or math library functions, which benefit from inlining.
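
To make the loop-transformation benefit concrete, here is a minimal sketch of loop interchange, one of the transformations listed in the description. It is written by hand for illustration; the array dimensions and function names are assumptions, and whether the compiler interchanges a particular loop nest depends on its own legality and profitability analysis.

```c
#include <stddef.h>

#define ROWS 4
#define COLS 8

/* Column-first traversal: the inner loop steps through memory with a
   stride of one full row, which tends to use the cache poorly. */
void scale_column_first(double m[ROWS][COLS], double k)
{
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            m[i][j] *= k;
}

/* After interchange: the inner loop walks contiguous elements of one
   row. Each element is updated independently, so swapping the loops
   does not change the result. */
void scale_row_first(double m[ROWS][COLS], double k)
{
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            m[i][j] *= k;
}
```

The example deliberately avoids a floating-point reduction, where reordering the iterations could change rounding.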
level four (level three -ipo)
+O4 or +O3 -ipo
description:
Performs Level 3 optimizations, plus optimizations across the entire application program.
Performs interprocedural optimizations at link time; see “level two -ipo” for a summary.
This level of optimization limits the ability to debug the application. See Section 14.27 (Debugging
optimized code) in Debugging with GDB[2] for more information on this topic.
benefits:
Interprocedural optimizations generally improve application performance (see “level two -ipo”).