HP-UX Floating-Point Guide

Chapter 7 173
Performance Tuning
Inefficient Code
+O[no]dataprefetch

+Odataprefetch, which has an
effect only when you use the +DA2.0
option, inserts instructions within
innermost loops to fetch data from
memory into the data cache ahead of
time so that it is already there when
it is needed. Data prefetch
instructions are inserted only for
data structures referenced within
innermost loops using simple loop
varying addresses (that is, in a
simple linear sweep across large
amounts of memory). This option is useful for
applications that have high data
cache miss overhead; that is, it
improves the performance of
operations on arrays that are so large
they exceed the size of the cache.
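
As an illustration of the access patterns described above, the
following C sketch contrasts a loop that sweeps linearly across two
arrays with a loop that reaches its data through an index array. The
function names and the file name sweep.c are hypothetical. The compile
line in the comment assumes the HP C compiler and optimization level 2;
check your compiler documentation for the levels at which
+Odataprefetch is honored. Whether prefetch instructions are actually
inserted for either loop depends on the compiler.

```c
/* Hypothetical file sweep.c. Assumed compile line (not from this guide):
 *     cc +O2 +DA2.0 +Odataprefetch -c sweep.c
 */
#include <stddef.h>

/* Linear sweep: a[i] and b[i] are addressed directly by the index of
 * the innermost loop, with a constant stride. This is the kind of
 * "simple loop varying address" the text describes. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Indirect access: the address of a[idx[i]] depends on the contents of
 * idx, not simply on the loop index. Based on the description in the
 * text, references like this are presumably not prefetch candidates
 * (idx itself is still swept linearly). */
double gather_sum(const double *a, const size_t *idx, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[idx[i]];
    return sum;
}
```

Both functions compute ordinary sums; the only difference that matters
here is how each array element's address is formed inside the loop.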

As a general rule of thumb, using
+Odataprefetch will probably help
performance if your application
contains numerous references to
arrays, and if the sum of the sizes of
all the arrays in your program totals
more than a megabyte. It can also
help if your application contains only
a single pass through an extremely
large array (tens of megabytes in
size). However, if your program
contains very frequent references to
small arrays, +Odataprefetch can
actually impair performance.
Therefore, the only way to find out for
sure whether this option will help
your program is to try it.
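
One way to apply the rule of thumb and then "try it" is sketched below
in C. The helper names are hypothetical, and the one-megabyte threshold
is taken here as 1024 * 1024 bytes, which is an assumption. The first
function adds up the sizes of a program's arrays; the second times
repeated linear sweeps with the standard clock() function so that the
same program can be built with and without +Odataprefetch and the two
timings compared.

```c
#include <stddef.h>
#include <time.h>

/* Assumption: "a megabyte" is taken to mean 1024 * 1024 bytes. */
#define ONE_MEGABYTE (1024UL * 1024UL)

/* Returns 1 if the sizes in array_bytes[0..count-1] total more than
 * one megabyte, and 0 otherwise. */
int exceeds_one_megabyte(const size_t *array_bytes, size_t count)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < count; i++)
        total += array_bytes[i];
    return total > ONE_MEGABYTE;
}

/* Performs `passes` linear sweeps over a[0..n-1] and returns the CPU
 * time used, in seconds. The accumulated sum is stored through
 * *result so that the work has a visible effect. */
double time_sweeps(const double *a, size_t n, int passes, double *result)
{
    clock_t start = clock();
    double sum = 0.0;
    size_t i;
    int p;

    for (p = 0; p < passes; p++)
        for (i = 0; i < n; i++)
            sum += a[i];
    *result = sum;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

For example, three arrays of 400,000 bytes each total 1,200,000 bytes,
which is above the threshold used here. Timings are meaningful only
when the arrays and pass counts resemble those of the real application.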

The +Odataprefetch option is
effective with both vectorized and
unvectorized loops. In fact, if your
PA2.0 application uses very large
arrays, you may gain considerable
performance benefit from using