HP-UX Floating-Point Guide

Chapter 7 185
Performance Tuning
Matrix Operations
If a bottleneck contains vector or matrix operations, you may be able
to improve program performance by specifying the +Ovectorize option.
See “Optimizing Your Program” on page 171 for details.
Alternatively, you may be able to replace the operations with calls to the
BLAS library, libblas (provided with the HP Fortran 90 and HP
FORTRAN/9000 products only).
The libblas routines, and the vector routines that the compiler
substitutes for eligible loops when you specify +Ovectorize, are usually
faster than loops that you write yourself because they take into account
alignment, data cache behavior, and other machine-dependent
characteristics. Not all arrays, however, are good candidates for libblas
calls or for +Ovectorize. If an array contains fewer than about twenty
elements, the overhead of making the calls may offset the performance
gained from these routines.
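The size guideline above can be sketched in C as follows. This is a
minimal illustration under stated assumptions, not code from the HP
libraries: the names axpy, axpy_loop, axpy_library, and
SMALL_VECTOR_CUTOFF are hypothetical, and axpy_library is only a stand-in
for a tuned routine (in a real program, a BLAS Level 1 routine such as
DAXPY could take its place). The cutoff of 20 simply mirrors the "about
twenty elements" figure and should be confirmed by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff mirroring the "about twenty elements" guideline.
   Measure on the target machine before relying on this value. */
#define SMALL_VECTOR_CUTOFF 20

/* Hand-written loop: y[i] = a * x[i] + y[i] for i in [0, n). */
void axpy_loop(size_t n, double a, const double *x, double *y)
{
    size_t i;
    for (i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Stand-in for a tuned library routine. It reuses the plain loop so
   that this sketch is self-contained; it is NOT an optimized
   implementation and does not call libblas. */
void axpy_library(size_t n, double a, const double *x, double *y)
{
    axpy_loop(n, a, x, y);
}

/* Dispatch on vector length: short vectors use the inline loop to avoid
   call overhead, longer ones go to the library routine.
   Returns 1 if the library path was taken, 0 otherwise. */
int axpy(size_t n, double a, const double *x, double *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    if (n < SMALL_VECTOR_CUTOFF) {
        axpy_loop(n, a, x, y);
        return 0;
    }
    axpy_library(n, a, x, y);
    return 1;
}
```

Whether such a dispatch pays off depends on the machine and the routine
involved, so timing both paths on representative array sizes is the
safest way to choose the cutoff.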
For more information about the libblas routines, see “The BLAS
Library (libblas)” on page 121, the HP Fortran 90 Programmer’s
Reference, and the HP FORTRAN/9000 Programmer’s Reference.