HP-UX Floating-Point Guide

Chapter 7 185

Performance Tuning

Matrix Operations

If a bottleneck contains vector and/or matrix operations, you may be able
to improve program performance by specifying the +Ovectorize option.
See “Optimizing Your Program” on page 171 for details.
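
As a rough illustration, the C loop below has the shape a vectorizing
optimizer typically looks for: unit-stride access, no dependence between
iterations, and a single arithmetic statement in the body. The function
name vec_axpy is hypothetical, and whether +Ovectorize actually
transforms a given loop depends on the compiler and optimization level,
so treat this as a sketch of a candidate loop rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] for i in [0, n).
 * Unit stride and independent iterations make this a typical
 * candidate for vectorization. Assumes x and y do not overlap. */
void vec_axpy(size_t n, double a, const double *x, double *y)
{
    size_t i;
    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Loops that carry a value from one iteration to the next, or that call
other functions in the body, are generally harder for an optimizer to
vectorize.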

Alternatively, you may be able to replace the operations with calls to the
BLAS library, libblas (provided with the HP Fortran 90 and HP
FORTRAN/9000 products only).
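
To make the substitution concrete, the sketch below pairs a hand-written
dot-product loop with a call that follows the argument order of the
standard Level 1 BLAS routine DDOT (n, x, incx, y, incy). Everything
here is an assumption for illustration: the wrapper name dot_product,
the USE_LIBBLAS macro, and the C declaration of ddot (Fortran routines
take their arguments by reference, and the external name a C program
must use is platform-dependent). Without USE_LIBBLAS defined, the code
falls back to the plain loop, so it compiles without the library.

```c
#ifdef USE_LIBBLAS
/* Assumed C view of the Fortran BLAS function DDOT; the linkage name
 * (ddot, ddot_, ...) depends on the platform and compiler. */
extern double ddot(int *n, double *x, int *incx, double *y, int *incy);
#endif

/* Hypothetical wrapper: dot product of two contiguous vectors. */
double dot_product(int n, double *x, double *y)
{
#ifdef USE_LIBBLAS
    int inc = 1;                      /* unit stride for both vectors */
    return ddot(&n, x, &inc, y, &inc);
#else
    double sum = 0.0;
    int i;
    for (i = 0; i < n; i++) {         /* hand-written loop being replaced */
        sum += x[i] * y[i];
    }
    return sum;
#endif
}
```

Keeping the library call behind a small wrapper like this leaves the
rest of the program unchanged if the library is unavailable on another
system.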

The libblas routines, and the vector-routine calls that +Ovectorize
generates, are generally faster than loops that you write yourself
because they take into account alignment, data cache, and other
machine-dependent characteristics. Not all vectors and matrices,
however, are good candidates for libblas calls or for +Ovectorize. If
an array contains fewer than about twenty elements, the overhead
incurred by making the calls may offset the increased performance
yielded by these routines.
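
One way to act on this guideline is to keep a simple inline loop for
short vectors and hand only longer ones to the library. The sketch below
is hypothetical throughout: the names SMALL_N_LIMIT, dot_fn,
dot_dispatch, and standin_dot are invented for illustration, and the
value 20 merely echoes the rough figure given above, so it should be
confirmed by measurement on the target system.

```c
/* Assumed cutoff, taken from the "about twenty elements" guideline. */
#define SMALL_N_LIMIT 20

/* Signature of a library-style dot-product routine supplied by the caller. */
typedef double (*dot_fn)(int n, const double *x, const double *y);

/* Stand-in for a library routine so the sketch builds without libblas. */
double standin_dot(int n, const double *x, const double *y)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Use an inline loop for short vectors; call the library routine otherwise. */
double dot_dispatch(int n, const double *x, const double *y, dot_fn lib_dot)
{
    if (n < SMALL_N_LIMIT) {
        double sum = 0.0;
        int i;
        for (i = 0; i < n; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }
    return lib_dot(n, x, y);
}
```

In a real program the lib_dot argument would point at a wrapper around
the library routine; the stand-in is there only so the example is
complete on its own.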

For more information about the libblas routines, see “The BLAS
Library (libblas)” on page 121, the HP Fortran 90 Programmer’s
Reference, and the HP FORTRAN/9000 Programmer’s Reference.