HP aC++/HP C A.06.20 Release Notes

+O[no]autopar Option (New)

+O[no]autopar

This release adds support on the Itanium platform for a new optimization,

auto-parallelization, which is enabled by adding the +Oautopar option to the

command-line. This optimization allows applications to exploit otherwise idle resources

on multicore or multiprocessor systems by automatically transforming serial loops into

multithreaded parallel code.

When the +Oautopar option is used at optimization levels +O3 and above, the compiler

will automatically parallelize those loops that are deemed safe and profitable by the

loop transformer.

The default is +Onoautopar, which disables automatic parallelization of loops.
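
As an illustration of what "safe" can mean here, the following sketch contrasts a loop whose iterations are independent with one that carries a dependence between iterations. The function names are hypothetical and not part of the product. The restrict qualifier assumes C99 mode. Whether a given loop is actually parallelized remains the loop transformer's decision.

```c
/* Sketch only. Assumed command line: cc +O3 +Oautopar example.c */
#include <assert.h>
#include <stddef.h>

/* Candidate loop: each iteration reads x[i] and y[i] and writes only y[i],
   so no iteration depends on another. The restrict qualifiers state that
   the two arrays do not overlap. */
void scale_add(size_t n, double a, const double *restrict x, double *restrict y)
{
    size_t i;

    assert(x != NULL && y != NULL);
    for (i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Unlikely candidate: iteration i reads the value that iteration i-1 wrote
   (a loop-carried dependence), so the iterations cannot simply run
   concurrently. */
void prefix_sum(size_t n, double *v)
{
    size_t i;

    assert(v != NULL);
    for (i = 1; i < n; i++)
        v[i] += v[i - 1];
}
```

Both functions produce the same results whether or not the compiler parallelizes them; only the first has the independent-iteration shape that automatic parallelization looks for.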

Automatic parallelization can be combined with manual parallelization through the

use of OpenMP directives and the +Oopenmp option. When both +Oopenmp and

+Oautopar options are specified, the compiler honors the OpenMP directives first,

and then looks for loops that have not been parallelized manually with OpenMP

directives. For these loops, the compiler automatically parallelizes each loop that is

both safe and likely to have improved performance when executed in parallel.
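
The following sketch shows how the two approaches might coexist in one source file. The function names are hypothetical, and the command line in the comment is an assumption based on the options described above. The first loop carries an OpenMP directive and is therefore handled as manual parallelization; the second has no directive and is left for the compiler to consider.

```c
/* Sketch only. Assumed command line: cc +O3 +Oopenmp +Oautopar example.c */
#include <assert.h>
#include <stddef.h>

/* Manually parallelized with an OpenMP directive. If OpenMP is not enabled,
   the pragma is ignored and the loop runs serially. */
double dot(size_t n, const double *a, const double *b)
{
    double sum = 0.0;
    long i;

    assert(a != NULL && b != NULL);
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < (long)n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* No directive: a loop the compiler may choose to parallelize on its own
   if it judges the transformation safe and profitable. */
void square_all(size_t n, const double *restrict in, double *restrict out)
{
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = in[i] * in[i];
}
```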

Programs compiled with the +Oautopar option require the libcps, libomp, and

libpthreads runtime support libraries to be present at both compilation and runtime.

When linking with the HP-UX B.11.61 linker (patch PHSS_36342 or PHSS_36349),

compiling with the +Oautopar option causes them to be automatically included. Older

linkers require those libraries to be specified explicitly or by compiling with +Oopenmp.

At present, +Oautopar is only supported when compiling C or Fortran files, and not

C++ files. If you use +Oautopar with C or Fortran code in a mixed-language application

that also includes C++ files, you must use -mt when compiling and linking the C++

files, similar to the current requirements for +Oopenmp. Please refer to the

documentation of the aCC compiler's -mt option for additional information and

restrictions.

+O[no]loop_block Option (New)

+O[no]loop_block

Loop blocking is a combination of strip mining and interchange that improves data

cache locality. It is provided primarily to deal with nested loops that manipulate arrays

that are too large to fit into the data cache. Under certain circumstances, loop blocking

allows reuse of these arrays by transforming the loops that manipulate them so that

they manipulate strips of the arrays that fit into the cache.

At optimization levels 3 and 4, using +Oloop_block (the default) allows automatic

loop blocking. Specifying +Onoloop_block disables loop blocking.
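
To make the transformation concrete, the sketch below applies blocking by hand to a matrix transpose. It illustrates the general technique and does not represent the compiler's actual output. The function names and the strip size BLOCK are hypothetical. The outer two loops step through the matrix in strips (strip mining), and the loop order is rearranged (interchange) so that each BLOCK x BLOCK tile is finished before the next one is touched.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 32 /* hypothetical strip size; a real choice depends on the cache */

/* Unblocked version: for large n, the column-wise writes to b touch a new
   cache line on almost every iteration. */
void transpose_naive(size_t n, const double *a, double *b)
{
    size_t i, j;

    assert(a != b);
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            b[j * n + i] = a[i * n + j];
}

/* Blocked version: ii and jj select a tile, i and j walk inside it.
   The bounds checks handle n that is not a multiple of BLOCK. */
void transpose_blocked(size_t n, const double *a, double *b)
{
    size_t ii, jj, i, j;

    assert(a != b);
    for (ii = 0; ii < n; ii += BLOCK)
        for (jj = 0; jj < n; jj += BLOCK)
            for (i = ii; i < n && i < ii + BLOCK; i++)
                for (j = jj; j < n && j < jj + BLOCK; j++)
                    b[j * n + i] = a[i * n + j];
}
```

Both functions compute the same result; the blocked form only changes the order in which elements are visited.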
