Specifications

24 Performance-Centric Compiler Switches Chapter 3
32035 Rev. 3.22 November 2007
Compiler Usage Guidelines for AMD64 Platforms
GCC 4.0 and later compilers can perform loop vectorization when the
-ftree-vectorize flag is specified.
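As a minimal sketch of the kind of loop the vectorizer targets, the following C function uses a counted loop with unit stride, no loop-carried dependence, and restrict-qualified pointers. The function name scale_add, its parameters, and the file name in the comment are illustrative assumptions and do not come from this guide. Whether a given GCC release actually vectorizes the loop depends on the compiler version and the other switches in use.

```c
#include <stddef.h>

/* Illustrative candidate for -ftree-vectorize (hypothetical name and file).
 * One possible build line:
 *   gcc -std=c99 -O2 -ftree-vectorize -c scale_add.c
 * The C99 restrict qualifiers promise that dst, a and b do not overlap,
 * which removes one common obstacle to vectorization. */
void scale_add(float *restrict dst, const float *restrict a,
               const float *restrict b, float k, size_t n)
{
    size_t i;

    /* Unit-stride loop with independent iterations. */
    for (i = 0; i < n; ++i) {
        dst[i] = a[i] * k + b[i];
    }
}
```

The function computes the same result with or without the switch, so comparing timings of the two builds is one way to see whether the flag helps a particular loop.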
3.2.4 Other Switches
In addition to the switches listed in Table 3 on page 23, the following switches may also improve
program performance and are worth experimenting with.
-march=k8. For the FSF GCC 4.2.0, SuSE 4.2.0, and Red Hat 4.2.0 compilers, using this switch may
give you a performance advantage in some cases.
-march=amdfam10. For applications to be executed on AMD Family 10h processor-based
platforms, this switch results in better performance.
Note: The amdfam10 option is not available on all GCC compiler releases. See your compiler
documentation for further information.
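One way to confirm which -march target a build actually used is to test the macros that GCC predefines for the selected processor. The sketch below assumes the __k8__ and __amdfam10__ macros that GCC defines for -march=k8 and -march=amdfam10, respectively. The helper name build_target is hypothetical.

```c
/* Hypothetical helper: reports the -march target selected at compile time,
 * based on GCC's predefined processor macros.
 * Possible build lines:
 *   gcc -O2 -march=k8 -c target.c
 *   gcc -O2 -march=amdfam10 -c target.c   (only where the release supports it) */
const char *build_target(void)
{
#if defined(__amdfam10__)
    return "amdfam10";
#elif defined(__k8__)
    return "k8";
#else
    return "other";
#endif
}
```

A compiler release that does not recognize -march=amdfam10 rejects the switch on the command line, so this check is only useful once the build itself succeeds.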
Profile Guided Optimization. The 64-bit GCC compiler also allows profile guided optimization.
Table 4 shows the profile guided optimization switches for the different GCC compilers.
-Bsymbolic. Starting with GCC 4.1, the compiler no longer requires the -Bsymbolic switch. GCC
4.1 and later versions offer -combine and -fwhole-program, which should be used together. These
options require that makefiles be changed to compile and link all files of an application with a
single command, which slows down builds, so they should be used only for non-debug builds.
Unfortunately, some files may fail to compile with these options.
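The effect of the single-command build can be sketched as follows. Both functions below have external linkage, as they would if they lived in separate source files. The function names and the file names app.c and util.c in the comment are hypothetical. With -fwhole-program, GCC may treat every function except main as if it were static, which can allow the helper to be inlined into its caller and removed.

```c
/* Sketch only. Possible single-command build for a GCC release that
 * supports both switches (file names are assumptions):
 *   gcc -O2 -combine -fwhole-program app.c util.c -o app
 * Without these switches, clamp_to_byte must be kept as a callable
 * external function because another object file might reference it. */
int clamp_to_byte(int v)
{
    if (v < 0) {
        return 0;
    }
    if (v > 255) {
        return 255;
    }
    return v;
}

/* Sums the clamped values; the call below is an inlining candidate. */
long sum_clamped(const int *values, int count)
{
    long total = 0;
    int i;

    for (i = 0; i < count; ++i) {
        total += clamp_to_byte(values[i]);
    }
    return total;
}
```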
-minline-all-stringops. When using the GCC 3.4 compiler on Red Hat Enterprise Linux 4,
experiment with the switch -minline-all-stringops. This switch is not recommended for GCC 3.4 on
SuSE Linux Enterprise Server.
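The switch is most likely to matter for code that calls memcpy, memset, or strlen on short buffers, because it asks GCC to expand those operations inline in more cases, at the cost of larger code. The following sketch shows such code. The record type and the function names are illustrative assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical small record; tag must hold a NUL-terminated string. */
struct record {
    char tag[8];
    int id;
};

/* Short fixed-size copy: a candidate for inline expansion.
 * Possible build line: gcc -O2 -minline-all-stringops -c record.c */
void copy_record(struct record *dst, const struct record *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* strlen on a short string: another candidate for inline expansion. */
size_t tag_length(const struct record *r)
{
    return strlen(r->tag);
}
```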
Linking with ACML. The AMD Core Math Library (ACML) includes BLAS, LAPACK and FFT
routines that are optimized for AMD Athlon™ 64, AMD Opteron™, and AMD Family 10h
processors. If the program uses these routines, using ACML in place of generic C/Fortran
Table 4. Profile Guided Optimization for 64-Bit GCC Compilers for Linux®

Compiler Version            Optimization Switches
--------------------------  ---------------------------------------------------------
SuSE GCC 4.2.0 and          Step 1. Compile the program with -fprofile-arcs.
Red Hat gcc-ssa             Step 2. Run the executable produced in Step 1. Running the
(for C/C++ and Fortran)             executable generates several files with profile
                                    information (*.da).
                            Step 3. Recompile the program with -fbranch-probabilities.

FSF GCC 4.2.0 and           Step 1. Compile the program with -fprofile-generate.
Red Hat GCC 4.2.0           Step 2. Run the executable produced in Step 1. Running the
                                    executable generates several files with profile
                                    information (*.da).
                            Step 3. Recompile the program with -fprofile-use.
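The three steps can be sketched for a function whose branch behavior depends on its input, which is the kind of code where profile data may help the compiler. The function name count_above, the file name app.c, and the input placeholder are hypothetical. The commands in the comment follow the FSF GCC 4.2.0 row of Table 4, and a driver with a main function (not shown) is assumed to call the function on representative data during the training run.

```c
/* Sketch of a profile guided optimization candidate (hypothetical names).
 *   Step 1:  gcc -O2 -fprofile-generate app.c -o app
 *   Step 2:  ./app <representative input>
 *   Step 3:  gcc -O2 -fprofile-use app.c -o app
 * The profile records how often the branch below is taken, which the
 * compiler may use when laying out the recompiled loop. */
int count_above(const int *values, int count, int threshold)
{
    int hits = 0;
    int i;

    for (i = 0; i < count; ++i) {
        if (values[i] > threshold) {   /* data-dependent branch */
            ++hits;
        }
    }
    return hits;
}
```

The benefit depends on how closely the training input in Step 2 resembles production input, so the run should use data that is representative of real workloads.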