Specifications

38 Performance-Centric Compiler Switches Chapter 3
32035 Rev. 3.22 November 2007
Compiler Usage Guidelines for AMD64 Platforms
Profile-Guided Optimization. The 32-bit Microsoft compiler supports profile-guided optimization.
Use the following steps for profile-guided optimization with the 32-bit Microsoft compilers for Microsoft
Windows.
1. Compile the program with the /GL switch and link with the /LTCG:PGI switch.
2. Run the executable produced in Step 1. Running this executable generates several files with
profile information.
3. Relink the program with the /LTCG:PGO switch.
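The three steps above can be sketched as a command sequence. This is a minimal dry-run sketch that only prints the commands; the file name app.c, the /O2 level, and the /OUT name are illustrative assumptions, not part of the original guidelines.

```shell
#!/bin/sh
# Dry-run sketch: prints the profile-guided optimization steps; nothing is executed.
# The source file name, /O2, and /OUT are hypothetical choices for illustration.
pgo_steps() {
    src=$1
    base=${src%.c}
    # Step 1: compile with /GL, then link an instrumented executable.
    printf '%s\n' "cl /c /O2 /GL $src"
    printf '%s\n' "link /LTCG:PGI /OUT:$base.exe $base.obj"
    # Step 2: run the instrumented executable on representative input.
    printf '%s\n' "$base.exe"
    # Step 3: relink using the collected profile information.
    printf '%s\n' "link /LTCG:PGO /OUT:$base.exe $base.obj"
}
pgo_steps app.c
```

The workload used in Step 2 should resemble real usage, because the relink in Step 3 optimizes for the paths that were actually exercised.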
The /arch:SSE2 switch allows the compiler to use SSE2 instructions when it determines that they are
faster than x87 instructions for scalar floating-point computations, interleaving the two as appropriate. As a
result, the code uses a mixture of x87 and SSE2 instructions. Using this switch almost always results in
increased speed.
The compiler emits code that is thread-safe by default. Turning off this default by using
/D_ST_MODEL can result in an additional performance improvement.
/OPT:ref,icf. This linker option removes unreferenced functions and data (ref) and folds identical
functions into a single copy (icf), resulting in a smaller binary.
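As a rough illustration, the compiler switch and the linker option can be passed in a single invocation. The sketch below only prints the command line; the file name app.c and the /O2 level are assumptions chosen for the example.

```shell
#!/bin/sh
# Dry-run sketch: prints one compile-and-link command line; nothing is executed.
# The source file name and the /O2 level are illustrative assumptions.
msvc_cmd() {
    src=$1
    # Switches before /link go to the compiler; those after it go to the linker.
    printf '%s\n' "cl /O2 /arch:SSE2 $src /link /OPT:REF,ICF"
}
msvc_cmd app.c
```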
3.13 Sun Studio Compilers (32-bit) for Solaris
Sun Microsystems provides C, C++, and Fortran compilers for the x86 Solaris operating system. The
current version of each compiler (as of August 2007) is 5.9, and is available in the Sun Studio 12
developer tools suite. All options below apply to this version of the compilers.
3.13.1 Invocation Commands
The following commands invoke specific compilers:
cc invokes the Sun Studio C compiler.
CC invokes the Sun Studio C++ compiler.
f77 invokes the Sun Studio Fortran 77 compiler.
f90 invokes the Sun Studio Fortran 90 compiler.
3.13.2 Generic Performance Switches
Different optimization switches are recommended for different platforms. The -fast switch enables a
number of optimizations that optimize the execution time on the compilation platform. If the program
will be run on a different machine, -fast can be combined with -xtarget to optimize for a different
platform. If performance on a wide variety of systems is desired, combine
-xtarget=generic with -fast. If a switch implied by -fast (e.g., -xarch=isa) is overridden, that switch
must follow -fast on the command line, or it will be ignored.
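The ordering rule can be sketched with a small helper that always places -fast first and appends any overriding switches after it. This is a dry-run sketch that only prints the command line; the helper name sun_cmd and the file name prog.c are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints a Sun Studio command line; nothing is executed.
# sun_cmd and prog.c are hypothetical names used for illustration.
sun_cmd() {
    compiler=$1
    src=$2
    shift 2
    # -fast goes first so that any later switch overrides what -fast implies.
    cmd="$compiler -fast"
    for opt in "$@"; do
        cmd="$cmd $opt"
    done
    printf '%s\n' "$cmd $src"
}
sun_cmd cc prog.c -xtarget=generic
# prints: cc -fast -xtarget=generic prog.c
```

Reversing the order (the override before -fast) would let -fast reset the switch, which is the situation the text warns about.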