Compiler Usage Guidelines for AMD64 Platforms Application Note Publication # 32035 Revision: 3.
© 2006–2007 Advanced Micro Devices, Inc. All rights reserved. The contents of this document are provided in connection with Advanced Micro Devices, Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice.
32035 Rev. 3.22 November 2007

Contents

Revision History
Chapter 1 Introduction
1.1 Audience
1.2 Intent of Document
1.
2.6 Compilers (32-bit) for Sun Solaris
2.6.1 Sun
Chapter 3 Performance-Centric Compiler Switches
3.
3.7.1 Invocation Commands
3.7.2 Generic Performance Switches
3.7.3 Other Switches
3.8 GCC Compilers (32-Bit) for Linux®
4.1.1 Interoperability Between Languages
4.1.2 Run-Time Errors
4.1.3 Compiled and Linked Code Generates Unexpected Results
4.1.4 Program Gives Unexpected Results or Terminates Unexpectedly
4.2
4.10 PathScale Compilers (32-Bit) for Linux®
4.11 Intel Compilers (32-Bit) for Microsoft® Windows®
4.11.1 Compilation Errors
4.11.2 Compiled and Linked Code Generates Unexpected Results
Tables

Table 1. Summary of Compilers
Table 2. GCC Versions Included with Linux® Distributions
Table 3. Table 4. Table 5. Table 6. Table 7. Table 8. Table 9. Table 10. Table 11. Table 12. Table 13. Table 14. Table 15. Table 16. Table 17.
Revision History

Date             Rev.   Description
November 2007    3.22   Made minor corrections. Seventh public release.
September 2007   3.21   Sixth public release.
August 2006      3.19   Fifth public release.
June 2005        3.18   Fourth public release. Updated generic performance switches for Sun Solaris in Section 3.8, Section 3.16, and Section 4.16.
June 2005        3.16   Third public release.
February 2005    3.09   Second public release.
Chapter 1 Introduction

Independent software vendors (ISVs) and end-users of platforms for the AMD Athlon™ 64, AMD Opteron™, and AMD Family 10h processors have a significant interest in porting and tuning their applications for the AMD64 architecture. Because several compilers are available for the AMD64 architecture, evaluating them to choose the best-suited compiler for an application is a non-trivial task.
openers". The Standard Performance Evaluation Corporation (SPEC) designed CPU2006 to provide a comparative measure of compute-intensive performance across the widest range of hardware, using workloads developed from real user applications. SPECcpu2006 is CPU-intensive, stressing a system's processor, memory subsystem, and compiler.
Chapter 2 List of Compiler Vendors for AMD Processors

The compiler vendors listed in this chapter are discussed in detail in subsequent chapters of this application note. This is not a comprehensive list of all compiler vendors for AMD Athlon™ 64, AMD Opteron™, and AMD Family 10h processors.

Table 1.
• Red Hat Enterprise Linux 3
• Red Hat Enterprise Linux 4

This application note also briefly discusses the GCC 4.2 compiler, which is the current GCC compiler from the Free Software Foundation (FSF).

2.1.2 Intel

Intel provides C, C++, and Fortran compilers for EM64T and compatible architecture-based systems running the Linux operating system. The current version (as of August 2007) is 10.0.

2.1.
performance on AMD64 next-generation systems and supports features like auto-parallelization, OS native multithreading, OpenMP multithreading models, and MPI programming for AMD64 architecture-based multicore shared-memory and distributed-memory cluster-based systems. The current version (as of September 2007) is PGI Release 7.1.

2.3 Compilers (64-bit) for Solaris

The following companies provide 64-bit compilers for x86 Solaris.

2.
2.4.4 PGI

The Portland Group (PGI) Toolkits are composed of high-performance C, C++, and/or Fortran compiler(s), a debugger, and a performance profiler for 32-bit and 64-bit AMD64 and EM64T processor-based Linux.
Chapter 3 Performance-Centric Compiler Switches

This chapter describes the various switches that can be useful for individual compilers. For each compiler, a list of generally recommended performance switches is provided. This list is further augmented by other switches that could prove beneficial for certain code bases.

3.
3.1.2 General Performance Switches

To get a program running, start by compiling and linking without optimization. Use the optimization level -O0 or select -g to perform minimal optimization. At this level, you can debug a program easily and isolate any coding errors exposed during porting to x86 or AMD64 platforms. Use the -tp (target processor) option to specify the target architecture.
-O3 (level-3) specifies aggressive global optimization. This optimization level performs all level-1 and level-2 optimizations and enables more aggressive hoisting and scalar replacement optimizations that may or may not be profitable.

-O4 (level-4) performs all level-1, level-2, and level-3 optimizations and enables hoisting of guarded invariant floating-point expressions.
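To make "hoisting of guarded invariant floating-point expressions" concrete, the following is a hand-written C sketch of the source pattern and of a manually hoisted equivalent. It does not show what any particular compiler emits. The function names `scale_guarded` and `scale_hoisted` are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* As written: the loop-invariant quotient x / y sits behind a guard,
   so it is evaluated only for elements whose mask entry is nonzero. */
void scale_guarded(double *a, const int *mask, size_t n, double x, double y)
{
    assert(a != NULL && mask != NULL);
    for (size_t i = 0; i < n; i++) {
        if (mask[i]) {
            a[i] *= x / y;
        }
    }
}

/* Hand-hoisted form: the quotient is computed once, before the loop,
   even if no guard ever fires. That speculative evaluation is why this
   transformation is reserved for an aggressive optimization level. */
void scale_hoisted(double *a, const int *mask, size_t n, double x, double y)
{
    assert(a != NULL && mask != NULL);
    const double q = x / y;   /* invariant expression, evaluated once */
    for (size_t i = 0; i < n; i++) {
        if (mask[i]) {
            a[i] *= q;
        }
    }
}
```

The two forms can differ in side effects: if every mask entry is zero and y is zero, only the hoisted version performs the division. Timing both variants on representative data is a reasonable way to judge whether the higher optimization level pays off for a given code base.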
innovations are automatically incorporated into applications through the use of ACML. The AMD Core Math Library (ACML) revision 4.0, built with PGI Edition 7, includes BLAS, LAPACK, FFT, and RNG routines that are optimized for AMD Athlon™ 64 and AMD Opteron™ processors. If the program uses these routines, using ACML in place of a generic C/Fortran implementation may greatly improve performance.
In addition to the supplied compilers, the user can also experiment with the latest GCC compilers (versions 3.4, 4.0, and 4.2.0) from the Free Software Foundation (FSF). Users probably cannot expect, however, the same level of support for FSF GCC compilers as they can expect for supplied compilers.

3.2.2 Invocation Commands

The following commands invoke specific compilers:
• gcc invokes the C compilers for gcc 4.1, 3.4.1, 3.
The GCC 4.0 and later compilers can perform loop vectorization by using the -ftree-vectorize flag.

3.2.4 Other Switches

In addition to the switches mentioned in Table 3 on page 23, the following switches may also improve program performance. It is worth experimenting with these switches.

-march=k8. For the FSF GCC 4.2.0, SuSE 4.2.0, and Red Hat 4.2.
implementation may greatly improve the performance. For additional details on how to install this library and use it, see http://developer.amd.com/assets/acml_userguide.pdf.

-fno-rtti. This switch disables generation of information about every class with virtual functions for use by the C++ runtime type identification features (dynamic_cast and typeid).
-fno-rtti. Using this switch instructs the C++ compiler to discard C++ run-time type information (RTTI). This may improve performance. However, C++ features requiring RTTI (exceptions, dynamic cast, etc.) will not be supported.

-ansi-alias. Try this switch if the program strictly conforms to the ISO C99 standard. If the program adheres to the standard, this switch allows the compiler to perform aggressive optimizations.

3.
uses these routines, using ACML in place of a generic C/Fortran implementation may greatly improve performance. Use the GNU64 libraries of ACML for the 64-bit PathScale compiler. For additional details on how to install this library and use it, see http://developer.amd.com/assets/acml_userguide.pdf.

Refer to the PathScale EKOPath Compiler Suite User Guide, Version 2.
3.6 Microsoft® Compilers (64-Bit) for Microsoft® Windows®

Microsoft provides C/C++ compilers for AMD64 architecture-based systems running the Microsoft Windows operating system. The current version is Visual Studio 2008. This document contains the latest C/C++ compiler recommendations for Visual Studio 2008. All the options described below apply to this version of the compiler.

3.6.
3.7 Sun Compilers (64-bit) for Solaris

Sun provides C, C++, and Fortran compilers for AMD64 architecture-based systems running the Solaris operating system. The current version (as of August 2007) is version 5.9, available in the Sun Studio 12 developer tools suite. All the options described below apply to this version of the compiler.

3.7.
-xprofile=collect:[name] flag, run the program on a typical dataset. Then compile with -xprofile=use:[name] to utilize the resulting profile data to tune the program.

The -xcrossfile flag enables optimization across all source files. This flag must be combined with -xO4 or -xO5 to be effective.

The -xipo=2 flag enables interprocedural optimization (this option is preferred over -xcrossfile, which predates -xipo).
Table 5. GCC Versions Included with Linux® Distributions

Distribution                      Default compiler   Optional compiler
Red Hat Enterprise Linux 4        3.4.1              No optional compiler available with the distribution. The default compiler is the recommended compiler.
SuSE Linux 10.1                   4.1.1              4.2.0
SuSE Linux Enterprise Server 10   4.1.0              4.2.0

Table 4, “Profile Guided Optimization for 64-Bit GCC Compilers for Linux®,” on page 24 identifies the recommended optional compilers by their package names.
Table 6. Recommended Option Switches for 32-Bit GCC Compilers for Linux®

SuSE GCC 4.2.0 (for C/C++ and Fortran) and Red Hat gcc-ssa (for C/C++ and Fortran)
    -O3 -march=k8 -ffast-math -fomit-frame-pointer -malign-double -mfpmath=sse

FSF GCC 4.2.0 Red Hat GCC 3.4.1
    -O3 -march=k8 -ffast-math -fomit-frame-pointer -malign-double -mfpmath=sse -fpeel-loops -ftracer -funswitch-loops -funit-at-a-time

SuSE GCC 4.2.
Table 7. Profile Guided Optimization for 32-Bit GCC Compilers for Linux®

SuSE GCC 4.2.0 (for C/C++ and Fortran) and Red Hat gcc-ssa (for C/C++ and Fortran) and SuSE GCC 4.2.0
    Step 1. Compile the program with -fprofile-arcs.
    Step 2. Run the executable produced in Step 1. Running this executable generates several files with profile information (*.da).
    Step 3. Recompile the program with -fbranch-probabilities.

FSF GCC 4.2.
-fno-rtti. This switch disables generation of information about every class with virtual functions for use by the C++ runtime type identification features (dynamic_cast and typeid). If the program does not use those parts of the language, this switch can save some space.

Users can obtain more details on these switches by running info gcc on their Linux systems.

3.
3. Recompile the program with the -prof_use switch. It is recommended to also use the -ipo switch in this stage.

-nolib_inline. For programs with many calls to memory-related library routines (such as memmove and memcpy), using the -nolib_inline switch may improve performance for Intel compiler versions 7.1 and 8.0. This switch is not recommended for version 9.1.

-unroll[n].
3. Recompile the program with the -fb_opt fbdata switch.

Inter-Procedure Optimization. Use the -ipa switch to enable inter-procedure optimization.

-Ofast. For aggressive optimization, use the -Ofast switch. This is the shorthand for the switches -O3, -OPT:Ofast, -ipa, and -fno-math-errno.

Linking with ACML.
2. Run the executable produced in Step 1. Running the executable generates several files with profile information (*.dyn and *.dpi).
3. Recompile the program with the -Qprof_use switch. It is recommended to also use the -Qipo switch in this stage.

-Oi-. For programs with many calls to memory-related library routines (such as memset and memcpy), using the -Oi- switch may improve performance for Intel compiler versions 7.1 and 8.0.
Profile Guided Optimization. The 32-bit Microsoft compiler allows profile guided optimization. Use the following steps for profile guided optimization with 32-bit Microsoft compilers for Microsoft Windows.
1. Compile the program with the /GL switch and link with the /LTCG:PGI switch.
2. Run the executable produced in Step 1. Running this executable generates several files with profile information.
3.
3.13.3 Other Switches

In addition to the generic switches, the following switches may improve the performance of the program. It is worth experimenting with these switches.

Use the -xO[1|2|3|4|5] switch to enable various levels of general optimization algorithms. A higher number usually results in faster execution, but in some cases -xO2 or -xO3 may be faster than -xO4 or -xO5.
Chapter 4 Troubleshooting and Portability Issues

Tuning code for optimal performance presents a wide variety of challenges, from compilation errors to unexpected results. This chapter presents the developer with a series of diagnostic steps for a given compiler to troubleshoot errors encountered when compiling or running code.
with the -Mupcase switch. This switch prevents the compiler from converting symbol names to lower-case.

To match the underscore appended by the compiler to global symbol names in Fortran, use the following function naming convention.
1. When calling a C/C++ function from Fortran, rename the C/C++ function by appending an underscore.
2.
Fortran and C/C++ arrays also use different storage methods. Fortran uses column-major order, and C/C++ uses row-major order. This poses no problems for one-dimensional arrays. For two-dimensional arrays with an equal number of rows and columns, simply reverse the row and column indices. For all other arrays (that is, anything other than one-dimensional arrays and square two-dimensional arrays), mixing functions across languages is not recommended.
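The storage-order difference can be sketched in C as follows. The helper names (`colmajor_index`, `rowmajor_index`, `get_elem_`) are hypothetical. The trailing underscore on `get_elem_` assumes a Fortran compiler that appends an underscore to global symbol names, and the pointer parameters assume Fortran's usual pass-by-reference argument convention. Both should be checked against the compiler in use.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of zero-based element (row, col) in a column-major (Fortran)
   matrix with nrows rows: consecutive rows of one column are adjacent. */
size_t colmajor_index(size_t row, size_t col, size_t nrows)
{
    assert(row < nrows);
    return col * nrows + row;
}

/* Offset of zero-based element (row, col) in a row-major (C) matrix
   with ncols columns: consecutive columns of one row are adjacent. */
size_t rowmajor_index(size_t row, size_t col, size_t ncols)
{
    assert(col < ncols);
    return row * ncols + col;
}

/* Hypothetical routine intended to be callable from Fortran as
       CALL GET_ELEM(A, N, I, J, V)
   for an N-by-N matrix A. Indices arrive one-based and by reference,
   and the data is laid out in column-major order. */
void get_elem_(const double *a, const int *n, const int *i, const int *j,
               double *value)
{
    assert(a != NULL && n != NULL && i != NULL && j != NULL && value != NULL);
    assert(*i >= 1 && *i <= *n && *j >= 1 && *j <= *n);
    *value = a[colmajor_index((size_t)(*i - 1), (size_t)(*j - 1), (size_t)*n)];
}
```

For a square matrix, `colmajor_index(r, c, n)` equals `rowmajor_index(c, r, n)`, which is the index reversal described above.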
As a diagnostic step, try building the program using x87 operations for floating-point computations and see if the results are as expected. Use the -tp=k8-32 and -fast switches instead of the switches recommended in the general performance guidelines. Because omitting the switches recommended in the general performance guidelines could lower performance, the user should investigate the precision requirements of the program.
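When investigating precision requirements, it can help to see how a given build evaluates floating-point expressions. The C sketch below reports the standard C99 `FLT_EVAL_METHOD` macro from `<float.h>` and contrasts accumulation in double with accumulation in long double. The function names are hypothetical. Treating long double as the 80-bit x87 format is an assumption that holds for many x86 compilers but not all of them.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Describe how this build evaluates floating-point expressions. */
const char *fp_eval_description(void)
{
#ifdef FLT_EVAL_METHOD
    switch (FLT_EVAL_METHOD) {
    case 0:
        return "each operation uses the precision of its type (typical of SSE2 code)";
    case 1:
        return "float and double operations are evaluated as double";
    case 2:
        return "all operations are evaluated as long double (typical of x87 code)";
    default:
        return "evaluation method is indeterminate or implementation-defined";
    }
#else
    return "FLT_EVAL_METHOD is not provided by this compiler";
#endif
}

/* Accumulate in double, as SSE2 code does. */
double sum_in_double(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += v[i];
    }
    return acc;
}

/* Accumulate in long double to mimic extended-precision intermediates
   (an assumption: long double may be no wider than double). */
double sum_in_extended(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    long double acc = 0.0L;
    for (size_t i = 0; i < n; i++) {
        acc += v[i];
    }
    return (double)acc;
}
```

If the two sums differ noticeably on the program's real data, the algorithm is probably sensitive to intermediate precision, which is a hint that its results may change between x87 and SSE2 builds.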
4.2.2 Link-Time Errors

Are you trying to link C and Fortran code? Turn on the -fno-f2c switch for compiling Fortran 77 modules with g77. Turning on the -fno-f2c switch prevents g77 from generating code designed to be compatible with code generated by f2c and uses the GNU calling conventions instead.

4.2.3 Run-Time Errors

Is your code causing buffer overruns? Turn on the -fbounds-check switch.
GCC provides switches, such as the -mieee-fp switch, to control whether or not the compiler uses IEEE floating-point comparisons. The user should not use the -ffast-math optimization recommended in the general optimization guidelines in this case. Using the -ffast-math switch results in a fast but less predictable floating-point model. The user should also be careful not to use a switch that implies -ffast-math.
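A short C sketch of code that depends on IEEE comparison semantics follows. The function names are hypothetical. The point is that a NaN compares unequal to every value, including itself, and a relaxed floating-point model is allowed to assume NaNs never occur, so checks like these may be optimized away under -ffast-math.

```c
#include <assert.h>
#include <math.h>

/* Under IEEE semantics, NaN is the only value for which x != x holds. */
int is_nan_by_comparison(double x)
{
    return x != x;
}

/* Clamp to [0, 1], mapping NaN to 0. The first test relies on the IEEE
   rule above; with a relaxed model it may be folded to "never true". */
double clamp_unit(double x)
{
    if (x != x) {
        return 0.0;
    }
    if (x < 0.0) {
        return 0.0;
    }
    if (x > 1.0) {
        return 1.0;
    }
    return x;
}

/* Optional start-up check: fails if this build does not honor IEEE
   comparisons for NaN (NAN is the C99 macro from <math.h>). */
void check_ieee_comparisons(void)
{
    assert(is_nan_by_comparison(NAN));
    assert(!is_nan_by_comparison(1.0));
}
```

Calling `check_ieee_comparisons()` once at program start is one inexpensive way to notice that a build picked up a relaxed floating-point switch by accident.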
4.6 Microsoft® Compilers (64-Bit) for Microsoft® Windows®

This section addresses errors and unexpected results that may be encountered when using 64-bit Microsoft® compilers for Microsoft Windows®.

4.6.1 Compilation Errors

Does your code suffer from 64-bit portability issues such as type-casting pointers to int or long? Use the /Wp64 switch to detect 64-bit porting problems.
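The pointer-casting problem can be sketched in portable C as follows. The sketch uses `uintptr_t` from the standard `<stdint.h>` header as a stand-in for pointer-sized integer types. On Windows, pointer-sized types such as INT_PTR and UINT_PTR play the same role. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Non-portable pattern (shown only as a comment):
       int handle = (int)p;
   On 64-bit Windows both int and long are 32 bits wide, so the upper
   half of a 64-bit pointer is silently discarded. */

/* Portable pattern: store the pointer in an integer type that is
   defined to be wide enough to hold it. */
uintptr_t pointer_to_handle(void *p)
{
    assert(p != NULL);
    return (uintptr_t)p;
}

/* Converting the handle back yields a pointer equal to the original. */
void *handle_to_pointer(uintptr_t handle)
{
    return (void *)handle;
}
```

A round trip through `pointer_to_handle` and `handle_to_pointer` preserves the pointer on both 32-bit and 64-bit targets, whereas the commented-out int cast does so only when pointers are 32 bits wide.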
Enable exception handling with the appropriate /EH switch.

4.7 Sun Compilers (64-bit) for Solaris

See section 4.13, “Sun Compilers (32-bit) for Solaris”, on page 54 for the portability and troubleshooting issues with this compiler.

4.8 GCC Compilers (32-Bit) for Linux®

This section addresses errors and unexpected results that may be encountered when using 32-bit GNU Compiler Collection (GCC) compilers for Linux®.

4.8.
within the declared range. The -fbounds-check switch is currently supported only by the Fortran 77 front-end, in which this option defaults to false.

Are you building a shared library? Turn on the -fPIC switch if you need position-independent code suitable for use in a shared library.

4.8.
By default, GCC enables the -fexceptions option for languages like C++ that normally require exception handling. GCC disables the -fexceptions option for languages like C that do not normally require it. You may need, however, to enable this option when compiling C code that must interoperate properly with exception handlers written in C++.
Because omitting the -xK and -xW switches could lower performance, the user should investigate the precision requirements of the program. If the user has access to the source code, it may be possible to adapt the algorithm to SSE2.

4.9.
Does your program rely on x87 features? Some Intel compiler switches instruct the compiler to use SSE2 registers and instructions. If the results do not match your expectations when using SSE2, the program may rely on some x87 features. As a diagnostic step, try building the program using x87 operations for floating-point computations and see if the results are as expected.
4.12.1 Run-Time Errors

Is your code causing buffer overruns that violate security? Turn on the /GS switch. Turning on the /GS switch causes the Microsoft compiler to generate additional security code, such as bounds checking.

4.12.2 Compiled and Linked Code Generates Unexpected Results

Does your program depend on precise floating-point behavior? Do not use the /fp:fast switch recommended in the general performance guidelines.
Use portable, scalable data types like INT_PTR, UINT_PTR, LONG_PTR, and ULONG_PTR for type-casting pointers. Issues such as these can be detected by using the /Wp64 switch.

4.13 Sun Compilers (32-bit) for Solaris

This section addresses errors and unexpected results that may be encountered when using 32-bit Sun compilers for Solaris.

4.13.
Chapter 5 Peak Options for SPEC®-CPU Benchmark Programs

This chapter enumerates the best-known peak switches (as of September 2007) for SPEC®-CPU2006 benchmarks compiled for AMD Athlon™ 64, AMD Opteron™, and AMD Family 10h processor-based platforms by different compilers.

5.1 PGI Release 7.
• All remaining integer components of CINT2006
    pgcc -w -fast -Mipa=fast,inline,noarg -Mfprelaxed -Msmartalloc=huge:840 -tp barcelona-64 -DSPEC_CPU_LP64
    pgcpp -w -fastsse -Mipa=fast,inline -Mfprelaxed -Msmartalloc=huge:448 --zc_eh -tp barcelona -DSPEC_CPU_LP64

The following command-line options are used for the base floating-point component of SPECcpu2006 (CFP2006):
• 435.gromacs, 436.cactusADM, and 454.
5.1.2 Peak Command-line Options

The table below specifies the best-known peak switches for various benchmarks in the SPECcpu2006 suite for the 64-bit PGI Release 7.1 compilers for Linux® on AMD Athlon™ 64 processor-based platforms and AMD Opteron™ processor-based platforms.

Table 10.
Table 10. Best-Known Peak Switches for the 64-Bit PGI Compilers for Linux®

Columns: Application Area, Benchmark, Language, Best Known Peak Switches

XML Processing, 483.xalancbmk, C++3
    pgcpp -w -fastsse -O4 -Mipa=fast,inline -Mfprelaxed -Msmartalloc --zc_eh -tp Barcelona -DSPEC_CPU_LINUX

CFP2006

Application Area                  Benchmark
Fluid Dynamics                    410.bwaves
Quantum Chemistry                 416.gamess
Physics/Quantum Chromodynamics    433.milc
Physics / CFD                     434.
Table 10. Best-Known Peak Switches for the 64-Bit PGI Compilers for Linux®

Columns: Application Area, Benchmark, Language, Best Known Peak Switches

Computational Electromagnetics, 459.GemsFDTD, Fortran 90
    pgf95 -w -fast -O4 -Mdse -Mipa=fast,inline -Mfprelaxed -Msmartalloc=huge:448 -tp barcelona-64 -DSPEC_CPU_LP64

Quantum Chemistry, 465.
By default all benchmark programs use the following option:
    OPTIMIZE = -stack=nocheck,39000000,39000000

Note: INT base C++ are compiled as 32-bit binaries. SmartHeap libraries are required for INT C++ base.

• 400.perlbench
    pgcc -w -fast -Mipa=fast,inline,noarg -Mfprelaxed -Msmartalloc=huge:16 -tp barcelona-64 -DSPEC_CPU_WIN64_X64
• 403.
• 436.cactusADM
    pgcc -w -fast -Mipa=fast,inline -Mfprelaxed -tp barcelona-64 -DSPEC_CPU_WIN64_X64
    pgf95 -w -fast -Mipa=fast,inline -Mfprelaxed -Mnomain -tp barcelona-64 -DSPEC_CPU_WIN64_X64
• 453.povray
    pgcpp -w -fast -Mipa=fast,inline -Mfprelaxed -zc_eh -tp barcelona-64 -DSPEC_CPU_INVHYP -DNEED_INVHYP
• 454.
5.2.3 Peak Command-line Options

The table below lists the best-known peak switches for various benchmarks in the SPECcpu2006 suite for the 64-bit PGI Release 7.1 compilers for Windows® on AMD Athlon™ 64, AMD Opteron™, and AMD Family 10h processor-based platforms.

Table 11.
Table 11. Best-Known Peak Switches for the 64-Bit PGI Compilers for Microsoft® Windows®

Columns: Application Area, Benchmark, Language, Best Known Peak Switches

XML Processing, 483.xalancbmk, C++
    Use base binaries and/or base results for peak and also srcalt=pgiwin.

CFP2006

Application Area                  Benchmark
Fluid Dynamics                    410.bwaves
Quantum Chemistry                 416.gamess
Physics/Quantum Chromodynamics    433.milc
Physics / CFD                     434.
Table 11. Best-Known Peak Switches for the 64-Bit PGI Compilers for Microsoft® Windows®

Columns: Application Area, Benchmark, Language, Best Known Peak Switches

Image Ray-Tracing, 453.povray, ISO C++
    Use base binaries and/or base results for peak.
Structural Mechanics, 454.calculix, C
    Use base binaries and/or base results for peak.
Structural Mechanics, 454.calculix, Fortran90
    Use base binaries and/or base results for peak.
Computational 459.
Table 12. Best-Known Peak Switches for the 64-Bit SuSE GCC 3.3.3 C/C++ Compiler for Linux® (Continued)

164.gzip: -O3 -funroll-all-loops -finline-limit=900 -freduce-all-givs and -fprofile-arcs/-fbranch-probabilities
175.vpr: -O3 -funroll-all-loops -finline-limit=1000 and -fprofile-arcs/-fbranch-probabilities
176.gcc: -O3 -funroll-all-loops -finline-limit=900 and -fprofile-arcs/-fbranch-probabilities
181.
5.4 PathScale EKO 3.0 C/C++ Compiler (64-Bit) for Linux®

Table 13 shows the best-known peak switches for various benchmarks in the SPEC-CPU2000 suite for the PathScale C/C++ compiler (64-bit) for Linux® on AMD Athlon™ 64 processor-based platforms and AMD Opteron™ processor-based platforms.

Table 13. Best-Known Peak Switches for the PathScale 1.4 C/C++ Compiler for Linux®

Benchmark Program    Best-Known Peak Switches
164.
5.5 PathScale EKO 3.0 Fortran Compiler (64-bit) for Linux®

Table 14 shows the best-known peak switches for various benchmarks in the SPEC-CPU2000 suite for the PathScale Fortran compiler (64-bit) for Linux® on AMD Athlon™ 64 processor-based platforms and AMD Opteron™ processor-based platforms.

Table 14. Best-Known Peak Switches for the 64-bit PathScale 2.4 Fortran Compiler for Linux®

Benchmark Program
168.
5.6 Intel 9.0 C/C++ Compiler (32-Bit) for Microsoft® Windows®

Table 15 shows the best-known peak switches for various programs in the SPEC-CPU2000 benchmarks for the 32-bit Intel 8.0 C/C++ compiler for Microsoft Windows on AMD Athlon™ 64 processor-based platforms and AMD Opteron™ processor-based platforms.

Table 15. Best-Known Peak Switches for the 32-Bit Intel 8.0 C/C++ Compiler for Microsoft® Windows®

Benchmark Program
164.
5.7 Sun C/C++ Compiler (64-bit) for Solaris

Table 16 shows the best-known peak switches for various programs in the SPEC-CPU2000 benchmarks for the 64-bit Sun C and C++ compilers (version 5.7) for Solaris on AMD Athlon™ 64 processor-based platforms and AMD Opteron™ processor-based platforms.

Table 16. Best-Known Peak Switches for the 64-bit Sun C/C++ Compilers for Solaris

Benchmark Program    Best-Known Peak Switches
164.
Table 17. Best-Known Peak Switches for the 64-bit Sun Fortran Compiler for Solaris

Benchmark Program    Best-Known Peak Switches
168.wupwise: -fast -xipo=2 -xarch=amd64 -xprefetch_level=3 -xpagesize_heap=2m
171.swim: -fast -xipo=2 -xprefetch_level=3 -Qoption iropt -Atile:skewp,-Ainline:cs=700 -xarch=amd64 -qoption ube_ipa -inl_alt -xpagesize_stack=2m
172.