User Guide
Contents
24592—Rev. 3.15—November 2009 AMD64 Technology
Procedure Stack ............................................................77
Jumps ....................................................................78
Procedure Calls ............................................................79
Returning from Procedures ...................................................81
System Calls ..............................................................84
General Considerations for Branching ..........................................84
Branching in 64-Bit Mode ....................................................85
Interrupts and Exceptions ....................................................86
3.8 Input/Output ..............................................................90
I/O Addressing ............................................................90
I/O Ordering ..............................................................91
Protected-Mode I/O .........................................................92
3.9 Memory Optimization .......................................................92
Accessing Memory .........................................................93
Forcing Memory Order ......................................................94
Caches ...................................................................96
Cache Operation ...........................................................97
Cache Pollution ............................................................98
Cache-Control Instructions ...................................................99
3.10 Performance Considerations .................................................101
Use Large Operand Sizes ...................................................101
Use Short Instructions ......................................................101
Align Data ...............................................................101
Avoid Branches ...........................................................101
Prefetch Data .............................................................101
Keep Common Operands in Registers .........................................102
Avoid True Dependencies ...................................................102
Avoid Store-to-Load Dependencies ...........................................102
Optimize Stack Allocation ..................................................102
Consider Repeat-Prefix Setup Time ...........................................102
Replace GPR with Media Instructions .........................................102
Organize Data in Memory Blocks .............................................103
3.11 Cross-Modifying Code .....................................................103
4 128-Bit Media and Scientific Programming .....................................105
4.1 Overview ................................................................105
Origins ..................................................................105
Compatibility .............................................................105
4.2 Capabilities ..............................................................106
Types of Applications ......................................................106
Integer Vector Operations ...................................................106
Floating-Point Vector Operations .............................................107
Data Conversion and Reordering .............................................108
Block Operations ..........................................................110
Matrix and Special Arithmetic Operations ......................................112
Branch Removal ..........................................................114
4.3 Registers ................................................................116
XMM Registers ...........................................................116