Specifications

Document Number: 252046-026

Intel® 64 and IA-32 Architectures
Software Developer’s Manual

Documentation Changes

December 2009

Notice: The Intel® 64 and IA-32 architectures may contain design defects or errors known as errata that may cause the product to deviate from published specifications. Current characterized errata are documented in the specification updates.

Summary of content (292 pages)

PAGE 1
Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes December 2009 Notice: The Intel® 64 and IA-32 architectures may contain design defects or errors known as errata that may cause the product to deviate from published specifications. Current characterized errata are documented in the specification updates.
PAGE 2
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, Legal Lines and Disclaimers BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT.
PAGE 3
Contents
Revision History . . . 4
Preface . . . 7
Summary Tables of Changes . . . 8
Documentation Changes . . .
PAGE 4
Revision History
Revision | Description | Date
-001 | Initial release | November 2002
-002 | Added 1-10 Documentation Changes. Removed old Documentation Changes items that already have been incorporated in the published Software Developer’s manual | December 2002
 | Added 9-17 Documentation Changes. Removed Documentation Change #6 - References to bits Gen and Len Deleted. |
PAGE 5
Revision History
Revision | Description | Date
-024 | Removed Documentation Changes 1-21. Added Documentation Changes 1-16. | June 2009
-025 | Removed Documentation Changes 1-16. Added Documentation Changes 1-18. | September 2009
-026 | Removed Documentation Changes 1-18. Added Documentation Changes 1-15. | December 2009
PAGE 6
Revision History 6 Intel® 64 and IA-32 Architectures Software Developer’s Manual Documentation Changes
PAGE 7
Preface
This document is an update to the specifications contained in the Affected Documents table below. This document is a compilation of device and documentation errata, specification clarifications and changes. It is intended for hardware system manufacturers and software developers of applications, operating systems, or tools.
PAGE 8
Summary Tables of Changes
The following table indicates documentation changes which apply to the Intel® 64 and IA-32 architectures. This table uses the following notations:
Codes Used in Summary Tables
Change bar to left of table row indicates this erratum is either new or modified from the previous version of the document.
Documentation Changes
No.
PAGE 9
Documentation Changes
1. Updates to Chapter 3, Volume 2A
Change bars show changes to Chapter 3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-M.
...
3.1.1 Instruction Format
The following is an example of the format used for each instruction description in this chapter. The heading below introduces the example.
PAGE 10
3.1.1.4 64-bit Mode Column in the Instruction Summary Table
The “64-bit Mode” column indicates whether the opcode sequence is supported in 64-bit mode. The column uses the following notation:
• Valid — Supported.
• Invalid — Not supported.
• N.E. — Indicates an instruction syntax is not encodable in 64-bit mode (it may represent part of a sequence of valid instructions in other modes).
• N.P. — Indicates the REX prefix does not affect the legacy instruction in 64-bit mode.
PAGE 11
Documentation Changes AAA—ASCII Adjust After Addition Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 37 AAA A Invalid Valid ASCII adjust AL after addition. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... 64-Bit Mode Exceptions #UD If in 64-bit mode. ...
PAGE 12
Documentation Changes AAS—ASCII Adjust AL After Subtraction Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 3F AAS A Invalid Valid ASCII adjust AL after subtraction. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... ADC—Add with Carry Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 14 ib ADC AL, imm8 C Valid Valid Add with carry imm8 to AL.
PAGE 13
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
REX.W + 11 /r | ADC r/m64, r64 | A | Valid | N.E. | Add with CF r64 to r/m64.
12 /r | ADC r8, r/m8 | A | Valid | Valid | Add with carry r/m8 to byte register.
REX + 12 /r | ADC r8*, r/m8* | A | Valid | N.E. | Add with carry r/m8 to byte register.
13 /r | ADC r16, r/m16 | A | Valid | Valid | Add with carry r/m16 to r16.
13 /r | ADC r32, r/m32 | A | Valid | Valid | Add with CF r/m32 to r32.
REX.W + 13 /r | ADC r64, r/m64 | A | Valid | N.
PAGE 14
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
REX.W + 83 /0 ib | ADD r/m64, imm8 | B | Valid | N.E. | Add sign-extended imm8 to r/m64.
00 /r | ADD r/m8, r8 | A | Valid | Valid | Add r8 to r/m8.
REX + 00 /r | ADD r/m8*, r8* | A | Valid | N.E. | Add r8 to r/m8.
01 /r | ADD r/m16, r16 | A | Valid | Valid | Add r16 to r/m16.
01 /r | ADD r/m32, r32 | A | Valid | Valid | Add r32 to r/m32.
REX.W + 01 /r | ADD r/m64, r64 | A | Valid | N.E. | Add r64 to r/m64.
PAGE 15
Documentation Changes ADDPS—Add Packed Single-Precision Floating-Point Values Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 0F 58 /r ADDPS xmm1, xmm2/m128 A Valid Valid Add packed single-precision floating-point values from xmm2/m128 to xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 16
ADDSUBPD—Packed Double-FP Add/Subtract
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
66 0F D0 /r | ADDSUBPD xmm1, xmm2/m128 | A | Valid | Valid | Add/subtract double-precision floating-point values from xmm2/m128 to xmm1.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA
...
PAGE 17
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
83 /4 ib | AND r/m16, imm8 | B | Valid | Valid | r/m16 AND imm8 (sign-extended).
83 /4 ib | AND r/m32, imm8 | B | Valid | Valid | r/m32 AND imm8 (sign-extended).
REX.W + 83 /4 ib | AND r/m64, imm8 | B | Valid | N.E. | r/m64 AND imm8 (sign-extended).
20 /r | AND r/m8, r8 | A | Valid | Valid | r/m8 AND r8.
 |  | A | Valid | N.E. | r/m64 AND r8 (sign-extended).
PAGE 18
Documentation Changes ANDPD—Bitwise Logical AND of Packed Double-Precision Floating-Point Values Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 66 0F 54 /r ANDPD xmm1, xmm2/m128 A Valid Valid Bitwise logical AND of xmm2/m128 and xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 19
ANDNPS—Bitwise Logical AND NOT of Packed Single-Precision Floating-Point Values
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
0F 55 /r | ANDNPS xmm1, xmm2/m128 | A | Valid | Valid | Bitwise logical AND NOT of xmm2/m128 and xmm1.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA
...
PAGE 20
Documentation Changes See “Checking Caller Access Privileges” in Chapter 3, “Protected-Mode Memory Management,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for more information about the use of this instruction. ...
PAGE 21
BLENDVPD — Variable Blend Packed Double Precision Floating-Point Values
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
66 0F 38 15 /r | BLENDVPD xmm1, xmm2/m128, <XMM0> | A | Valid | Valid | Select packed DP FP values from xmm1 and xmm2 from mask specified in XMM0 and store the values in xmm1.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r, w) | ModRM:r/m (r) | implicit XMM0 | NA
...
PAGE 22
BOUND—Check Array Index Against Bounds
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
62 /r | BOUND r16, m16&16 | A | Invalid | Valid | Check if r16 (array index) is within bounds specified by m16&16.
62 /r | BOUND r32, m32&32 | A | Invalid | Valid | Check if r32 (array index) is within bounds specified by m32&32.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r) | ModRM:r/m (r) | NA | NA
...
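As an illustration of the check described for BOUND above, the following Python sketch models the comparison. The function name `bound_check` and the use of `IndexError` to stand in for the processor's #BR exception are assumptions made for this example; the bounds are treated as signed and inclusive.

```python
def bound_check(index: int, lower: int, upper: int) -> None:
    """Model the BOUND check: index must lie in [lower, upper].

    IndexError stands in for the #BR exception (an assumption of this sketch).
    """
    if index < lower or index > upper:
        raise IndexError(f"#BR: index {index} outside [{lower}, {upper}]")


# An index inside the bounds passes silently; one outside raises.
bound_check(5, 0, 9)
```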
PAGE 23
Documentation Changes BSWAP—Byte Swap Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 0F C8+rd BSWAP r32 A Valid* Valid Reverses the byte order of a 32-bit register. REX.W + 0F C8+rd BSWAP r64 A Valid N.E. Reverses the byte order of a 64-bit register. NOTES: * See IA-32 Architecture Compatibility section below. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A reg (r, w) NA NA NA ...
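The byte reversal that BSWAP performs can be sketched in Python as follows. The helper names `bswap32` and `bswap64` are hypothetical, and the sketch models only the value transformation, not register or flag behavior.

```python
def bswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value, as BSWAP r32 does."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def bswap64(value: int) -> int:
    """Reverse the byte order of a 64-bit value, as BSWAP r64 does."""
    return int.from_bytes((value & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little"), "big")


print(hex(bswap32(0x12345678)))  # 0x78563412
```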
PAGE 24
Documentation Changes BTC—Bit Test and Complement Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 0F BB BTC r/m16, r16 A Valid Valid Store selected bit in CF flag and complement. 0F BB BTC r/m32, r32 A Valid Valid Store selected bit in CF flag and complement. REX.W + 0F BB BTC r/m64, r64 A Valid N.E. Store selected bit in CF flag and complement. 0F BA /7 ib BTC r/m16, imm8 B Valid Valid Store selected bit in CF flag and complement.
PAGE 25
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r, w) ModRM:reg (r) NA NA B ModRM:r/m (r, w) imm8 NA NA ... BTS—Bit Test and Set Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 0F AB BTS r/m16, r16 A Valid Valid Store selected bit in CF flag and set. 0F AB BTS r/m32, r32 A Valid Valid Store selected bit in CF flag and set. REX.W + 0F AB BTS r/m64, r64 A Valid N.E.
PAGE 26
Documentation Changes Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode FF /2 CALL r/m64 B Valid N.E. Call near, absolute indirect, address given in r/m64. 9A cd CALL ptr16:16 A Invalid Valid Call far, absolute, address given in operand. 9A cp CALL ptr16:32 A Invalid Valid Call far, absolute, address given in operand. FF /3 CALL m16:16 B Valid Valid Call far, absolute indirect address given in m16:16.
PAGE 27
CBW/CWDE/CDQE—Convert Byte to Word/Convert Word to Doubleword/Convert Doubleword to Quadword
Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
98 | CBW | A | Valid | Valid | AX ← sign-extend of AL.
98 | CWDE | A | Valid | Valid | EAX ← sign-extend of AX.
REX.W + 98 | CDQE | A | Valid | N.E. | RAX ← sign-extend of EAX.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | NA | NA | NA | NA
...
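The three sign-extensions listed above can be sketched in Python as one helper applied at different widths. The names `sign_extend`, `cbw`, `cwde` and `cdqe` are chosen for illustration; each function takes and returns unsigned bit patterns.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits of value to a to_bits-wide bit pattern."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):   # sign bit set: value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def cbw(al: int) -> int:
    return sign_extend(al, 8, 16)        # AX <- sign-extend of AL


def cwde(ax: int) -> int:
    return sign_extend(ax, 16, 32)       # EAX <- sign-extend of AX


def cdqe(eax: int) -> int:
    return sign_extend(eax, 32, 64)      # RAX <- sign-extend of EAX


print(hex(cbw(0x80)))  # 0xff80
```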
PAGE 28
Documentation Changes CLFLUSH—Flush Cache Line Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 0F AE /7 CLFLUSH m8 A Valid Valid Flushes cache line containing m8. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (w) NA NA NA Description Invalidates the cache line that contains the linear address specified with the source operand from all levels of the processor cache hierarchy (data and instruction).
PAGE 29
Documentation Changes CLI — Clear Interrupt Flag Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode FA CLI A Valid Valid Clear interrupt flag; interrupts disabled when interrupt flag cleared. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... CLTS—Clear Task-Switched Flag in CR0 Opcode Instruction Op/ En 64-bit Mode Compat/ Description Leg Mode 0F 06 CLTS A Valid Valid Clears TS flag in CR0.
PAGE 30
Documentation Changes CMOVcc—Conditional Move Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 47 /r CMOVA r16, r/m16 A Valid Valid Move if above (CF=0 and ZF=0). 0F 47 /r CMOVA r32, r/m32 A Valid Valid Move if above (CF=0 and ZF=0). REX.W + 0F 47 /r CMOVA r64, r/m64 A Valid N.E. Move if above (CF=0 and ZF=0). 0F 43 /r CMOVAE r16, r/m16 A Valid Valid Move if above or equal (CF=0). 0F 43 /r CMOVAE r32, r/m32 A Valid Valid Move if above or equal (CF=0).
PAGE 31
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 4C /r CMOVL r16, r/m16 A Valid Valid Move if less (SF≠ OF). 0F 4C /r CMOVL r32, r/m32 A Valid Valid Move if less (SF≠ OF). REX.W + 0F 4C /r CMOVL r64, r/m64 A Valid N.E. Move if less (SF≠ OF). 0F 4E /r CMOVLE r16, r/m16 A Valid Valid Move if less or equal (ZF=1 or SF≠ OF). 0F 4E /r CMOVLE r32, r/m32 A Valid Valid Move if less or equal (ZF=1 or SF≠ OF). REX.
PAGE 32
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 4E /r | CMOVNG r32, r/m32 | A | Valid | Valid | Move if not greater (ZF=1 or SF≠OF).
REX.W + 0F 4E /r | CMOVNG r64, r/m64 | A | Valid | N.E. | Move if not greater (ZF=1 or SF≠OF).
0F 4C /r | CMOVNGE r16, r/m16 | A | Valid | Valid | Move if not greater or equal (SF≠OF).
0F 4C /r | CMOVNGE r32, r/m32 | A | Valid | Valid | Move if not greater or equal (SF≠OF).
REX.W + 0F 4C /r | CMOVNGE r64, r/m64 | A | Valid | N.E.
PAGE 33
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 4A /r CMOVP r16, r/m16 A Valid Valid Move if parity (PF=1). 0F 4A /r CMOVP r32, r/m32 A Valid Valid Move if parity (PF=1). REX.W + 0F 4A /r CMOVP r64, r/m64 A Valid N.E. Move if parity (PF=1). 0F 4A /r CMOVPE r16, r/m16 A Valid Valid Move if parity even (PF=1). 0F 4A /r CMOVPE r32, r/m32 A Valid Valid Move if parity even (PF=1). REX.W + 0F 4A /r CMOVPE r64, r/m64 A Valid N.E.
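The flag conditions quoted in the CMOVcc rows above can be written as predicates over the status flags. The Python sketch below covers only the mnemonics visible in these excerpts; the dictionary `CMOV_CONDITIONS`, the function `cmov`, and the representation of flags as a dict of 0/1 values are assumptions of the example.

```python
# Conditions as quoted in the table excerpts; flags is a dict of 0/1 values.
CMOV_CONDITIONS = {
    "CMOVA":   lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "CMOVAE":  lambda f: f["CF"] == 0,
    "CMOVL":   lambda f: f["SF"] != f["OF"],
    "CMOVLE":  lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
    "CMOVNG":  lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
    "CMOVNGE": lambda f: f["SF"] != f["OF"],
    "CMOVP":   lambda f: f["PF"] == 1,
    "CMOVPE":  lambda f: f["PF"] == 1,
}


def cmov(mnemonic: str, flags: dict, dest: int, src: int) -> int:
    """Return the new destination value: src if the condition holds, else dest."""
    return src if CMOV_CONDITIONS[mnemonic](flags) else dest


flags = {"CF": 0, "ZF": 0, "SF": 1, "OF": 0, "PF": 0}
print(cmov("CMOVA", flags, dest=1, src=2))  # 2
```

Several mnemonics share an opcode and a condition (for example CMOVL and CMOVNGE), which is why some entries are identical.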
PAGE 34
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
81 /7 id | CMP r/m32, imm32 | C | Valid | Valid | Compare imm32 with r/m32.
REX.W + 81 /7 id | CMP r/m64, imm32 | C | Valid | N.E. | Compare imm32 sign-extended to 64-bits with r/m64.
83 /7 ib | CMP r/m16, imm8 | C | Valid | Valid | Compare imm8 with r/m16.
83 /7 ib | CMP r/m32, imm8 | C | Valid | Valid | Compare imm8 with r/m32.
REX.W + 83 /7 ib | CMP r/m64, imm8 | C | Valid | N.E. | Compare imm8 with r/m64.
PAGE 35
CMPPD—Compare Packed Double-Precision Floating-Point Values
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
66 0F C2 /r ib | CMPPD xmm1, xmm2/m128, imm8 | A | Valid | Valid | Compare packed double-precision floating-point values in xmm2/m128 and xmm1 using imm8 as comparison predicate.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r, w) | ModRM:r/m (r) | imm8 | NA
...
PAGE 36
Documentation Changes CMPS/CMPSB/CMPSW/CMPSD/CMPSQ—Compare String Operands Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode A6 CMPS m8, m8 A Valid Valid For legacy mode, compare byte at address DS:(E)SI with byte at address ES:(E)DI; For 64-bit mode compare byte at address (R|E)SI to byte at address (R|E)DI. The status flags are set accordingly.
PAGE 37
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode A7 CMPSD A Valid Valid For legacy mode, compare dword at address DS:(E)SI with dword at address ES:(E)DI; For 64-bit mode compare dword at address (R|E)SI with dword at address (R|E)DI. The status flags are set accordingly. REX.W + A7 CMPSQ A Valid N.E. Compares quadword at address (R|E)SI with quadword at address (R|E)DI and sets the status flags accordingly.
PAGE 38
CMPSS—Compare Scalar Single-Precision Floating-Point Values
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F3 0F C2 /r ib | CMPSS xmm1, xmm2/m32, imm8 | A | Valid | Valid | Compare low single-precision floating-point value in xmm2/m32 and xmm1 using imm8 as comparison predicate.
PAGE 39
Documentation Changes CMPXCHG—Compare and Exchange 64-Bit Mode Compat/ Description Leg Mode CMPXCHG r/m8, r8 A Valid Valid* Compare AL with r/m8. If equal, ZF is set and r8 is loaded into r/m8. Else, clear ZF and load r/m8 into AL. REX + 0F B0/r CMPXCHG r/m8**,r8 A Valid N.E. Compare AL with r/m8. If equal, ZF is set and r8 is loaded into r/m8. Else, clear ZF and load r/m8 into AL. 0F B1/r CMPXCHG r/m16, r16 A Valid Valid* Compare AX with r/m16.
PAGE 40
CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F C7 /1 m64 | CMPXCHG8B m64 | A | Valid | Valid* | Compare EDX:EAX with m64. If equal, set ZF and load ECX:EBX into m64. Else, clear ZF and load m64 into EDX:EAX.
REX.W + 0F C7 /1 m128 | CMPXCHG16B m128 |  | Valid | N.E. | Compare RDX:RAX with m128. If equal, set ZF and load RCX:RBX into m128. Else, clear ZF and load m128 into RDX:RAX.
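The compare-and-exchange rule stated in the descriptions above (for CMPXCHG and for CMPXCHG8B/CMPXCHG16B) has the same shape at every width. The Python sketch below models only that data movement, not atomicity or the LOCK prefix; the function name `compare_exchange` and its tuple return value are assumptions of the example.

```python
def compare_exchange(accumulator: int, dest: int, src: int):
    """Model the compare-and-exchange rule.

    accumulator: AL/AX/EAX, or EDX:EAX / RDX:RAX for the 8B/16B forms.
    dest:        current value of the memory or register destination.
    src:         replacement value (r8/r16/r32, or ECX:EBX / RCX:RBX).
    Returns (new_accumulator, new_dest, zf).
    """
    if accumulator == dest:
        return accumulator, src, 1   # equal: ZF set, source stored to destination
    return dest, dest, 0             # not equal: ZF cleared, destination loaded


print(compare_exchange(accumulator=5, dest=5, src=9))  # (5, 9, 1)
print(compare_exchange(accumulator=5, dest=7, src=9))  # (7, 7, 0)
```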
PAGE 41
COMISS—Compare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGS
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 2F /r | COMISS xmm1, xmm2/m32 | A | Valid | Valid | Compare low single-precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r) | ModRM:r/m (r) | NA | NA
...
PAGE 42
Documentation Changes Table 3-20. Information Returned by CPUID Instruction (Continued) Initial EAX Value Information Provided about the Processor Basic CPUID Information ... 80000001H EAX Extended Processor Signature and Feature Bits.
PAGE 43
CRC32 — Accumulate CRC32 Value
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F2 0F 38 F0 /r | CRC32 r32, r/m8 | A | Valid | Valid | Accumulate CRC32 on r/m8.
F2 REX 0F 38 F0 /r | CRC32 r32, r/m8* | A | Valid | N.E. | Accumulate CRC32 on r/m8.
F2 0F 38 F1 /r | CRC32 r32, r/m16 | A | Valid | Valid | Accumulate CRC32 on r/m16.
F2 0F 38 F1 /r | CRC32 r32, r/m32 | A | Valid | Valid | Accumulate CRC32 on r/m32.
 |  | A | Valid | N.E. | Accumulate CRC32 on r/m8.
F2 REX.W 0F 38 F1 /r | CRC32 r64, r/m64 | A | Valid | N.E.
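A bitwise software model of the byte-wide accumulate step may help when checking results. The excerpt above does not name the polynomial; the sketch assumes the CRC-32C (Castagnoli) polynomial 11EDC6F41H that Intel's full description of the instruction specifies, used here in its bit-reflected form. The function names are hypothetical, and the all-ones start value and final inversion in `crc32c` are the usual software convention around the instruction, not part of the instruction itself.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # bit-reflected form of 11EDC6F41H


def crc32_accumulate_u8(crc: int, byte: int) -> int:
    """Fold one byte into a running 32-bit CRC (models CRC32 r32, r/m8)."""
    crc = (crc ^ (byte & 0xFF)) & 0xFFFFFFFF
    for _ in range(8):
        crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc


def crc32c(data: bytes) -> int:
    """Conventional CRC-32C: start from all ones, invert at the end."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32_accumulate_u8(crc, byte)
    return crc ^ 0xFFFFFFFF


print(hex(crc32c(b"123456789")))  # 0xe3069283, the standard CRC-32C check value
```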
PAGE 44
CVTDQ2PS—Convert Packed Dword Integers to Packed Single-Precision FP Values
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 5B /r | CVTDQ2PS xmm1, xmm2/m128 | A | Valid | Valid | Convert four packed signed doubleword integers from xmm2/m128 to four packed single-precision floating-point values in xmm1.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA
...
PAGE 45
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA
...
CVTPD2PS—Convert Packed Double-Precision FP Values to Packed Single-Precision FP Values
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
66 0F 5A /r | CVTPD2PS xmm1, xmm2/m128 | A | Valid | Valid | Convert two packed double-precision floating-point values in xmm2/m128 to two packed single-precision floating-point values in xmm1.
PAGE 46
CVTPI2PS—Convert Packed Dword Integers to Packed Single-Precision FP Values
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 2A /r | CVTPI2PS xmm, mm/m64 | A | Valid | Valid | Convert two signed doubleword integers from mm/m64 to two single-precision floating-point values in xmm.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA
...
PAGE 47
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA
...
CVTPS2PI—Convert Packed Single-Precision FP Values to Packed Dword Integers
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 2D /r | CVTPS2PI mm, xmm/m64 | A | Valid | Valid | Convert two packed single-precision floating-point values from xmm/m64 to two packed signed doubleword integers in mm.
PAGE 48
CVTSD2SS—Convert Scalar Double-Precision FP Value to Scalar Single-Precision FP Value
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F2 0F 5A /r | CVTSD2SS xmm1, xmm2/m64 | A | Valid | Valid | Convert one double-precision floating-point value in xmm2/m64 to one single-precision floating-point value in xmm1.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA
...
PAGE 49
CVTSI2SS—Convert Dword Integer to Scalar Single-Precision FP Value
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F3 0F 2A /r | CVTSI2SS xmm, r/m32 | A | Valid | Valid | Convert one signed doubleword integer from r/m32 to one single-precision floating-point value in xmm.
F3 REX.W 0F 2A /r | CVTSI2SS xmm, r/m64 | A | Valid | N.E. | Convert one signed quadword integer from r/m64 to one single-precision floating-point value in xmm.
PAGE 50
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA
...
CVTTPD2DQ—Convert with Truncation Packed Double-Precision FP Values to Packed Dword Integers
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
66 0F E6 | CVTTPD2DQ xmm1, xmm2/m128 | A | Valid | Valid | Convert two packed double-precision floating-point values from xmm2/m128 to two packed signed doubleword integers in xmm1 using truncation.
PAGE 51
CVTTPS2DQ—Convert with Truncation Packed Single-Precision FP Values to Packed Dword Integers
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F3 0F 5B /r | CVTTPS2DQ xmm1, xmm2/m128 | A | Valid | Valid | Convert four single-precision floating-point values from xmm2/m128 to four signed doubleword integers in xmm1 using truncation.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA
...
PAGE 52
CVTTSD2SI—Convert with Truncation Scalar Double-Precision FP Value to Signed Integer
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F2 0F 2C /r | CVTTSD2SI r32, xmm/m64 | A | Valid | Valid | Convert one double-precision floating-point value from xmm/m64 to one signed doubleword integer in r32 using truncation.
F2 REX.W 0F 2C /r | CVTTSD2SI r64, xmm/m64 | A | Valid | N.E.
PAGE 53
Documentation Changes CWD/CDQ/CQO—Convert Word to Doubleword/Convert Doubleword to Quadword Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 99 CWD A Valid Valid DX:AX ← sign-extend of AX. 99 CDQ A Valid Valid EDX:EAX ← sign-extend of EAX. REX.W + 99 CQO A Valid N.E. RDX:RAX← sign-extend of RAX. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ...
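CWD, CDQ and CQO differ from the CBW family in that the result is split across a register pair: the high register receives copies of the sign bit. A Python sketch of that behavior follows; the helper `sign_extend_pair` and the `(high, low)` return convention are assumptions of the example.

```python
def sign_extend_pair(value: int, bits: int):
    """Return (high, low): low is value, high is filled with low's sign bit."""
    mask = (1 << bits) - 1
    low = value & mask
    high = mask if low >> (bits - 1) else 0
    return high, low


def cwd(ax: int):
    return sign_extend_pair(ax, 16)    # DX:AX <- sign-extend of AX


def cdq(eax: int):
    return sign_extend_pair(eax, 32)   # EDX:EAX <- sign-extend of EAX


def cqo(rax: int):
    return sign_extend_pair(rax, 64)   # RDX:RAX <- sign-extend of RAX


print([hex(x) for x in cwd(0x8000)])  # ['0xffff', '0x8000']
```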
PAGE 54
Documentation Changes DEC—Decrement by 1 Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode FE /1 DEC r/m8 A Valid Valid Decrement r/m8 by 1. REX + FE /1 DEC r/m8 * A Valid N.E. Decrement r/m8 by 1. FF /1 DEC r/m16 A Valid Valid Decrement r/m16 by 1. FF /1 DEC r/m32 A Valid Valid Decrement r/m32 by 1. REX.W + FF /1 DEC r/m64 A Valid N.E. Decrement r/m64 by 1. 48+rw DEC r16 B N.E. Valid Decrement r16 by 1. 48+rd DEC r32 B N.E.
PAGE 55
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:r/m (w) | NA | NA | NA
...
DIVPD—Divide Packed Double-Precision Floating-Point Values
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
66 0F 5E /r | DIVPD xmm1, xmm2/m128 | A | Valid | Valid | Divide packed double-precision floating-point values in xmm1 by packed double-precision floating-point values xmm2/m128.
PAGE 56
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA
...
DIVSS—Divide Scalar Single-Precision Floating-Point Values
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F3 0F 5E /r | DIVSS xmm1, xmm2/m32 | A | Valid | Valid | Divide low single-precision floating-point value in xmm1 by low single-precision floating-point value in xmm2/m32.
PAGE 57
Documentation Changes DPPS — Dot Product of Packed Single Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 3A 40 /r ib DPPS xmm1, xmm2/m128, imm8 A Valid Valid Selectively multiply packed SP floating-point values from xmm1 with packed SP floating-point values from xmm2, add and selectively store the packed SP floating-point values or zero values to xmm1.
PAGE 58
EXTRACTPS — Extract Packed Single Precision Floating-Point Value
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
66 0F 3A 17 /r ib | EXTRACTPS reg/m32, xmm2, imm8 | A | Valid | Valid | Extract a single-precision floating-point value from xmm2 at the source offset specified by imm8 and store the result to reg or m32. The upper 32 bits of r64 are zeroed if reg is r64.
PAGE 59
FXRSTOR—Restore x87 FPU, MMX, XMM, and MXCSR State
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F AE /1 | FXRSTOR m512byte | A | Valid | Valid | Restore the x87 FPU, MMX, XMM, and MXCSR register state from m512byte.
REX.W+ 0F AE /1 | FXRSTOR64 m512byte | A | Valid | N.E. | Restore the x87 FPU, MMX, XMM, and MXCSR register state from m512byte.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:r/m (r) | NA | NA | NA
...
PAGE 60
HADDPS—Packed Single-FP Horizontal Add
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F2 0F 7C /r | HADDPS xmm1, xmm2/m128 | A | Valid | Valid | Horizontal add packed single-precision floating-point values from xmm2/m128 to xmm1.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA
...
PAGE 61
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ... IDIV—Signed Divide Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F6 /7 IDIV r/m8 A Valid Valid Signed divide AX by r/m8, with result stored in: AL ← Quotient, AH ← Remainder. REX + F6 /7 IDIV r/m8* A Valid N.E. Signed divide AX by r/m8, with result stored in AL ← Quotient, AH ← Remainder.
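For the byte form of IDIV above (AX divided by r/m8, quotient to AL, remainder to AH), a Python model has to avoid Python's floor division, because the processor truncates the quotient toward zero. The sketch below does that; the names `to_signed` and `idiv8` are hypothetical, and Python exceptions stand in for the #DE fault.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low bits of value as a two's-complement signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def idiv8(ax: int, divisor: int):
    """Model IDIV r/m8: return (AL, AH) as unsigned byte patterns."""
    dividend = to_signed(ax, 16)
    d = to_signed(divisor, 8)
    if d == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    quotient = abs(dividend) // abs(d)          # magnitude, truncated toward zero
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d         # takes the sign of the dividend
    if not -128 <= quotient <= 127:
        raise OverflowError("#DE: quotient does not fit in AL")
    return quotient & 0xFF, remainder & 0xFF


print([hex(x) for x in idiv8(0xFFF9, 0x02)])  # -7 / 2 -> ['0xfd', '0xff'] (q=-3, r=-1)
```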
PAGE 62
Documentation Changes IMUL—Signed Multiply Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F6 /5 IMUL r/m8* A Valid Valid AX← AL ∗ r/m byte. F7 /5 IMUL r/m16 A Valid Valid DX:AX ← AX ∗ r/m word. F7 /5 IMUL r/m32 A Valid Valid EDX:EAX ← EAX ∗ r/m32. REX.W + F7 /5 IMUL r/m64 A Valid N.E. RDX:RAX ← RAX ∗ r/m64. 0F AF /r IMUL r16, r/m16 B Valid Valid word register ← word register ∗ r/m16.
PAGE 63
Documentation Changes IN—Input from Port Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode E4 ib IN AL, imm8 A Valid Valid Input byte from imm8 I/O port address into AL. E5 ib IN AX, imm8 A Valid Valid Input word from imm8 I/O port address into AX. E5 ib IN EAX, imm8 A Valid Valid Input dword from imm8 I/O port address into EAX. EC IN AL,DX B Valid Valid Input byte from I/O port in DX into AL.
PAGE 64
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r, w) NA NA NA B reg (r, w) NA NA NA ... INS/INSB/INSW/INSD—Input from Port to String Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 6C INS m8, DX A Valid Valid Input byte from I/O port specified in DX into memory location specified in ES:(E)DI or RDI.
PAGE 65
INSERTPS — Insert Packed Single Precision Floating-Point Value
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
66 0F 3A 21 /r ib | INSERTPS xmm1, xmm2/m32, imm8 | A | Valid | Valid | Insert a single precision floating-point value selected by imm8 from xmm2/m32 into xmm1 at the specified destination element specified by imm8 and zero out destination elements in xmm1 as indicated in imm8.
PAGE 66
Documentation Changes IF (VM = 1 and IOPL < 3 AND INT n) THEN #GP(0); ELSE (* Protected mode, IA-32e mode, or virtual-8086 mode interrupt *) IF (IA32_EFER.
PAGE 67
FI;
IF software interrupt (* Generated by INT n, INT 3, but not INTO *)
    THEN
        IF gate descriptor DPL < CPL
            THEN #GP((vector_number « 3) + 2 ); (* PE = 1, DPL < CPL, software interrupt *)
        FI;
    ELSE (* Generated by INTO *)
        #UD;
FI;
IF gate not present
    THEN #NP((vector_number « 3) + 2 + EXT);
FI;
IF ((vector_number * 16)[IST] ≠ 0)
    NewRSP ← TSS[ISTx];
FI;
GOTO TRAP-OR-INTERRUPT-GATE; (* Trap/interrupt gate *)
END;
...
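The error codes in the pseudocode above are all built the same way: the vector number shifted left by 3, plus 2, plus the EXT bit where shown. A small Python sketch of that arithmetic follows. The function names are hypothetical, and the reading of the "+ 2" as the bit that marks an IDT reference follows the general exception error-code layout rather than anything stated in this excerpt.

```python
def idt_error_code(vector_number: int, ext: int = 0) -> int:
    """Build the error code used above: (vector_number << 3) + 2 + EXT."""
    return (vector_number << 3) + 2 + (ext & 1)


def decode_idt_error_code(code: int):
    """Split an error code back into (vector_number, idt_bit, ext_bit)."""
    return code >> 3, (code >> 1) & 1, code & 1


print(hex(idt_error_code(0x80)))       # 0x402
print(decode_idt_error_code(0x402))    # (128, 1, 0)
```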
PAGE 68
Documentation Changes INVD—Invalidate Internal Caches Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 08 INVD A Valid Valid Flush internal caches; initiate flushing of external caches. NOTES: * See the IA-32 Architecture Compatibility section below. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ...
PAGE 69
Documentation Changes Jcc—Jump if Condition Is Met Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 77 cb JA rel8 A Valid Valid Jump short if above (CF=0 and ZF=0). 73 cb JAE rel8 A Valid Valid Jump short if above or equal (CF=0). 72 cb JB rel8 A Valid Valid Jump short if below (CF=1). 76 cb JBE rel8 A Valid Valid Jump short if below or equal (CF=1 or ZF=1). 72 cb JC rel8 A Valid Valid Jump short if carry (CF=1). E3 cb JCXZ rel8 A N.E.
PAGE 70
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 71 cb JNO rel8 A Valid Valid Jump short if not overflow (OF=0). 7B cb JNP rel8 A Valid Valid Jump short if not parity (PF=0). 79 cb JNS rel8 A Valid Valid Jump short if not sign (SF=0). 75 cb JNZ rel8 A Valid Valid Jump short if not zero (ZF=0). 70 cb JO rel8 A Valid Valid Jump short if overflow (OF=1). 7A cb JP rel8 A Valid Valid Jump short if parity (PF=1).
PAGE 71
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 84 cd JE rel32 A Valid Valid Jump near if equal (ZF=1). 0F 84 cw JZ rel16 A N.S. Valid Jump near if 0 (ZF=1). Not supported in 64-bit mode. 0F 84 cd JZ rel32 A Valid Valid Jump near if 0 (ZF=1). 0F 8F cw JG rel16 A N.S. Valid Jump near if greater (ZF=0 and SF=OF). Not supported in 64-bit mode. 0F 8F cd JG rel32 A Valid Valid Jump near if greater (ZF=0 and SF=OF).
PAGE 72
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 87 cd | JNBE rel32 | A | Valid | Valid | Jump near if not below or equal (CF=0 and ZF=0).
0F 83 cw | JNC rel16 | A | N.S. | Valid | Jump near if not carry (CF=0). Not supported in 64-bit mode.
0F 83 cd | JNC rel32 | A | Valid | Valid | Jump near if not carry (CF=0).
0F 85 cw | JNE rel16 | A | N.S. | Valid | Jump near if not equal (ZF=0). Not supported in 64-bit mode.
PAGE 73
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 89 cw JNS rel16 A N.S. Valid Jump near if not sign (SF=0). Not supported in 64-bit mode. 0F 89 cd JNS rel32 A Valid Valid Jump near if not sign (SF=0). 0F 85 cw JNZ rel16 A N.S. Valid Jump near if not zero (ZF=0). Not supported in 64-bit mode. 0F 85 cd JNZ rel32 A Valid Valid Jump near if not zero (ZF=0). 0F 80 cw JO rel16 A N.S. Valid Jump near if overflow (OF=1).
PAGE 74
Documentation Changes JMP—Jump Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode EB cb JMP rel8 A Valid Valid Jump short, RIP = RIP + 8-bit displacement sign extended to 64-bits E9 cw JMP rel16 A N.S. Valid Jump near, relative, displacement relative to next instruction. Not supported in 64-bit mode. E9 cd JMP rel32 A Valid Valid Jump near, relative, RIP = RIP + 32-bit displacement sign extended to 64-bits FF /4 JMP r/m16 B N.S.
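The relative forms of JMP above compute the target as RIP plus a sign-extended displacement, where RIP already points at the next instruction. A Python sketch of that calculation for 64-bit mode follows; `jmp_rel_target` and its parameter names are assumptions of the example.

```python
MASK64 = (1 << 64) - 1


def jmp_rel_target(next_rip: int, displacement: int, disp_bits: int) -> int:
    """Target of a relative JMP: next_rip + sign-extended displacement.

    next_rip is the address of the instruction following the JMP;
    displacement is the raw rel8 or rel32 field as an unsigned bit pattern.
    """
    displacement &= (1 << disp_bits) - 1
    if displacement >> (disp_bits - 1):          # negative displacement
        displacement -= 1 << disp_bits
    return (next_rip + displacement) & MASK64


# EB FE at address 0x1000 is a two-byte jump to itself.
print(hex(jmp_rel_target(0x1002, 0xFE, 8)))  # 0x1000
```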
PAGE 75
LAHF—Load Status Flags into AH Register
Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
9F | LAHF | A | Invalid* | Valid | Load: AH ← EFLAGS(SF:ZF:0:AF:0:PF:1:CF).
NOTES:
* Valid in specific steppings. See Description section.
Instruction Operand Encoding
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
A | NA | NA | NA | NA
...
LAR—Load Access Rights Byte
Opcode Instruction 0F 02 /r 0F 02 /r REX.
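The LAHF layout given above, AH ← EFLAGS(SF:ZF:0:AF:0:PF:1:CF), lists the bits of AH from bit 7 down to bit 0. A Python sketch that packs the five flags into that byte is shown below; the function name `lahf` and its keyword parameters are assumptions of the example.

```python
def lahf(sf: int, zf: int, af: int, pf: int, cf: int) -> int:
    """Pack status flags into AH in the order SF:ZF:0:AF:0:PF:1:CF."""
    return (
        (sf & 1) << 7
        | (zf & 1) << 6
        | 0 << 5            # reserved, reads as 0
        | (af & 1) << 4
        | 0 << 3            # reserved, reads as 0
        | (pf & 1) << 2
        | 1 << 1            # reserved, reads as 1
        | (cf & 1)
    )


print(hex(lahf(sf=0, zf=1, af=0, pf=1, cf=0)))  # 0x46
```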
PAGE 76
Documentation Changes LDDQU—Load Unaligned Integer 128 Bits Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F2 0F F0 /r LDDQU xmm1, mem A Valid Valid Load unaligned data from mem and return double quadword in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) NA NA ...
PAGE 77
Documentation Changes LDS/LES/LFS/LGS/LSS—Load Far Pointer Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode C5 /r LDS r16,m16:16 A Invalid Valid Load DS:r16 with far pointer from memory. C5 /r LDS r32,m16:32 A Invalid Valid Load DS:r32 with far pointer from memory. 0F B2 /r LSS r16,m16:16 A Valid Valid Load SS:r16 with far pointer from memory. 0F B2 /r LSS r32,m16:32 A Valid Valid Load SS:r32 with far pointer from memory.
PAGE 78
Documentation Changes LEA—Load Effective Address Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 8D /r LEA r16,m A Valid Valid Store effective address for m in register r16. 8D /r LEA r32,m A Valid Valid Store effective address for m in register r32. REX.W + 8D /r LEA r64,m A Valid N.E. Store effective address for m in register r64. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) NA NA ...
PAGE 79
Documentation Changes LGDT/LIDT—Load Global/Interrupt Descriptor Table Register Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 01 /2 LGDT m16&32 A N.E. Valid Load m into GDTR. 0F 01 /3 LIDT m16&32 A N.E. Valid Load m into IDTR. 0F 01 /2 LGDT m16&64 A Valid N.E. Load m into GDTR. 0F 01 /3 LIDT m16&64 A Valid N.E. Load m into IDTR. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r) NA NA NA ...
PAGE 80
Documentation Changes LOCK—Assert LOCK# Signal Prefix Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F0 LOCK A Valid Valid Asserts LOCK# signal for duration of the accompanying instruction. NOTES: * See IA-32 Architecture Compatibility section below.
PAGE 81
Documentation Changes LODS/LODSB/LODSW/LODSD/LODSQ—Load String Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode AC LODS m8 A Valid Valid For legacy mode, load byte at address DS:(E)SI into AL. For 64-bit mode, load byte at address (R)SI into AL. AD LODS m16 A Valid Valid For legacy mode, load word at address DS:(E)SI into AX. For 64-bit mode, load word at address (R)SI into AX. AD LODS m32 A Valid Valid For legacy mode, load dword at address DS:(E)SI into EAX.
PAGE 82
Documentation Changes LOOP/LOOPcc—Loop According to ECX Counter Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode E2 cb LOOP rel8 A Valid Valid Decrement count; jump short if count ≠ 0. E1 cb LOOPE rel8 A Valid Valid Decrement count; jump short if count ≠ 0 and ZF = 1. E0 cb LOOPNE rel8 A Valid Valid Decrement count; jump short if count ≠ 0 and ZF = 0. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A Offset NA NA NA ...
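The three rows above share one pattern: the count register is decremented first, and the branch decision is made on the decremented value (combined with ZF for LOOPE and LOOPNE). A sketch in C, assuming the count register is modeled as a 64-bit value (the real width depends on the address size); the type and function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; names are not taken from the manual. */
typedef enum { LOOP_PLAIN, LOOP_E, LOOP_NE } loop_kind;

/* Returns true if the short jump is taken.
 * count_before is the count register value before the instruction runs. */
bool loop_branch_taken(loop_kind kind, uint64_t count_before, bool zf)
{
    uint64_t count = count_before - 1;  /* decrement happens first; 0 wraps around */
    if (count == 0)
        return false;                   /* count reached 0: fall through */
    switch (kind) {
    case LOOP_E:  return zf;            /* count != 0 and ZF = 1 */
    case LOOP_NE: return !zf;           /* count != 0 and ZF = 0 */
    default:      return true;          /* plain LOOP: count != 0 */
    }
}
```

Note the consequence of decrementing first: entering the loop with a count of 0 wraps to the maximum value, so the branch is taken rather than skipped.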
PAGE 83
Documentation Changes LTR—Load Task Register Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 00 /3 LTR r/m16 A Valid Valid Load r/m16 into task register. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r) NA NA NA ...
PAGE 84
Documentation Changes MASKMOVQ—Store Selected Bytes of Quadword Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F F7 /r MASKMOVQ mm1, mm2 A Valid Valid Selectively write bytes from mm1 to memory location using the byte mask in mm2. The default memory location is specified by DS:EDI. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r) ModRM:r/m (r) NA NA ...
PAGE 85
Documentation Changes MAXSD—Return Maximum Scalar Double-Precision Floating-Point Value Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F2 0F 5F /r MAXSD xmm1, xmm2/m64 A Valid Valid Return the maximum scalar double-precision floating-point value between xmm2/mem64 and xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 86
Documentation Changes any serializing instructions (such as the CPUID instruction). MFENCE does not serialize the instruction stream. Weakly ordered memory types can be used to achieve higher processor performance through such techniques as out-of-order issue, speculative reads, write-combining, and write-collapsing. The degree to which a consumer of data recognizes or knows that the data is weakly ordered varies among applications and may be unknown to the producer of this data.
PAGE 87
Documentation Changes MINPS—Return Minimum Packed Single-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 5D /r MINPS xmm1, xmm2/m128 A Valid Valid Return the minimum single-precision floating-point values between xmm2/m128 and xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 88
Documentation Changes MONITOR—Set Up Monitor Address Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 01 C8 MONITOR A Valid Valid Sets up a linear address range to be monitored by hardware and activates the monitor. The address range should be a write-back memory caching type. The address is DS:EAX (DS:RAX in 64-bit mode). Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ...
PAGE 89
Documentation Changes Opcode Instruction A1 64-Bit Mode Compat/ Description Leg Mode MOV AX,moffs16* C Valid Valid Move word at (seg:offset) to AX. A1 MOV EAX,moffs32* C Valid Valid Move doubleword at (seg:offset) to EAX. REX.W + A1 MOV RAX,moffs64* C Valid N.E. Move quadword at (offset) to RAX. A2 MOV moffs8,AL D Valid Valid Move AL to (seg:offset). ,AL D Valid N.E. Move AL to (offset). REX.
PAGE 90
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (w) ModRM:reg (r) NA NA B ModRM:reg (w) ModRM:r/m (r) NA NA C AL/AX/EAX/RAX Displacement NA NA D Displacement AL/AX/EAX/RAX NA NA E reg (w) imm8/16/32/64 NA NA F ModRM:r/m (w) imm8/16/32/64 NA NA ... MOV—Move to/from Control Registers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 20/r MOV r32, CR0– CR7 A N.E.
PAGE 91
Documentation Changes MOV—Move to/from Debug Registers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 21/r MOV r32, DR0– DR7 A N.E. Valid Move debug register to r32 0F 21/r MOV r64, DR0– DR7 A Valid N.E. Move extended debug register to r64. 0F 23 /r MOV DR0–DR7, r32 A N.E. Valid Move r32 to debug register 0F 23 /r MOV DR0–DR7, r64 A Valid N.E. Move r64 to extended debug register.
PAGE 92
Documentation Changes MOVAPS—Move Aligned Packed Single-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 28 /r MOVAPS xmm1, xmm2/m128 A Valid Valid Move packed single-precision floating-point values from xmm2/m128 to xmm1. 0F 29 /r MOVAPS xmm2/m128, xmm1 B Valid Valid Move packed single-precision floating-point values from xmm1 to xmm2/m128.
PAGE 93
Documentation Changes MOVD/MOVQ—Move Doubleword/Move Quadword Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 6E /r MOVD mm, r/m32 A Valid Valid Move doubleword from r/m32 to mm. REX.W + 0F 6E /r MOVQ mm, r/m64 A Valid N.E. Move quadword from r/m64 to mm. 0F 7E /r MOVD r/m32, mm B Valid Valid Move doubleword from mm to r/m32. REX.W + 0F 7E /r MOVQ r/m64, mm B Valid N.E. Move quadword from mm to r/m64.
PAGE 94
Documentation Changes MOVDQA—Move Aligned Double Quadword Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 6F /r MOVDQA xmm1, xmm2/m128 A Valid Valid Move aligned double quadword from xmm2/m128 to xmm1. 66 0F 7F /r MOVDQA xmm2/m128, xmm1 B Valid Valid Move aligned double quadword from xmm1 to xmm2/m128. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) NA NA B ModRM:r/m (w) ModRM:reg (r) NA NA ...
PAGE 95
Documentation Changes MOVHLPS— Move Packed Single-Precision Floating-Point Values High to Low Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 12 /r MOVHLPS xmm1, xmm2 A Valid Valid Move two packed single-precision floating-point values from high quadword of xmm2 to low quadword of xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:reg (r) NA NA ...
PAGE 96
Documentation Changes MOVHPS—Move High Packed Single-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 16 /r MOVHPS xmm, m64 A Valid Valid Move two packed single-precision floating-point values from m64 to high quadword of xmm. 0F 17 /r MOVHPS m64, xmm B Valid Valid Move two packed single-precision floating-point values from high quadword of xmm to m64.
PAGE 97
Documentation Changes MOVLPD—Move Low Packed Double-Precision Floating-Point Value Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 12 /r MOVLPD xmm, m64 A Valid Valid Move double-precision floating-point value from m64 to low quadword of xmm register. 66 0F 13 /r MOVLPD m64, xmm B Valid Valid Move double-precision floating-point value from low quadword of xmm register to m64.
PAGE 98
Documentation Changes MOVMSKPD—Extract Packed Double-Precision Floating-Point Sign Mask Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 50 /r MOVMSKPD reg, xmm A Valid Valid Extract 2-bit sign mask from xmm and store in reg. The upper bits of r32 or r64 are filled with zeros. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:reg (r) NA NA ...
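The 2-bit sign mask described above takes bit 0 from the sign of the low double and bit 1 from the sign of the high double, with all upper result bits zero. A portable sketch in C, assuming IEEE 754 binary64 doubles and modeling the two packed values as separate arguments; movmskpd_model is a hypothetical name:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: lo is bits 63:0 of the xmm source, hi is bits 127:64.
 * Assumes double is IEEE 754 binary64, so bit 63 of each value is its sign. */
uint32_t movmskpd_model(double lo, double hi)
{
    uint64_t lo_bits, hi_bits;
    memcpy(&lo_bits, &lo, sizeof lo_bits);  /* reinterpret without aliasing issues */
    memcpy(&hi_bits, &hi, sizeof hi_bits);
    /* Result bit 0 = sign of lo, bit 1 = sign of hi, all other bits zero. */
    return (uint32_t)(((hi_bits >> 63) << 1) | (lo_bits >> 63));
}
```

Because only the sign bit is inspected, negative zero contributes a 1 just like any other negative value.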
PAGE 99
Documentation Changes MOVNTDQ—Store Double Quadword Using Non-Temporal Hint Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F E7 /r MOVNTDQ m128, xmm A Valid Valid Move double quadword from xmm to m128 using non-temporal hint. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (w) ModRM:reg (r) NA NA ... MOVNTI—Store Doubleword Using Non-Temporal Hint Opcode Instruction 0F C3 /r REX.
PAGE 100
Documentation Changes MOVNTPS—Store Packed Single-Precision Floating-Point Values Using Non-Temporal Hint Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 2B /r MOVNTPS m128, xmm A Valid Valid Move packed single-precision floating-point values from xmm to m128 using non-temporal hint. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (w) ModRM:reg (r) NA NA ...
PAGE 101
Documentation Changes MOVQ—Move Quadword Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 6F /r MOVQ mm, mm/m64 A Valid Valid Move quadword from mm/m64 to mm. 0F 7F /r MOVQ mm/m64, mm B Valid Valid Move quadword from mm to mm/m64. F3 0F 7E MOVQ xmm1, xmm2/m64 A Valid Valid Move quadword from xmm2/mem64 to xmm1. 66 0F D6 MOVQ xmm2/m64, xmm1 B Valid Valid Move quadword from xmm1 to xmm2/mem64.
PAGE 102
Documentation Changes MOVS/MOVSB/MOVSW/MOVSD/MOVSQ—Move Data from String to String Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode A4 MOVS m8, m8 A Valid Valid For legacy mode, move byte from address DS:(E)SI to ES:(E)DI. For 64-bit mode, move byte from address (R|E)SI to (R|E)DI. A5 MOVS m16, m16 A Valid Valid For legacy mode, move word from address DS:(E)SI to ES:(E)DI. For 64-bit mode, move word at address (R|E)SI to (R|E)DI.
PAGE 103
Documentation Changes MOVSD—Move Scalar Double-Precision Floating-Point Value Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F2 0F 10 /r MOVSD xmm1, xmm2/m64 A Valid Valid Move scalar double-precision floating-point value from xmm2/m64 to xmm1 register. F2 0F 11 /r MOVSD xmm2/m64, xmm1 B Valid Valid Move scalar double-precision floating-point value from xmm1 register to xmm2/m64.
PAGE 104
Documentation Changes MOVSLDUP—Move Packed Single-FP Low and Duplicate Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F3 0F 12 /r MOVSLDUP xmm1, xmm2/m128 A Valid Valid Move two single-precision floating-point values from the lower 32-bit operand of each qword in xmm2/m128 to xmm1 and duplicate each 32-bit operand to the higher 32-bits of each qword. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) NA NA ...
PAGE 105
Documentation Changes MOVSX/MOVSXD—Move with Sign-Extension Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F BE /r MOVSX r16, r/m8 A Valid Valid Move byte to word with sign-extension. 0F BE /r MOVSX r32, r/m8 A Valid Valid Move byte to doubleword with sign-extension. REX + 0F BE /r MOVSX r64, r/m8* A Valid N.E. Move byte to quadword with sign-extension. 0F BF /r MOVSX r32, r/m16 A Valid Valid Move word to doubleword, with sign-extension. REX.
PAGE 106
Documentation Changes MOVUPD—Move Unaligned Packed Double-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 10 /r MOVUPD xmm1, xmm2/m128 A Valid Valid Move packed double-precision floating-point values from xmm2/m128 to xmm1. 66 0F 11 /r MOVUPD xmm2/m128, xmm B Valid Valid Move packed double-precision floating-point values from xmm1 to xmm2/m128.
PAGE 107
Documentation Changes MOVZX—Move with Zero-Extend Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F B6 /r MOVZX r16, r/m8 A Valid Valid Move byte to word with zero-extension. 0F B6 /r MOVZX r32, r/m8 A Valid Valid Move byte to doubleword, zero-extension. REX.W + 0F B6 /r MOVZX r64, r/m8* A Valid N.E. Move byte to quadword, zero-extension. 0F B7 /r MOVZX r32, r/m16 A Valid Valid Move word to doubleword, zero-extension. REX.
PAGE 108
Documentation Changes MUL—Unsigned Multiply Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F6 /4 MUL r/m8 A Valid Valid Unsigned multiply (AX ← AL ∗ r/m8). REX + F6 /4 MUL r/m8* A Valid N.E. Unsigned multiply (AX ← AL ∗ r/m8). F7 /4 MUL r/m16 A Valid Valid Unsigned multiply (DX:AX ← AX ∗ r/m16). F7 /4 MUL r/m32 A Valid Valid Unsigned multiply (EDX:EAX ← EAX ∗ r/m32). REX.W + F7 /4 MUL r/m64 A Valid N.E. Unsigned multiply (RDX:RAX ← RAX ∗ r/m64).
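Each MUL form produces a result twice as wide as its operands, split across a register pair (or held whole in AX for the 8-bit form). A sketch of the 8-bit and 32-bit rows in C; the struct and function names are hypothetical and chosen only for this illustration:

```c
#include <stdint.h>

/* Hypothetical model of the register pair written by MUL r/m32. */
typedef struct {
    uint32_t edx;  /* high 32 bits of the product */
    uint32_t eax;  /* low 32 bits of the product */
} mul32_result;

/* EDX:EAX <- EAX * r/m32, computed in a 64-bit intermediate. */
mul32_result mul32_model(uint32_t eax, uint32_t src)
{
    uint64_t product = (uint64_t)eax * src;
    mul32_result r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* AX <- AL * r/m8: the 16-bit product always fits in AX. */
uint16_t mul8_model(uint8_t al, uint8_t src)
{
    return (uint16_t)((unsigned)al * (unsigned)src);
}
```

Widening to the larger type before multiplying is the important step: multiplying two uint32_t values directly in C would discard the high half that MUL places in EDX.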
PAGE 109
Documentation Changes MULPS—Multiply Packed Single-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 59 /r MULPS xmm1, xmm2/m128 A Valid Valid Multiply packed single-precision floating-point values in xmm2/mem by xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 110
Documentation Changes MWAIT—Monitor Wait Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 01 C9 MWAIT A Valid Valid A hint that allows the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... 2.
PAGE 111
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r, w) NA NA NA ... NOP—No Operation Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 90 NOP A Valid Valid One byte no-operation instruction. 0F 1F /0 NOP r/m16 B Valid Valid Multi-byte no-operation instruction. 0F 1F /0 NOP r/m32 B Valid Valid Multi-byte no-operation instruction.
PAGE 112
Documentation Changes OR—Logical Inclusive OR Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0C ib OR AL, imm8 A Valid Valid AL OR imm8. 0D iw OR AX, imm16 A Valid Valid AX OR imm16. 0D id OR EAX, imm32 A Valid Valid EAX OR imm32. REX.W + 0D id OR RAX, imm32 A Valid N.E. RAX OR imm32 (sign-extended). 80 /1 ib OR r/m8, imm8 B Valid Valid r/m8 OR imm8. REX + 80 /1 ib OR r/m8*, imm8 B Valid N.E. r/m8 OR imm8.
PAGE 113
Documentation Changes ORPD—Bitwise Logical OR of Double-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 56 /r ORPD xmm1, xmm2/m128 A Valid Valid Bitwise OR of xmm2/m128 and xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 114
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A imm8 NA NA NA B NA NA NA NA ... IA-32 Architecture Compatibility After executing an OUT instruction, the Pentium® processor ensures that the EWBE# pin has been sampled active before it begins to execute the next instruction. (Note that the instruction can be prefetched if EWBE# is not active, but it will not be executed until the EWBE# pin is sampled active.
PAGE 115
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... IA-32 Architecture Compatibility After executing an OUTS, OUTSB, OUTSW, or OUTSD instruction, the Pentium processor ensures that the EWBE# pin has been sampled active before it begins to execute the next instruction. (Note that the instruction can be prefetched if EWBE# is not active, but it will not be executed until the EWBE# pin is sampled active.
PAGE 116
Documentation Changes PACKSSWB/PACKSSDW—Pack with Signed Saturation Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 63 /r PACKSSWB mm1, mm2/m64 A Valid Valid Converts 4 packed signed word integers from mm1 and from mm2/m64 into 8 packed signed byte integers in mm1 using signed saturation.
PAGE 117
Documentation Changes PACKUSDW — Pack with Unsigned Saturation Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 2B /r PACKUSDW xmm1, xmm2/m128 A Valid Valid Convert 4 packed signed doubleword integers from xmm1 and 4 packed signed doubleword integers from xmm2/m128 into 8 packed unsigned word integers in xmm1 using unsigned saturation. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
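Unsigned saturation here means each signed doubleword is clamped to the range of an unsigned word before it is stored. A sketch in C that models the registers as plain arrays; sat_u16 and packusdw_model are hypothetical names, and the lane order (destination lanes in the low half of the result, source lanes in the high half) follows the usual pack convention:

```c
#include <stdint.h>

/* Clamp a signed 32-bit value to the unsigned 16-bit range. */
uint16_t sat_u16(int32_t x)
{
    if (x < 0)
        return 0;
    if (x > UINT16_MAX)
        return UINT16_MAX;
    return (uint16_t)x;
}

/* Hypothetical array model of PACKUSDW xmm1, xmm2/m128.
 * dst holds the 4 doublewords of xmm1, src those of xmm2/m128. */
void packusdw_model(uint16_t out[8], const int32_t dst[4], const int32_t src[4])
{
    for (int i = 0; i < 4; i++) {
        out[i]     = sat_u16(dst[i]);  /* low 4 words come from the destination */
        out[i + 4] = sat_u16(src[i]);  /* high 4 words come from the source */
    }
}
```

Negative inputs clamp to 0 and inputs above 65535 clamp to 65535, so no wraparound occurs.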
PAGE 118
Documentation Changes PADDB/PADDW/PADDD—Add Packed Integers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F FC /r PADDB mm, mm/m64 A Valid Valid Add packed byte integers from mm/m64 and mm. 66 0F FC /r PADDB xmm1, xmm2/m128 A Valid Valid Add packed byte integers from xmm2/m128 and xmm1. 0F FD /r PADDW mm, mm/m64 A Valid Valid Add packed word integers from mm/m64 and mm.
PAGE 119
Documentation Changes PADDSB/PADDSW—Add Packed Signed Integers with Signed Saturation Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F EC /r PADDSB mm, mm/m64 A Valid Valid Add packed signed byte integers from mm/m64 and mm and saturate the results. 66 0F EC /r PADDSB xmm1, xmm2/m128 A Valid Valid Add packed signed byte integers from xmm2/m128 and xmm1 and saturate the results.
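Signed saturation means a sum that overflows the signed byte range is clamped to the nearest limit instead of wrapping. A sketch of one PADDSB lane and a loop over lanes in C; adds_i8 and paddsb_model are hypothetical names, and the lane count is a parameter so the same loop covers the 8-byte mm and 16-byte xmm forms:

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating add of two signed bytes, as applied to each PADDSB lane. */
int8_t adds_i8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;  /* cannot overflow in int */
    if (sum > INT8_MAX)
        return INT8_MAX;
    if (sum < INT8_MIN)
        return INT8_MIN;
    return (int8_t)sum;
}

/* Hypothetical array model: dst[i] <- saturate(dst[i] + src[i]). */
void paddsb_model(int8_t *dst, const int8_t *src, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++)
        dst[i] = adds_i8(dst[i], src[i]);
}
```

For example, 100 + 100 yields 127 rather than the wrapped value -56.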
PAGE 120
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ... PALIGNR — Packed Align Right Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 3A 0F PALIGNR mm1, mm2/m64, imm8 A Valid Valid Concatenate destination and source operands, extract byte-aligned result shifted to the right by constant value in imm8 into mm1.
PAGE 121
Documentation Changes PANDN—Logical AND NOT Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F DF /r PANDN mm, mm/m64 A Valid Valid Bitwise AND NOT of mm/m64 and mm. 66 0F DF /r PANDN xmm1, xmm2/m128 A Valid Valid Bitwise AND NOT of xmm2/m128 and xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 122
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ... PBLENDVB — Variable Blend Packed Bytes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 10 /r PBLENDVB xmm1, xmm2/m128, <XMM0> A Valid Valid Select byte values from xmm1 and xmm2/m128 from mask specified in the high bit of each byte in XMM0 and store the values into xmm1.
PAGE 123
Documentation Changes PCMPEQB/PCMPEQW/PCMPEQD— Compare Packed Data for Equal Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 74 /r PCMPEQB mm, mm/m64 A Valid Valid Compare packed bytes in mm/m64 and mm for equality. 66 0F 74 /r PCMPEQB xmm1, xmm2/m128 A Valid Valid Compare packed bytes in xmm2/m128 and xmm1 for equality. 0F 75 /r PCMPEQW mm, mm/m64 A Valid Valid Compare packed words in mm/m64 and mm for equality.
PAGE 124
Documentation Changes PCMPESTRI — Packed Compare Explicit Length Strings, Return Index Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 3A 61 /r imm8 PCMPESTRI xmm1, xmm2/m128, imm8 A Valid Valid Perform a packed comparison of string data with explicit lengths, generating an index, and storing the result in ECX. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r) ModRM:r/m (r) imm8 NA ...
PAGE 125
Documentation Changes PCMPISTRM — Packed Compare Implicit Length Strings, Return Mask Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 3A 62 /r imm8 PCMPISTRM xmm1, xmm2/m128, imm8 A Valid Valid Perform a packed comparison of string data with implicit lengths, generating a mask, and storing the result in XMM0. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r) ModRM:r/m (r) imm8 NA ...
PAGE 126
Documentation Changes PCMPGTQ — Compare Packed Data for Greater Than Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 37 /r PCMPGTQ xmm1, xmm2/m128 A Valid Valid Compare packed qwords in xmm2/m128 and xmm1 for greater than. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 127
Documentation Changes In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15). PEXTRQ requires REX.W. If the destination operand is a general-purpose register, the default operand size of PEXTRB/ PEXTRW is 64 bits. ...
PAGE 128
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ... PHADDSW — Packed Horizontal Add and Saturate Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 38 03 /r PHADDSW mm1, mm2/m64 A Valid Valid Add 16-bit signed integers horizontally, pack saturated integers to MM1.
PAGE 129
Documentation Changes PHSUBW/PHSUBD — Packed Horizontal Subtract Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 38 05 /r PHSUBW mm1, mm2/m64 A Valid Valid Subtract 16-bit signed integers horizontally, pack to MM1. 66 0F 38 05 /r PHSUBW xmm1, xmm2/m128 A Valid Valid Subtract 16-bit signed integers horizontally, pack to XMM1. 0F 38 06 /r PHSUBD mm1, mm2/m64 A Valid Valid Subtract 32-bit signed integers horizontally, pack to MM1.
PAGE 130
Documentation Changes PINSRB/PINSRD/PINSRQ — Insert Byte/Dword/Qword Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 3A 20 /r ib PINSRB xmm1, r32/m8, imm8 A Valid Valid Insert a byte integer value from r32/m8 into xmm1 at the destination element in xmm1 specified by imm8. 66 0F 3A 22 /r ib PINSRD xmm1, r/m32, imm8 A Valid Valid Insert a dword integer value from r/m32 into the xmm1 at the destination element specified by imm8. 66 REX.
PAGE 131
Documentation Changes PMADDUBSW — Multiply and Add Packed Signed and Unsigned Bytes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 38 04 /r PMADDUBSW mm1, mm2/m64 A Valid Valid Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to MM1. 66 0F 38 04 /r PMADDUBSW xmm1, xmm2/m128 A Valid Valid Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to XMM1.
PAGE 132
Documentation Changes PMAXSB — Maximum of Packed Signed Byte Integers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 3C /r PMAXSB xmm1, xmm2/m128 A Valid Valid Compare packed signed byte integers in xmm1 and xmm2/m128 and store packed maximum values in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 133
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ... PMAXUB—Maximum of Packed Unsigned Byte Integers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F DE /r PMAXUB mm1, mm2/m64 A Valid Valid Compare unsigned byte integers in mm2/m64 and mm1 and return maximum values.
PAGE 134
Documentation Changes PMAXUW — Maximum of Packed Word Integers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 3E /r PMAXUW xmm1, xmm2/m128 A Valid Valid Compare packed unsigned word integers in xmm1 and xmm2/m128 and store packed maximum values in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 135
Documentation Changes PMINSW—Minimum of Packed Signed Word Integers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F EA /r PMINSW mm1, mm2/m64 A Valid Valid Compare signed word integers in mm2/m64 and mm1 and return minimum values. 66 0F EA /r PMINSW xmm1, xmm2/m128 A Valid Valid Compare signed word integers in xmm2/m128 and xmm1 and return minimum values.
PAGE 136
Documentation Changes PMINUD — Minimum of Packed Dword Integers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 3B /r PMINUD xmm1, xmm2/m128 A Valid Valid Compare packed unsigned dword integers in xmm1 and xmm2/m128 and store packed minimum values in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 137
Documentation Changes PMINUW — Minimum of Packed Word Integers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 3A /r PMINUW xmm1, xmm2/m128 A Valid Valid Compare packed unsigned word integers in xmm1 and xmm2/m128 and store packed minimum values in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 138
Documentation Changes PMOVSX — Packed Move with Sign Extend Opcode Instruction Op/ En 64-bit Mode Compat/ Leg Mode Description 66 0f 38 20 /r PMOVSXBW xmm1, xmm2/m64 A Valid Valid Sign extend 8 packed signed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed signed 16-bit integers in xmm1. 66 0f 38 21 /r PMOVSXBD xmm1, xmm2/m32 A Valid Valid Sign extend 4 packed signed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed signed 32-bit integers in xmm1.
PAGE 139
Documentation Changes PMOVZX — Packed Move with Zero Extend Opcode Instruction Op/ En 64-bit Mode Compat/ Leg Mode Description 66 0f 38 30 /r PMOVZXBW xmm1, xmm2/m64 A Valid Valid Zero extend 8 packed 8-bit integers in the low 8 bytes of xmm2/m64 to 8 packed 16-bit integers in xmm1. 66 0f 38 31 /r PMOVZXBD xmm1, xmm2/m32 A Valid Valid Zero extend 4 packed 8-bit integers in the low 4 bytes of xmm2/m32 to 4 packed 32-bit integers in xmm1.
PAGE 140
Documentation Changes PMULHRSW — Packed Multiply High with Round and Scale Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 38 0B /r PMULHRSW mm1, mm2/m64 A Valid Valid Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to MM1. 66 0F 38 0B /r PMULHRSW xmm1, xmm2/m128 A Valid Valid Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to XMM1.
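The "scale and round" step can be written out per lane: the 32-bit product is shifted right by 14, incremented by 1 for rounding, and shifted right once more, which leaves the rounded high 16 bits of a Q15 fixed-point product. A sketch in C; pmulhrsw_lane is a hypothetical name, and the code assumes the compiler implements right shift of negative values as an arithmetic shift (true of mainstream compilers, but implementation-defined in C):

```c
#include <stdint.h>

/* Hypothetical model of one 16-bit lane of PMULHRSW.
 * Assumes >> on a negative int32_t is an arithmetic shift. */
int16_t pmulhrsw_lane(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;     /* full 32-bit signed product */
    return (int16_t)(((product >> 14) + 1) >> 1);  /* scale, round, keep 16 bits */
}
```

Read as Q15 fixed point, 16384 represents 0.5, so multiplying two such lanes gives 8192, which represents 0.25.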
PAGE 141
Documentation Changes PMULHW—Multiply Packed Signed Integers and Store High Result Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F E5 /r PMULHW mm, mm/m64 A Valid Valid Multiply the packed signed word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1. 66 0F E5 /r PMULHW xmm1, xmm2/m128 A Valid Valid Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1.
PAGE 142
Documentation Changes PMULLW—Multiply Packed Signed Integers and Store Low Result Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F D5 /r PMULLW mm, mm/m64 A Valid Valid Multiply the packed signed word integers in mm1 register and mm2/m64, and store the low 16 bits of the results in mm1. 66 0F D5 /r PMULLW xmm1, xmm2/m128 A Valid Valid Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the low 16 bits of the results in xmm1.
PAGE 143
Documentation Changes POP—Pop a Value from the Stack Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 8F /0 POP r/m16 A Valid Valid Pop top of stack into m16; increment stack pointer. 8F /0 POP r/m32 A N.E. Valid Pop top of stack into m32; increment stack pointer. 8F /0 POP r/m64 A Valid N.E. Pop top of stack into m64; increment stack pointer. Cannot encode 32-bit operand size. 58+ rw POP r16 B Valid Valid Pop top of stack into r16; increment stack pointer.
PAGE 144
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (w) NA NA NA B reg (w) NA NA NA C NA NA NA NA ... POPA/POPAD—Pop All General-Purpose Registers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 61 POPA A Invalid Valid Pop DI, SI, BP, BX, DX, CX, and AX. 61 POPAD A Invalid Valid Pop EDI, ESI, EBP, EBX, EDX, ECX, and EAX.
PAGE 145
Documentation Changes POPF/POPFD/POPFQ—Pop Stack into EFLAGS Register Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 9D POPF A Valid Valid Pop top of stack into lower 16 bits of EFLAGS. 9D POPFD A N.E. Valid Pop top of stack into EFLAGS. REX.W + 9D POPFQ A Valid N.E. Pop top of stack and zeroextend into RFLAGS. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ...
PAGE 146
Documentation Changes PREFETCHh—Prefetch Data Into Caches Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 18 /1 PREFETCHT0 m8 A Valid Valid Move data from m8 closer to the processor using T0 hint. 0F 18 /2 PREFETCHT1 m8 A Valid Valid Move data from m8 closer to the processor using T1 hint. 0F 18 /3 PREFETCHT2 m8 A Valid Valid Move data from m8 closer to the processor using T2 hint.
PAGE 147
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ... PSHUFB — Packed Shuffle Bytes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 38 00 /r PSHUFB mm1, mm2/m64 A Valid Valid Shuffle bytes in mm1 according to contents of mm2/m64. 66 0F 38 00 /r PSHUFB xmm1, xmm2/m128 A Valid Valid Shuffle bytes in xmm1 according to contents of xmm2/m128.
PAGE 148
Documentation Changes PSHUFHW—Shuffle Packed High Words Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F3 0F 70 /r ib PSHUFHW xmm1, xmm2/ m128, imm8 A Valid Valid Shuffle the high words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) imm8 NA ...
PAGE 149
Documentation Changes PSIGNB/PSIGNW/PSIGND — Packed SIGN Op/ En 64-Bit Mode Compat/ Leg Mode Description PSIGNB mm1, mm2/m64 A Valid Valid Negate/zero/preserve packed byte integers in mm1 depending on the corresponding sign in mm2/m64 66 0F 38 08 /r PSIGNB xmm1, xmm2/m128 A Valid Valid Negate/zero/preserve packed byte integers in xmm1 depending on the corresponding sign in xmm2/m128.
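The negate/zero/preserve rule depends only on the sign of the corresponding source element. A sketch of one byte lane in C; psign_lane is a hypothetical name used for illustration:

```c
#include <stdint.h>

/* Hypothetical model of one PSIGNB lane.
 * src < 0: negate dst; src == 0: result is zero; src > 0: keep dst. */
int8_t psign_lane(int8_t dst, int8_t src)
{
    if (src < 0)
        return (int8_t)-dst;  /* negation; -128 stays -128 after truncation */
    if (src == 0)
        return 0;
    return dst;
}
```

Only the sign of the source matters, not its magnitude, so source values of -1 and -100 have the same effect.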
PAGE 150
Documentation Changes PSLLDQ—Shift Double Quadword Left Logical Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 73 /7 ib PSLLDQ xmm1, imm8 A Valid Valid Shift xmm1 left by imm8 bytes while shifting in 0s. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r, w) imm8 NA NA ...
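PSLLDQ shifts the whole 128-bit register by a byte count, not a bit count. A sketch in C that models the register as 16 bytes with byte 0 least significant; xmm_bytes and pslldq_model are hypothetical names, and the rule that a count above 15 clears the register is taken from the instruction's full description rather than from the table row above:

```c
#include <stdint.h>

/* Hypothetical model of an xmm register; b[0] is the least significant byte. */
typedef struct {
    uint8_t b[16];
} xmm_bytes;

/* Shift left by imm8 bytes, filling vacated low bytes with zeros. */
xmm_bytes pslldq_model(xmm_bytes x, unsigned imm8)
{
    xmm_bytes r = {{0}};
    if (imm8 > 15)
        return r;                  /* everything shifted out */
    for (unsigned i = imm8; i < 16; i++)
        r.b[i] = x.b[i - imm8];    /* byte i - imm8 moves up to position i */
    return r;
}
```

A left shift moves data toward the more significant end, so with this layout the contents move to higher array indices.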
PAGE 151
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 73 /6 ib PSLLQ xmm1, imm8 B Valid Valid Shift quadwords in xmm1 left by imm8 while shifting in 0s. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA B ModRM:r/m (r, w) imm8 NA NA ...
PAGE 152
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA B ModRM:r/m (r, w) imm8 NA NA ... PSRLDQ—Shift Double Quadword Right Logical Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 73 /3 ib PSRLDQ xmm1, imm8 A Valid Valid Shift xmm1 right by imm8 while shifting in 0s.
PAGE 153
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 72 /2 ib PSRLD xmm1, imm8 B Valid Valid Shift doublewords in xmm1 right by imm8 while shifting in 0s. 0F D3 /r PSRLQ mm, mm/m64 A Valid Valid Shift mm right by amount specified in mm/m64 while shifting in 0s. 66 0F D3 /r PSRLQ xmm1, xmm2/m128 A Valid Valid Shift quadwords in xmm1 right by amount specified in xmm2/m128 while shifting in 0s.
PAGE 154
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F FA /r PSUBD xmm1, xmm2/m128 A Valid Valid Subtract packed doubleword integers in xmm2/mem128 from packed doubleword integers in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 155
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F E9 /r PSUBSW xmm1, xmm2/m128 A Valid Valid Subtract packed signed word integers in xmm2/m128 from packed signed word integers in xmm1 and saturate results. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 156
Documentation Changes PTEST- Logical Compare Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 38 17 /r PTEST xmm1, xmm2/m128 A Valid Valid Set ZF if xmm2/m128 AND xmm1 result is all 0s. Set CF if xmm2/m128 AND NOT xmm1 result is all 0s. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r) ModRM:r/m (r) NA NA ...
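The two flag results described for PTEST can be modeled directly. In this sketch the operands are 128-bit Python integers and the function name `ptest` is hypothetical; it returns the pair (ZF, CF).

```python
MASK128 = (1 << 128) - 1

def ptest(xmm1, xmm2):
    """Model PTEST: return (ZF, CF) for 128-bit integer operands."""
    zf = int((xmm2 & xmm1) == 0)               # ZF: xmm2 AND xmm1 is all 0s
    cf = int((xmm2 & ~xmm1 & MASK128) == 0)    # CF: xmm2 AND NOT xmm1 is all 0s
    return zf, cf
```

In practice ZF answers "do the operands share no set bits?" and CF answers "are all set bits of the second operand also set in the first?".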
PAGE 157
Documentation Changes PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ/PUNPCKLQDQ— Unpack Low Data Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 60 /r PUNPCKLBW mm, mm/m32 A Valid Valid Interleave low-order bytes from mm and mm/m32 into mm. 66 0F 60 /r PUNPCKLBW xmm1, xmm2/m128 A Valid Valid Interleave low-order bytes from xmm1 and xmm2/m128 into xmm1. 0F 61 /r PUNPCKLWD mm, mm/m32 A Valid Valid Interleave low-order words from mm and mm/m32 into mm.
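A byte interleave of the low halves can be sketched as below. The name `punpcklbw` and the `bytes` representation (index 0 is the least significant byte) are assumptions for illustration; the same pattern applies to the word and doubleword forms with wider elements.

```python
def punpcklbw(dest: bytes, src: bytes) -> bytes:
    """Model PUNPCKLBW: interleave the low-order bytes of dest and src."""
    half = len(dest) // 2
    if len(src) < half:
        raise ValueError("src must supply at least half as many bytes as dest")
    out = bytearray()
    for d, s in zip(dest[:half], src[:half]):
        out += bytes([d, s])  # destination byte first, then source byte
    return bytes(out)
```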
PAGE 158
Documentation Changes PUSH—Push Word, Doubleword or Quadword Onto the Stack Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode FF /6 PUSH r/m16 A Valid Valid Push r/m16. FF /6 PUSH r/m32 A N.E. Valid Push r/m32. FF /6 PUSH r/m64 A Valid N.E. Push r/m64. Default operand size 64-bits. 50+rw PUSH r16 B Valid Valid Push r16. 50+rd PUSH r32 B N.E. Valid Push r32. 50+rd PUSH r64 B Valid N.E. Push r64. Default operand size 64-bits.
PAGE 159
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r) NA NA NA B reg (r) NA NA NA C imm8/16/32 NA NA NA D NA NA NA NA ... PUSHA/PUSHAD—Push All General-Purpose Registers Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 60 PUSHA A Invalid Valid Push AX, CX, DX, BX, original SP, BP, SI, and DI. 60 PUSHAD A Invalid Valid Push EAX, ECX, EDX, EBX, original ESP, EBP, ESI, and EDI.
PAGE 160
Documentation Changes PXOR—Logical Exclusive OR Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F EF /r PXOR mm, mm/m64 A Valid Valid Bitwise XOR of mm/m64 and mm. 66 0F EF /r PXOR xmm1, xmm2/m128 A Valid Valid Bitwise XOR of xmm2/m128 and xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 161
Documentation Changes Opcode** Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode C1 /2 ib RCL r/m32, imm8 C Valid Valid Rotate 33 bits (CF, r/m32) left imm8 times. REX.W + C1 /2 ib RCL r/m64, imm8 C Valid N.E. Rotate 65 bits (CF, r/m64) left imm8 times. Uses a 6 bit count. D0 /3 RCR r/m8, 1 A Valid Valid Rotate 9 bits (CF, r/m8) right once. REX + D0 /3 RCR r/m8*, 1 A Valid N.E. Rotate 9 bits (CF, r/m8) right once.
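The "33 bits (CF, r/m32)" wording means the carry flag acts as an extra top bit of the rotated quantity. The sketch below models the 32-bit RCL under that reading; the function name is hypothetical, the count is masked to 5 bits as for other 32-bit shifts and rotates, and the OF result is not modeled.

```python
def rcl32(value, cf, count):
    """Model RCL r/m32: rotate the 33-bit quantity (CF, value) left."""
    count &= 0x1F                                # 5-bit count for 32-bit operands
    wide = ((cf & 1) << 32) | (value & 0xFFFFFFFF)
    wide = ((wide << count) | (wide >> (33 - count))) & ((1 << 33) - 1)
    return wide & 0xFFFFFFFF, wide >> 32         # (new value, new CF)
```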
PAGE 162
Documentation Changes Opcode** Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode C0 /0 ib ROL r/m8, imm8 C Valid Valid Rotate 8 bits r/m8 left imm8 times. REX + C0 /0 ib ROL r/m8*, imm8 C Valid N.E. Rotate 8 bits r/m8 left imm8 times. D1 /0 ROL r/m16, 1 A Valid Valid Rotate 16 bits r/m16 left once. D3 /0 ROL r/m16, CL B Valid Valid Rotate 16 bits r/m16 left CL times. C1 /0 ib ROL r/m16, imm8 C Valid Valid Rotate 16 bits r/m16 left imm8 times.
PAGE 163
Documentation Changes Opcode** Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode REX.W + D1 /1 ROR r/m64, 1 A Valid N.E. Rotate 64 bits r/m64 right once. Uses a 6 bit count. D3 /1 ROR r/m32, CL B Valid Valid Rotate 32 bits r/m32 right CL times. REX.W + D3 /1 ROR r/m64, CL B Valid N.E. Rotate 64 bits r/m64 right CL times. Uses a 6 bit count. C1 /1 ib ROR r/m32, imm8 C Valid Valid Rotate 32 bits r/m32 right imm8 times. REX.W + C1 /1 ib ROR r/m64, imm8 C Valid N.E.
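The note "Uses a 6 bit count" distinguishes the 64-bit forms from the narrower ones, which mask the count to 5 bits. A minimal sketch of that difference, with a hypothetical function name and flags left out:

```python
def ror(value, count, width=32):
    """Model ROR for 8-, 16-, 32- or 64-bit operands (flags not modeled)."""
    count_mask = 0x3F if width == 64 else 0x1F   # 6-bit count only with REX.W
    count = (count & count_mask) % width
    mask = (1 << width) - 1
    value &= mask
    return ((value >> count) | (value << (width - count))) & mask
```

With a count of 32, the 32-bit form leaves the operand unchanged (the masked count is 0), while the 64-bit form rotates by 32 places.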
PAGE 164
Documentation Changes RCPSS—Compute Reciprocal of Scalar Single-Precision Floating-Point Values Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F3 0F 53 /r RCPSS xmm1, xmm2/m32 A Valid Valid Computes the approximate reciprocal of the scalar single-precision floating-point value in xmm2/m32 and stores the result in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) NA NA ...
PAGE 165
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA Description The EAX register is loaded with the low-order 32 bits. The EDX register is loaded with the supported high-order bits of the counter. The number of high-order bits loaded into EDX is implementation specific on processors that do not support architectural performance monitoring.
PAGE 166
Documentation Changes Table 4-2 Valid General and Special Purpose Performance Counter Index Range for RDPMC (Continued) Processor Family Displayed_Family_Displayed_Model/ Other Signatures Valid PMC Index Range General-purpose Counters Pentium M processors 06H_09H, 06H_0DH 0, 1 0, 1 64-bit Intel Xeon processors with L3 (0FH_03H, 0FH_04H) and (L3 is present) ≥ 0 and ≤ 25 ≥ 0 and ≤ 17 Intel® Core™ Solo and Intel® Core™ Duo processors, Dual-core Intel® Xeon® processor LV 06H_0EH 0, 1 0, 1 Intel®
PAGE 167
Documentation Changes The performance-monitoring counters are event counters that can be programmed to count events such as the number of instructions decoded, number of interrupts received, or number of cache loads. Appendix A, “Performance Monitoring Events,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, lists the events that can be counted for various processors in the Intel 64 and IA-32 architecture families.
PAGE 168
Documentation Changes ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *) #GP(0); FI; (* P6 family processors and Pentium processor with MMX technology *) IF (ECX = 0 or 1) and ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) THEN EAX ← PMC(ECX)[31:0]; EDX ← PMC(ECX)[39:32]; ELSE (* ECX is not 0 or 1 or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *) #GP(0); FI; (* Processors with CPUID family 15 *) IF ((CR4.PCE = 1) or (CPL = 0) or (CR0.
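The P6-family branch of the pseudocode above can be restated as a small model. This is a sketch under stated assumptions: the function and exception names are hypothetical, `pmc` is a two-element list standing in for the counters, and only the branch that is fully visible in the excerpt is covered.

```python
class GeneralProtectionFault(Exception):
    """Stands in for #GP(0) in this model."""

def rdpmc_p6(ecx, pmc, cr4_pce, cpl, cr0_pe):
    """Model RDPMC on P6 family processors: return (EAX, EDX) or raise #GP(0)."""
    allowed = cr4_pce == 1 or cpl == 0 or cr0_pe == 0
    if ecx in (0, 1) and allowed:
        counter = pmc[ecx]
        eax = counter & 0xFFFFFFFF        # PMC(ECX)[31:0]
        edx = (counter >> 32) & 0xFF      # PMC(ECX)[39:32]
        return eax, edx
    raise GeneralProtectionFault("#GP(0)")
```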
PAGE 169
Documentation Changes RDTSC—Read Time-Stamp Counter Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 31 RDTSC A Valid Valid Read time-stamp counter into EDX:EAX. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA Description Loads the current value of the processor’s time-stamp counter (a 64-bit MSR) into the EDX:EAX registers.
PAGE 170
Documentation Changes RDTSCP—Read Time-Stamp Counter and Processor ID Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 01 F9 RDTSCP A Valid Valid Read 64-bit time-stamp counter and 32-bit IA32_TSC_AUX value into EDX:EAX and ECX. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ...
PAGE 171
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F3 REX.W 6F REP OUTS DX, r/m32 A Valid N.E. Output RCX default size from [RSI] to port DX. F3 AC REP LODS AL A Valid Valid Load (E)CX bytes from DS:[(E)SI] to AL. F3 REX.W AC REP LODS AL A Valid N.E. Load RCX bytes from [RSI] to AL. F3 AD REP LODS AX A Valid Valid Load (E)CX words from DS:[(E)SI] to AX. F3 AD REP LODS EAX A Valid Valid Load (E)CX doublewords from DS:[(E)SI] to EAX.
PAGE 172
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F2 A6 REPNE CMPS m8, m8 A Valid Valid Find matching bytes in ES:[(E)DI] and DS:[(E)SI]. F2 REX.W A6 REPNE CMPS m8, m8 A Valid N.E. Find matching bytes in [RDI] and [RSI]. F2 A7 REPNE CMPS m16, m16 A Valid Valid Find matching words in ES:[(E)DI] and DS:[(E)SI]. F2 A7 REPNE CMPS m32, m32 A Valid Valid Find matching doublewords in ES:[(E)DI] and DS:[(E)SI]. F2 REX.
PAGE 173
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA B imm16 NA NA NA ... ROUNDPD — Round Packed Double Precision Floating-Point Values Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 3A 09 /r ib ROUNDPD xmm1, xmm2/m128, imm8 A Valid Valid Round packed double precision floating-point values in xmm2/m128 and place the result in xmm1. The rounding mode is determined by imm8.
PAGE 174
Documentation Changes ROUNDSD — Round Scalar Double Precision Floating-Point Values Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 3A 0B /r ib ROUNDSD xmm1, xmm2/m64, imm8 A Valid Valid Round the low packed double precision floating-point value in xmm2/m64 and place the result in xmm1. The rounding mode is determined by imm8. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) imm8 NA ...
PAGE 175
Documentation Changes RSQRTPS—Compute Reciprocals of Square Roots of Packed Single-Precision Floating-Point Values Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 52 /r RSQRTPS xmm1, xmm2/m128 A Valid Valid Computes the approximate reciprocals of the square roots of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1.
PAGE 176
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... SAL/SAR/SHL/SHR—Shift Opcode*** Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode D0 /4 SAL r/m8, 1 A Valid Valid Multiply r/m8 by 2, once. REX + D0 /4 SAL r/m8**, 1 A Valid N.E. Multiply r/m8 by 2, once. D2 /4 SAL r/m8, CL B Valid Valid Multiply r/m8 by 2, CL times. REX + D2 /4 SAL r/m8**, CL B Valid N.E. Multiply r/m8 by 2, CL times.
PAGE 177
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode D1 /7 SAR r/m16,1 A Valid Valid Signed divide* r/m16 by 2, once. D3 /7 SAR r/m16, CL B Valid Valid Signed divide* r/m16 by 2, CL times. C1 /7 ib SAR r/m16, imm8 C Valid Valid Signed divide* r/m16 by 2, imm8 times. D1 /7 SAR r/m32, 1 A Valid Valid Signed divide* r/m32 by 2, once. REX.W + D1 /7 SAR r/m64, 1 A Valid N.E. Signed divide* r/m64 by 2, once.
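The asterisk on "Signed divide*" matters for negative operands: an arithmetic right shift rounds toward negative infinity, whereas a truncating signed division (as IDIV performs) rounds toward zero. The sketch below contrasts the two; the helper names are hypothetical and values are passed as unsigned bit patterns.

```python
def to_signed(value, width):
    """Interpret a width-bit pattern as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value

def sar(value, count, width=16):
    """Model SAR: arithmetic right shift, result as a width-bit pattern."""
    count &= 0x3F if width == 64 else 0x1F
    # Python's >> on negative ints floors, matching sign-filling behavior.
    return (to_signed(value, width) >> count) & ((1 << width) - 1)

def truncating_divide_by_two(value, width=16):
    """Divide by 2 rounding toward zero, for comparison with SAR."""
    signed = to_signed(value, width)
    quotient = abs(signed) // 2
    return (-quotient if signed < 0 else quotient) & ((1 << width) - 1)
```

For -9 (FFF7H as a word), shifting right once gives -5, while truncating division gives -4.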
PAGE 178
Documentation Changes Opcode*** Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode REX + D0 /5 SHR r/m8**, 1 A Valid N.E. Unsigned divide r/m8 by 2, once. D2 /5 SHR r/m8, CL B Valid Valid Unsigned divide r/m8 by 2, CL times. REX + D2 /5 SHR r/m8**, CL B Valid N.E. Unsigned divide r/m8 by 2, CL times. C0 /5 ib SHR r/m8, imm8 C Valid Valid Unsigned divide r/m8 by 2, imm8 times. REX + C0 /5 ib SHR r/m8**, imm8 C Valid N.E. Unsigned divide r/m8 by 2, imm8 times.
PAGE 179
Documentation Changes SBB—Integer Subtraction with Borrow Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 1C ib SBB AL, imm8 A Valid Valid Subtract with borrow imm8 from AL. 1D iw SBB AX, imm16 A Valid Valid Subtract with borrow imm16 from AX. 1D id SBB EAX, imm32 A Valid Valid Subtract with borrow imm32 from EAX. REX.W + 1D id SBB RAX, imm32 A Valid N.E. Subtract with borrow sign-extended imm32 to 64 bits from RAX.
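The REX.W form sign-extends its 32-bit immediate before subtracting. A small model of that step, with hypothetical helper names and only the result and the outgoing carry (borrow) computed:

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide pattern to to_bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)

def sbb_rax_imm32(rax, imm32, cf):
    """Model SBB RAX, imm32: return (result, CF out)."""
    rax &= MASK64
    src = sign_extend(imm32, 32, 64)
    result = (rax - src - cf) & MASK64
    borrow = int(rax < src + cf)  # unsigned borrow out of bit 63
    return result, borrow
```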
PAGE 180
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 1B /r SBB r32, r/m32 D Valid Valid Subtract with borrow r/m32 from r32. REX.W + 1B /r SBB r64, r/m64 D Valid N.E. Subtract with borrow r/m64 from r64. NOTES: * In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.
PAGE 181
Documentation Changes NOTES: * In 64-bit mode, only 64-bit (RDI) and 32-bit (EDI) address sizes are supported. In non-64-bit mode, only 32-bit (EDI) and 16-bit (DI) address sizes are supported. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... SETcc—Set Byte on Condition Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 97 SETA r/m8 A Valid Valid Set byte if above (CF=0 and ZF=0). REX + 0F 97 SETA r/m8* A Valid N.E.
PAGE 182
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode REX + 0F 9E SETLE r/m8* A Valid N.E. Set byte if less or equal (ZF=1 or SF≠ OF). 0F 96 SETNA r/m8 A Valid Valid Set byte if not above (CF=1 or ZF=1). REX + 0F 96 SETNA r/m8* A Valid N.E. Set byte if not above (CF=1 or ZF=1). 0F 92 SETNAE r/m8 A Valid Valid Set byte if not above or equal (CF=1). REX + 0F 92 SETNAE r/m8* A Valid N.E. Set byte if not above or equal (CF=1).
PAGE 183
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode REX + 0F 99 SETNS r/m8* A Valid N.E. Set byte if not sign (SF=0). 0F 95 SETNZ r/m8 A Valid Valid Set byte if not zero (ZF=0). REX + 0F 95 SETNZ r/m8* A Valid N.E. Set byte if not zero (ZF=0). 0F 90 SETO r/m8 A Valid Valid Set byte if overflow (OF=1). REX + 0F 90 SETO r/m8* A Valid N.E. Set byte if overflow (OF=1). 0F 9A SETP r/m8 A Valid Valid Set byte if parity (PF=1).
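The flag conditions quoted in the SETcc rows can be collected into a lookup table. The sketch below covers only the mnemonics visible in this excerpt; the dictionary, the function name, and the flag-dictionary representation are assumptions for illustration.

```python
# Each predicate takes a dict with keys CF, ZF, SF, OF, PF (values 0 or 1).
SETCC_CONDITIONS = {
    "SETA":   lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "SETLE":  lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
    "SETNA":  lambda f: f["CF"] == 1 or f["ZF"] == 1,
    "SETNAE": lambda f: f["CF"] == 1,
    "SETNS":  lambda f: f["SF"] == 0,
    "SETNZ":  lambda f: f["ZF"] == 0,
    "SETO":   lambda f: f["OF"] == 1,
    "SETP":   lambda f: f["PF"] == 1,
}

def setcc(mnemonic, flags):
    """Return the byte (0 or 1) that the named SETcc form would store."""
    return int(SETCC_CONDITIONS[mnemonic](flags))
```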
PAGE 184
Documentation Changes Description Performs a serializing operation on all store-to-memory instructions that were issued prior to the SFENCE instruction. This serializing operation guarantees that every store instruction that precedes the SFENCE instruction in program order becomes globally visible before any store instruction that follows the SFENCE instruction.
PAGE 185
Documentation Changes Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F A5 SHLD r/m32, r32, CL B Valid Valid Shift r/m32 to left CL places while shifting bits from r32 in from the right. REX.W + 0F A5 SHLD r/m64, r64, CL B Valid N.E. Shift r/m64 to left CL places while shifting bits from r64 in from the right.
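"Shifting bits from r32 in from the right" means the vacated low bits of the destination are filled with the high bits of the source. A minimal 32-bit model, with a hypothetical function name, the count masked to 5 bits, and flags not modeled:

```python
def shld32(dest, src, count):
    """Model SHLD r/m32, r32, CL: shift dest left, filling from the top of src."""
    dest &= 0xFFFFFFFF
    src &= 0xFFFFFFFF
    count &= 0x1F
    if count == 0:
        return dest  # a zero count leaves the destination unchanged
    return ((dest << count) | (src >> (32 - count))) & 0xFFFFFFFF
```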
PAGE 186
Documentation Changes SHUFPD—Shuffle Packed Double-Precision Floating-Point Values Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F C6 /r ib SHUFPD xmm1, xmm2/m128, imm8 A Valid Valid Shuffle packed double-precision floating-point values selected by imm8 from xmm1 and xmm2/m128 to xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) imm8 NA ...
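The imm8 selection for SHUFPD uses only two bits: bit 0 picks the destination element that becomes the low result, and bit 1 picks the source element that becomes the high result. A sketch with registers modeled as (low, high) tuples; the function name is hypothetical.

```python
def shufpd(dest, src, imm8):
    """Model SHUFPD on (low, high) pairs of double-precision values."""
    low = dest[1] if imm8 & 0x1 else dest[0]   # bit 0 selects from xmm1
    high = src[1] if imm8 & 0x2 else src[0]    # bit 1 selects from xmm2/m128
    return (low, high)
```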
PAGE 187
Documentation Changes SLDT—Store Local Descriptor Table Register Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 00 /0 SLDT r/m16 A Valid Valid Stores segment selector from LDTR in r/m16. REX.W + 0F 00 /0 SLDT r64/m16 A Valid Valid Stores segment selector from LDTR in r64/m16. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (w) NA NA NA ...
PAGE 188
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) NA NA ... SQRTPS—Compute Square Roots of Packed Single-Precision Floating-Point Values Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 51 /r SQRTPS xmm1, xmm2/m128 A Valid Valid Computes square roots of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1.
PAGE 189
Documentation Changes SQRTSS—Compute Square Root of Scalar Single-Precision Floating-Point Value Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F3 0F 51 /r SQRTSS xmm1, xmm2/m32 A Valid Valid Computes square root of the low single-precision floating-point value in xmm2/m32 and stores the results in xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (w) ModRM:r/m (r) NA NA ...
PAGE 190
Documentation Changes STI—Set Interrupt Flag Opcode* Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode FB STI A Valid Valid Set interrupt flag; external, maskable interrupts enabled at the end of the next instruction. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ...
PAGE 191
Documentation Changes Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode AA STOSB A Valid Valid For legacy mode, store AL at address ES:(E)DI; For 64-bit mode store AL at address RDI or EDI. AB STOSW A Valid Valid For legacy mode, store AX at address ES:(E)DI; For 64-bit mode store AX at address RDI or EDI. AB STOSD A Valid Valid For legacy mode, store EAX at address ES:(E)DI; For 64-bit mode store EAX at address RDI or EDI. REX.W + AB STOSQ A Valid N.E.
PAGE 192
Documentation Changes SUB—Subtract Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 2C ib SUB AL, imm8 A Valid Valid Subtract imm8 from AL. 2D iw SUB AX, imm16 A Valid Valid Subtract imm16 from AX. 2D id SUB EAX, imm32 A Valid Valid Subtract imm32 from EAX. REX.W + 2D id SUB RAX, imm32 A Valid N.E. Subtract imm32 sign-extended to 64 bits from RAX. 80 /5 ib SUB r/m8, imm8 B Valid Valid Subtract imm8 from r/m8. REX + 80 /5 ib SUB r/m8*, imm8 B Valid N.
PAGE 193
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A AL/AX/EAX/RAX imm8/16/32 NA NA B ModRM:r/m (r, w) imm8/16/32 NA NA C ModRM:r/m (r, w) ModRM:reg (r) NA NA D ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 194
Documentation Changes SUBSD—Subtract Scalar Double-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode F2 0F 5C /r SUBSD xmm1, xmm2/m64 A Valid Valid Subtracts the low double-precision floating-point values in xmm2/mem64 from xmm1. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 195
Documentation Changes SYSCALL—Fast System Call Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 05 SYSCALL A Valid Invalid Fast call to privilege level 0 system procedures. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... SYSENTER—Fast System Call Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 34 SYSENTER A Valid Valid Fast call to privilege level 0 system procedures.
PAGE 196
Documentation Changes
SS.BASE ← 0; (* Flat segment *)
SS.LIMIT ← FFFFFH; (* 4-GByte limit *)
SS.ARbyte.G ← 1; (* 4-KByte granularity *)
SS.ARbyte.S ← 1;
SS.ARbyte.TYPE ← 0011B; (* Read/Write, Accessed *)
SS.ARbyte.D ← 1; (* 32-bit stack segment *)
SS.ARbyte.DPL ← 0;
SS.SEL.RPL ← 0;
SS.ARbyte.P ← 1;
ESP ← SYSENTER_ESP_MSR;
EIP ← SYSENTER_EIP_MSR;
...
PAGE 197
Documentation Changes TEST—Logical Compare Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode A8 ib TEST AL, imm8 A Valid Valid AND imm8 with AL; set SF, ZF, PF according to result. A9 iw TEST AX, imm16 A Valid Valid AND imm16 with AX; set SF, ZF, PF according to result. A9 id TEST EAX, imm32 A Valid Valid AND imm32 with EAX; set SF, ZF, PF according to result. REX.W + A9 id TEST RAX, imm32 A Valid N.E.
PAGE 198
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A AL/AX/EAX/RAX imm8/16/32 NA NA B ModRM:r/m (r) imm8/16/32 NA NA C ModRM:r/m (r) ModRM:reg (r) NA NA ...
PAGE 199
Documentation Changes UD2—Undefined Instruction Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 0B UD2 A Valid Valid Raise invalid opcode exception. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ...
PAGE 200
Documentation Changes UNPCKLPD—Unpack and Interleave Low Packed Double-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 14 /r UNPCKLPD xmm1, xmm2/m128 A Valid Valid Unpacks and Interleaves double-precision floating-point values from low quadwords of xmm1 and xmm2/m128. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:reg (r, w) ModRM:r/m (r) NA NA ...
PAGE 201
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r) NA NA NA B NA NA NA NA ... WAIT/FWAIT—Wait Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 9B WAIT A Valid Valid Check pending unmasked floating-point exceptions. 9B FWAIT A Valid Valid Check pending unmasked floating-point exceptions.
PAGE 202
Documentation Changes WRMSR—Write to Model Specific Register Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 30 WRMSR A Valid Valid Write the value in EDX:EAX to MSR specified by ECX. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... XADD—Exchange and Add Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F C0 /r XADD r/m8, r8 A Valid Valid Exchange r8 and r/m8; load sum into r/m8.
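The XADD description ("exchange, then load the sum into the destination") can be modeled for the byte form as follows. The function name is hypothetical and flags are not modeled; the function returns the new destination and the new source.

```python
def xadd8(dest, src):
    """Model XADD r/m8, r8: return (new r/m8, new r8)."""
    total = (dest + src) & 0xFF   # the sum wraps at 8 bits
    return total, dest & 0xFF     # source register receives the old destination
```

Because the source ends up holding the old destination value, the instruction is commonly used (with a LOCK prefix) as a fetch-and-add primitive.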
PAGE 203
Documentation Changes XCHG—Exchange Register/Memory with Register Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 90+rw XCHG AX, r16 A Valid Valid Exchange r16 with AX. 90+rw XCHG r16, AX B Valid Valid Exchange AX with r16. 90+rd XCHG EAX, r32 A Valid Valid Exchange r32 with EAX. REX.W + 90+rd XCHG RAX, r64 A Valid N.E. Exchange r64 with RAX. 90+rd XCHG r32, EAX B Valid Valid Exchange EAX with r32. REX.W + 90+rd XCHG r64, RAX B Valid N.E.
PAGE 204
Documentation Changes XGETBV—Get Value of Extended Control Register Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F 01 D0 XGETBV A Valid Valid Reads an XCR specified by ECX into EDX:EAX. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A NA NA NA NA ... XLAT/XLATB—Table Look-up Translation Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode D7 XLAT m8 A Valid Valid Set AL to memory byte DS:[(E)BX + unsigned AL].
PAGE 205
Documentation Changes XOR—Logical Exclusive OR Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 34 ib XOR AL, imm8 A Valid Valid AL XOR imm8. 35 iw XOR AX, imm16 A Valid Valid AX XOR imm16. 35 id XOR EAX, imm32 A Valid Valid EAX XOR imm32. REX.W + 35 id XOR RAX, imm32 A Valid N.E. RAX XOR imm32 (sign-extended). 80 /6 ib XOR r/m8, imm8 B Valid Valid r/m8 XOR imm8. REX + 80 /6 ib XOR r/m8*, imm8 B Valid N.E. r/m8 XOR imm8.
PAGE 206
Documentation Changes Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A AL/AX/EAX/RAX imm8/16/32 NA NA B ModRM:r/m (r, w) imm8/16/32 NA NA C ModRM:r/m (r, w) ModRM:reg (r) NA NA D ModRM:reg (r, w) ModRM:r/m (r) NA NA ... XORPD—Bitwise Logical XOR for Double-Precision Floating-Point Values Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 66 0F 57 /r XORPD xmm1, xmm2/m128 A Valid Valid Bitwise exclusive-OR of xmm2/m128 and xmm1.
PAGE 207
Documentation Changes XRSTOR—Restore Processor Extended States Opcode Instruction Op/ En 64-Bit Mode Compat/ Description Leg Mode 0F AE /5 XRSTOR mem A Valid Valid Restore processor extended states from memory. The states are specified by EDX:EAX. Instruction Operand Encoding Op/En Operand 1 Operand 2 Operand 3 Operand 4 A ModRM:r/m (r) NA NA NA ...
PAGE 208
Documentation Changes 3. Updates to Chapter 4, Volume 3A Change bars show changes to Chapter 4 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1. -----------------------------------------------------------------------------------------... Table 4-1 illustrates the key differences between the three paging modes. Table 4-1 Properties of Different Paging Modes Paging Mode CR0.PG CR4.
PAGE 209
Documentation Changes • PAT: page-attribute table. If CPUID.01H:EDX.PAT [bit 16] = 1, the 8-entry page-attribute table (PAT) is supported. When the PAT is supported, three bits in certain paging-structure entries select a memory type (used to determine type of caching used) from the PAT (see Section 4.9). • PSE-36: 36-Bit page size extension. If CPUID.01H:EDX.
PAGE 210
Documentation Changes said to reference the other paging structure; in the latter, the entry is said to map a page. The first paging structure used for any translation is located at the physical address in CR3. A linear address is translated using the following iterative procedure. A portion of the linear address (initially the uppermost bits) selects an entry in a paging structure (initially the one located using CR3).
PAGE 211
Documentation Changes Paging structures are given different names based on their uses in the translation process. Table 4-2 gives the names of the different paging structures. It also provides, for each structure, the source of the physical address used to locate it (CR3 or a different paging-structure entry); the bits in the linear address used to select an entry from the structure; and details about whether and how such an entry can map a page. ...
PAGE 212
Documentation Changes The page-directory-pointer-table comprises four (4) 64-bit entries called PDPTEs. Each PDPTE controls access to a 1-GByte region of the linear-address space. Corresponding to the PDPTEs, the logical processor maintains a set of four (4) internal, non-architectural PDPTE registers, called PDPTE0, PDPTE1, PDPTE2, and PDPTE3.
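Since each of the four PDPTEs covers a 1-GByte region of a 32-bit linear-address space, the entry in use is determined by the top two bits of the linear address. A minimal sketch (the function name is an assumption for illustration):

```python
def pae_pdpte_index(linear_address):
    """Return which of the four PAE PDPTEs (0..3) covers a 32-bit linear address."""
    return (linear_address >> 30) & 0x3   # bits 31:30; each entry spans 1 GByte
```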
PAGE 213
Documentation Changes 3. Reserved fields must be 0. 4. If IA32_EFER.NXE = 0 and the P flag of a PDE or a PTE is 1, the XD flag (bit 63) is reserved. ... Table 4-8. Format of a PAE Page-Directory-Pointer-Table Entry (PDPTE) Bit Position(s) Contents 0 (P) Present; must be 1 to reference a page directory 2:1 Reserved (must be 0) 3 (PWT) Page-level write-through; indirectly determines the memory type used to access the page directory referenced by this entry (see Section 4.
PAGE 214
Documentation Changes 4.5 IA-32E PAGING A logical processor uses IA-32e paging if CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 1. With IA-32e paging, linear addresses are translated using a hierarchy of in-memory paging structures located using the contents of CR3. IA-32e paging translates 48-bit linear addresses to 52-bit physical addresses.1 Although 52 bits corresponds to 4 PBytes, linear addresses are limited to 48 bits; at most 256 TBytes of linear-address space may be accessed at any given time.
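The 48-bit linear address is consumed in four 9-bit index fields followed by a 12-bit page offset when 4-KByte pages are used. The sketch below splits an address that way and checks the size arithmetic quoted above; the function and constant names are hypothetical, and large-page cases (where the walk stops early) are not modeled.

```python
LINEAR_ADDRESS_BITS = 48
PHYSICAL_ADDRESS_BITS = 52

def split_linear_address(linear):
    """Return (PML4 index, PDPT index, PD index, PT index, page offset)."""
    linear &= (1 << LINEAR_ADDRESS_BITS) - 1
    pml4 = (linear >> 39) & 0x1FF   # bits 47:39
    pdpt = (linear >> 30) & 0x1FF   # bits 38:30
    pd = (linear >> 21) & 0x1FF     # bits 29:21
    pt = (linear >> 12) & 0x1FF     # bits 20:12
    offset = linear & 0xFFF         # bits 11:0
    return pml4, pdpt, pd, pt, offset

LINEAR_SPACE_BYTES = 1 << LINEAR_ADDRESS_BITS      # 256 TBytes
PHYSICAL_SPACE_BYTES = 1 << PHYSICAL_ADDRESS_BITS  # 4 PBytes
```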
PAGE 215
Documentation Changes Because a PDPTE is identified using bits 47:30 of the linear address, it controls access to a 1-GByte region of the linear-address space. Use of the PDPTE depends on its PS flag (bit 7):1 ... • If the PDPTE’s PS flag is 1, the PDPTE maps a 1-GByte page (see Table 4-14).
PAGE 216
Documentation Changes Table 4-14 Format of an IA-32e Page-Directory-Pointer-Table Entry (PDPTE) that Maps a 1-GByte Page (Continued) Bit Position(s) Contents 63 (XD) If IA32_EFER.NXE = 1, execute-disable (if 1, instruction fetches are not allowed from the 1-GByte page controlled by this entry; see Section 4.6); otherwise, reserved (must be 0) NOTES: 1. The PAT is supported on all processors that support IA-32e paging. — Bits 51:30 are from the PDPTE. — Bits 29:0 are from the original linear address.
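The composition of the final physical address for a 1-GByte page (bits 51:30 from the PDPTE, bits 29:0 from the linear address) can be written as a pair of masks. This is a sketch with a hypothetical function name; it does not check the P flag, the PS flag, or reserved bits.

```python
FRAME_MASK_1G = ((1 << 52) - 1) & ~((1 << 30) - 1)   # bits 51:30
OFFSET_MASK_1G = (1 << 30) - 1                        # bits 29:0

def physical_address_1g(pdpte, linear):
    """Combine a 1-GByte-page PDPTE with a linear address."""
    return (pdpte & FRAME_MASK_1G) | (linear & OFFSET_MASK_1G)
```

Masking with bits 51:30 also discards the XD flag (bit 63) and the low flag bits of the entry.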
PAGE 217
Documentation Changes — Bits 51:12 are from the PDPTE. — Bits 11:3 are bits 29:21 of the linear address. — Bits 2:0 are all 0 ... If a paging-structure entry’s P flag (bit 0) is 0 or if the entry sets any reserved bit, the entry is used neither to reference another paging-structure entry nor to map a page. A reference using a linear address whose translation would use such a paging-structure entry causes a page-fault exception (see Section 4.7).
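The three-part address listed at the top of this page (bits 51:12 from the PDPTE, bits 11:3 from bits 29:21 of the linear address, bits 2:0 zero) locates an 8-byte entry in the page directory. A minimal sketch, with a hypothetical function name and no validity checks on the PDPTE:

```python
def pde_physical_address(pdpte, linear):
    """Compute the physical address of the PDE selected by a linear address."""
    table_base = pdpte & 0x000FFFFFFFFFF000   # bits 51:12 from the PDPTE
    index = (linear >> 21) & 0x1FF            # bits 29:21 of the linear address
    return table_base | (index << 3)          # 8-byte entries, so bits 2:0 are 0
```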
PAGE 218
Documentation Changes Figure 4-11. Formats of CR3 and Paging-Structure Entries with IA-32e Paging [Figure: bit-field layouts, bits 63:0, of CR3 and of the paging-structure entries. Fields shown include XD, the address of the PML4 table, the address of the page-directory-pointer table, the address of the 1GB page frame, and the address of the page directory, together with reserved and ignored ranges.]
PAGE 219
Documentation Changes 4.7 PAGE-FAULT EXCEPTIONS Accesses using linear addresses may cause page-fault exceptions (#PF; exception 14). An access to a linear address may cause a page-fault exception for either of two reasons: (1) there is no valid translation for the linear address; or (2) there is a valid translation for the linear address, but its access rights do not permit the access. As noted in Section 4.3, Section 4.4.2, and Section 4.
PAGE 220
Documentation Changes The PAT is a 64-bit MSR (IA32_PAT; MSR index 277H) comprising eight (8) 8-bit entries (entry i comprises bits 8i+7:8i of the MSR). For any access to a physical address, the table combines the memory type specified for that physical address by the MTRRs with a memory type selected from the PAT. Table 11-11 in Section 11.12.3 specifies how a memory type is selected from the PAT.
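The "entry i comprises bits 8i+7:8i" rule translates into a shift and a mask. In the sketch below the function name is hypothetical and the 64-bit value used in the usage note is only an example input, not something read from hardware.

```python
def pat_entry(ia32_pat, i):
    """Extract 8-bit entry i (0..7) from a 64-bit IA32_PAT value."""
    if not 0 <= i <= 7:
        raise ValueError("the PAT has eight entries, numbered 0 to 7")
    return (ia32_pat >> (8 * i)) & 0xFF   # bits 8i+7:8i
```

For the example value 0007040600070406H, entry 0 is 06H, entry 1 is 04H, entry 2 is 07H, and entry 3 is 00H.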
PAGE 221
Documentation Changes — If the translation does use a PTE, the page size is 4 KBytes and the page number comprises bits 47:12 of the linear address. ... 4.10.1.2 Caching Translations in TLBs The processor may accelerate the paging process by caching individual translations in translation lookaside buffers (TLBs). Each entry in a TLB is an individual translation. Each translation is referenced by a page number.
PAGE 222
Documentation Changes while the lower bits come from the linear address of the access for which the translation is created. There is no way for software to be aware that multiple translations for smaller pages have been used for a large page. If software modifies the paging structures so that the page size used for a 4-KByte range of linear addresses changes, the TLBs may subsequently contain multiple translations for the address range (one for each page size).
PAGE 223
Documentation Changes • PDPTE cache (IA-32e paging only).1 Each PDPTE-cache entry is referenced by an 18-bit value and is used for linear addresses for which bits 47:30 have that value. The entry contains information from the PML4E and PDPTE used to translate such linear addresses: — The physical address from the PDPTE (the address of the page directory). (No PDPTE-cache entry is created for a PDPTE that maps a 1-GByte page.) — The logical-AND of the R/W flags in the PML4E and the PDPTE.
PAGE 224
Documentation Changes • If the nature of the paging structures is such that a single entry may be used for multiple purposes (see Section 4.10.2.3), software should perform invalidations for all of these purposes. For example, if a single entry might serve as both a PDE and PTE, it may be necessary to execute INVLPG with two (or more) linear addresses, one that uses the entry as a PDE and one that uses it as a PTE. (Alternatively, software could use MOV to CR3 or MOV to CR4.) • As noted in Section 4.10.
PAGE 225
Documentation Changes • If a paging-structure entry is modified to change the accessed flag from 1 to 0, failure to perform an invalidation may result in the processor not setting that bit in response to a subsequent access to a linear address whose translation uses the entry. Software cannot interpret the bit being clear as an indication that such an access has not occurred.
PAGE 226
Documentation Changes In some cases, the consequences of delayed invalidation may not affect software adversely. For example, when freeing a portion of the linear-address space (by marking paging-structure entries “not present”), invalidation using INVLPG may be delayed if software does not re-allocate that portion of the linear-address space or the memory that had been associated with it.
PAGE 227
4. Updates to Chapter 5, Volume 3A
Change bars show changes to Chapter 5 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1.
-----------------------------------------------------------------------------------------
...
5.3 LIMIT CHECKING
The limit field of a segment descriptor prevents programs or procedures from addressing memory locations outside the segment.
PAGE 228
Documentation Changes by privilege level 0 operating system or executive procedures for fast returns to privilege level 3 user code. Stack pointers for SYSCALL/SYSRET are not specified through model specific registers. The clearing of bits in RFLAGS is programmable rather than fixed. SYSCALL/SYSRET save and restore the RFLAGS register.
PAGE 229
5. Updates to Chapter 8, Volume 3A
Change bars show changes to Chapter 8 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1.
-----------------------------------------------------------------------------------------
...
8.1 LOCKED ATOMIC OPERATIONS
The 32-bit IA-32 processors support locked atomic operations on locations in system memory.
PAGE 230
Documentation Changes • Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within a cache line Accesses to cacheable memory that are split across bus widths, cache lines, and page boundaries are not guaranteed to be atomic by the Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors.
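Whether an access "fits within a cache line" is a simple address calculation. The following Python sketch shows the check; the 64-byte line size is an assumption for illustration (the actual size is processor-specific), and the function name is hypothetical.

```python
def fits_within_cache_line(address: int, size: int, line_size: int = 64) -> bool:
    """True if the bytes [address, address + size) lie in a single cache line.

    line_size = 64 is an assumed value; query the processor for the real one.
    """
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return first_line == last_line
```

An access for which this returns False is split across cache lines and falls under the "not guaranteed to be atomic" case described above.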
PAGE 231
Documentation Changes Software should access semaphores (shared memory used for signalling between multiple processors) using identical addresses and operand lengths. For example, if one processor accesses a semaphore using a word access, other processors should not access the semaphore using a byte access. NOTE Do not implement semaphores using the WC memory type. Do not perform non-temporal stores to a cache line containing a location used to implement a semaphore.
PAGE 232
Execute a serializing instruction; (* For example, CPUID instruction *)
Execute new code;
The use of one of these options is not required for programs intended to run on the Pentium or Intel486 processors, but is recommended to ensure compatibility with the P6 and more recent processor families. Self-modifying code will execute at a lower level of performance than non-self-modifying or normal code.
PAGE 233
Documentation Changes automatically prevents two or more processors that have cached the same area of memory from simultaneously modifying data in that area. ... 8.2.1 Memory Ordering in the Intel® Pentium® and Intel486™ Processors The Pentium and Intel486 processors follow the processor-ordered memory model; however, they operate as strongly-ordered processors under most circumstances.
PAGE 234
• LFENCE instructions cannot pass earlier reads.
• SFENCE instructions cannot pass earlier writes.
• MFENCE instructions cannot pass earlier reads or writes.
...
8.2.4.2 Examples Illustrating Memory-Ordering Principles for String Operations
The following examples use the same notation and convention as described in Section 8.2.3.1.
PAGE 235
Documentation Changes • The page attribute table (PAT) can be used to strengthen memory ordering for a specific page or group of pages (see Section 11.12, “Page Attribute Table (PAT)”). The PAT is available only in the Pentium 4, Intel Xeon, and Pentium III processors. These mechanisms can be used as follows: Memory mapped devices and other I/O devices on the bus are often sensitive to the order of writes to their I/O buffers.
PAGE 236
Documentation Changes applied to an address range dedicated to memory mapped I/O devices to force strong memory ordering. • For areas of memory where weak ordering is acceptable, the write back (WB) memory type can be chosen. Here, reads can be performed speculatively and writes can be buffered and combined.
PAGE 237
• Privileged serializing instructions — INVD, INVEPT, INVLPG, INVVPID, LGDT, LIDT, LLDT, LTR, MOV (to control register, with the exception of MOV CR8), MOV (to debug register), WBINVD, and WRMSR.
• Non-privileged serializing instructions — CPUID, IRET, and RSM.
When the processor serializes instruction execution, it ensures that all pending memory transactions are completed (including writes stored in its store buffer) before it executes the next instruction.
PAGE 238
1. Waits on the BIOS initialization Lock Semaphore. When control of the semaphore is attained, initialization continues.
2. Loads the microcode update into the processor.
3. Initializes the MTRRs (using the same mapping that was used for the BSP).
4. Enables the cache.
5. Executes the CPUID instruction with a value of 0H in the EAX register, then reads the EBX, ECX, and EDX registers to determine if the AP is “GenuineIntel.”
6.
PAGE 239
6. Updates to Chapter 10, Volume 3A
Change bars show changes to Chapter 10 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1.
-----------------------------------------------------------------------------------------
...
10.3 THE INTEL® 82489DX EXTERNAL APIC, THE APIC, THE XAPIC, AND THE X2APIC
The local APIC in the P6 family and Pentium processors is an architectural subset of the Intel® 82489DX external APIC.
PAGE 240
NOTE
In processors based on Intel Microarchitecture (Nehalem) the Local APIC ID Register is no longer Read/Write; it is Read Only.
Table 10-1 Local APIC Register Address Map
Address      Register Name                 Software Read/Write
FEE0 0000H   Reserved
FEE0 0010H   Reserved
FEE0 0020H   Local APIC ID Register        Read/Write.
FEE0 0030H   Local APIC Version Register   Read Only.
PAGE 241
Table 10-1 Local APIC Register Address Map (Continued)
Address      Register Name                                   Software Read/Write
FEE0 01F0H   Trigger Mode Register (TMR); bits 255:224       Read Only.
FEE0 0200H   Interrupt Request Register (IRR); bits 31:0     Read Only.
FEE0 0210H   Interrupt Request Register (IRR); bits 63:32    Read Only.
FEE0 0220H   Interrupt Request Register (IRR); bits 95:64    Read Only.
FEE0 0230H   Interrupt Request Register (IRR); bits 127:96   Read Only.
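The IRR rows in Table 10-1 follow a regular pattern: each 32-bit slice of the 256-bit register sits 10H above the previous one, starting at FEE0 0200H. The Python sketch below computes which register and bit hold a given vector. It assumes the base address FEE0 0000H shown in the table and that IRR bit n corresponds to vector n; the names are hypothetical.

```python
APIC_BASE = 0xFEE0_0000   # base address used in Table 10-1 (assumed not relocated)
IRR_OFFSET = 0x200        # IRR bits 31:0


def irr_location(vector: int):
    """Return (register address, bit position) of the IRR bit for a vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in 0..255")
    address = APIC_BASE + IRR_OFFSET + 0x10 * (vector // 32)
    return address, vector % 32
```

For example, vector 100 falls in the "bits 127:96" slice, which the table places at FEE0 0230H.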
PAGE 242
Suppress EOI-broadcasts: Indicates whether software can inhibit the broadcast of EOI messages by setting bit 12 of the Spurious Interrupt Vector Register; see Section 10.8.5 and Section 10.9.
Figure 10-7. Local APIC Version Register
Address: FEE0 0030H
Value after reset: 00BN 00VVH (V = Version, N = # of LVT entries minus 1, B = 1 if EOI-broadcast suppression supported)
Bits 31:25   Reserved
Bit 24       Support for EOI-broadcast suppression
Bits 23:16   Max LVT Entry
Bits 15:8    Reserved
Bits 7:0     Version
... 10.
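The field layout of Figure 10-7 can be decoded with a few shifts and masks. The Python sketch below works on a 32-bit value that is assumed to have already been read from the Local APIC Version Register; the function and key names are hypothetical.

```python
def decode_apic_version(value: int) -> dict:
    """Split a Local APIC Version Register value into its fields."""
    max_lvt_entry = (value >> 16) & 0xFF          # bits 23:16, N = entries minus 1
    return {
        "version": value & 0xFF,                  # bits 7:0
        "max_lvt_entry": max_lvt_entry,
        "lvt_entries": max_lvt_entry + 1,
        "eoi_broadcast_suppression": bool((value >> 24) & 1),  # bit 24
    }
```

The reserved fields (bits 31:25 and 15:8) are deliberately ignored rather than checked.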
PAGE 243
Documentation Changes thermal monitor register and its associated interrupt were introduced in the Pentium 4 and Intel Xeon processors. As shown in Figure 10-8, some of these fields and flags are not available (and reserved) for some entries.
PAGE 244
Documentation Changes when the local APIC sets one of the error bits in the ESR. The LVT error register allows selection of the interrupt vector to be delivered to the processor core when APIC error is detected. The LVT error register also provides a means of masking an APIC error interrupt. The ESR is a write/read register. A write (of any value) to the ESR must be done to update the register before attempting to read it.
PAGE 245
Table 10-2. ESR Flags
FLAG                     Function
Send Checksum Error      (P6 family and Pentium processors only) Set when the local APIC detects a checksum error for a message that it sent on the APIC bus.
Receive Checksum Error   (P6 family and Pentium processors only) Set when the local APIC detects a checksum error for a message that it received on the APIC bus.
Send Illegal Vector      Set when the local APIC detects an illegal vector in the message that it is sending.
PAGE 246
Documentation Changes ... ... 10.6.1 Interrupt Command Register (ICR) The interrupt command register (ICR) is a 64-bit local APIC register (see Figure 10-12) that allows software running on the processor to specify and send interprocessor interrupts (IPIs) to other processors in the system. To send an IPI, software must set up the ICR to indicate the type of IPI message to be sent and the destination processor or processors.
PAGE 247
Documentation Changes — Destination Mode — Selects one of two destination modes (physical or logical). — Destination Field — In physical destination mode, used to specify the APIC ID of the destination processor; in logical destination mode, used to specify a message destination address (MDA) that can be used to select specific processors in clusters. — Destination Shorthand — A quick method of specifying all processors, all excluding self, or self as the destination.
PAGE 248
Figure 10-21 EOI Register (bits 31:0; Address: 0FEE0 00B0H; Value after reset: 0H)
Upon receiving an EOI, the APIC clears the highest priority bit in the ISR and dispatches the next highest priority interrupt to the processor. If the terminated interrupt was a level-triggered interrupt, the local APIC also sends an end-of-interrupt message to all I/O APICs.
PAGE 249
Documentation Changes priority level is established when the MOV CR8 instruction completes execution. Software does not need to force serialization after loading the TPR using MOV CR8. Use of the MOV CRn instruction requires a privilege level of 0. Programs running at privilege level greater than 0 cannot read or write the TPR. An attempt to do so causes a general-protection exception.
PAGE 250
NOTE
Do not program an LVT or IOAPIC RTE with a spurious vector even if you set the mask bit. A spurious vector ISR does not do an EOI. If for some reason an interrupt is generated by an LVT or RTE entry, the bit in the in-service register will be left set for the spurious vector.
PAGE 251
• Uses MSR programming interface to access APIC registers in x2APIC mode instead of memory-mapped interfaces. Memory-mapped interface is supported when operating in xAPIC mode.
10.12.1 Detecting and Enabling x2APIC Mode
Processor support for x2APIC mode can be detected by executing CPUID with EAX=1 and then checking bit 21 of ECX. If CPUID.(EAX=1):ECX.21 is set, the processor supports the x2APIC capability and can be placed into the x2APIC mode.
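The detection step reduces to testing one bit of the ECX value returned by CPUID leaf 1. The Python sketch below shows only that bit test; it assumes the ECX value has been obtained elsewhere (Python cannot execute CPUID directly), and the function name is hypothetical.

```python
X2APIC_FEATURE_BIT = 21  # CPUID.(EAX=1):ECX.21


def supports_x2apic(cpuid_leaf1_ecx: int) -> bool:
    """True if the ECX value from CPUID leaf 1 reports x2APIC support."""
    return bool((cpuid_leaf1_ecx >> X2APIC_FEATURE_BIT) & 1)
```

A set bit indicates only that the capability exists; placing the local APIC into x2APIC mode is a separate step.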
PAGE 252
each register is available on the page referenced by IA32_APIC_BASE[35:12] in xAPIC mode. There is a one-to-one mapping between the x2APIC MSRs and the legacy xAPIC register offsets with the following exceptions:
• The Destination Format Register (DFR): The DFR, supported at offset 0E0H in xAPIC mode, is not supported in x2APIC mode. There is no MSR with address 80EH.
PAGE 253
MSR Address     MMIO Offset    Register Name                             MSR R/W Semantics
(x2APIC mode)   (xAPIC mode)
815H            150H           ISR bits 191:160                          Read-only
816H            160H           ISR bits 223:192                          Read-only
817H            170H           ISR bits 255:224                          Read-only
818H            180H           Trigger Mode Register (TMR); bits 31:0    Read-only
819H            190H           TMR bits 63:32                            Read-only
81AH            1A0H           TMR bits 95:64                            Read-only
81BH            1B0H           TMR bits 127:96                           Read-only
81CH            1C0H           TMR bits 159:128                          Read-only
81DH            1D0H           TMR bits 191:160                          Read-only
81EH            1E0H           TMR bi
PAGE 254
MSR Address     MMIO Offset     Register Name                                    MSR R/W Semantics   Comments
(x2APIC mode)   (xAPIC mode)
837H            370H            LVT Error register                               Read/write          See Figure 10-8 for reserved bits.
838H            380H            Initial Count register (for Timer)               Read/write
839H            390H            Current Count register (for Timer)               Read-only
83EH            3E0H            Divide Configuration Register (DCR; for Timer)   Read/write          See Figure 10-10 for reserved bits.
83FH            Not available   SELF IPI                                         Write-only          Available only in x2APIC mode.
NOTES: 1.
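The rows shown above follow a simple rule: the MSR address is 800H plus the xAPIC MMIO offset divided by 16. The Python sketch below applies that rule and models only the DFR exception stated in the text; other exceptions in the full table are not modeled, and the function name is hypothetical.

```python
X2APIC_MSR_BASE = 0x800
DFR_MMIO_OFFSET = 0x0E0  # no MSR 80EH exists for the DFR


def x2apic_msr_for_offset(mmio_offset: int) -> int:
    """Map an xAPIC MMIO register offset to its x2APIC MSR address."""
    if mmio_offset % 0x10 != 0:
        raise ValueError("APIC register offsets are multiples of 10H")
    if mmio_offset == DFR_MMIO_OFFSET:
        raise ValueError("DFR is not supported in x2APIC mode")
    return X2APIC_MSR_BASE + (mmio_offset >> 4)
```

For instance, offset 370H (LVT Error register) maps to MSR 837H, matching the table.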
PAGE 255
Documentation Changes 10.12.2 x2APIC Register Availability The local APIC registers can be accessed via the MSR interface only when the local APIC has been switched to the x2APIC mode as described in Section 10.12.1. Accessing any APIC register in the MSR address range 0800H through 0BFFH via RDMSR or WRMSR when the local APIC is not in x2APIC mode causes a general-protection exception.
PAGE 256
Documentation Changes 10.12.5 x2APIC State Transitions This section provides a detailed description of the x2APIC states of a local x2APIC unit, transitions between these states as well as interactions of these states with INIT and RESET. 10.12.5.
PAGE 257
enumerating topology. The presence of CPUID leaf 0BH in a processor does not guarantee support for x2APIC. If CPUID.EAX=0BH, ECX=0H:EBX returns zero and maximum input value for basic CPUID information is greater than 0BH, then CPUID.0BH leaf is not supported on that processor. The extended topology enumeration leaf is intended to assist software with enumerating processor topology on systems that require 32-bit x2APIC IDs to address individual logical processors.
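The support test for leaf 0BH can be written as a small predicate. The Python sketch below takes values that are assumed to have been read with CPUID beforehand: the maximum basic leaf (EAX from leaf 0) and EBX from leaf 0BH, sub-leaf 0. Treating a maximum leaf below 0BH as "not supported" is a conservative assumption added here, and the function name is hypothetical.

```python
def leaf_0bh_supported(max_basic_leaf: int, leaf_0bh_subleaf0_ebx: int) -> bool:
    """Decide whether CPUID leaf 0BH is usable for topology enumeration."""
    if max_basic_leaf < 0x0B:
        return False  # assumption: the leaf cannot be queried meaningfully
    # Per the text, EBX == 0 for sub-leaf 0 means the leaf is not supported.
    return leaf_0bh_subleaf0_ebx != 0
```

A True result says nothing about x2APIC support, which is reported separately through CPUID.(EAX=1):ECX.21.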
PAGE 258
10.12.9 ICR Operation in x2APIC Mode
In x2APIC mode, the layout of the Interrupt Command Register is shown in Figure 10-12. The lower 32 bits of ICR in x2APIC mode are identical to the lower half of the ICR in xAPIC mode, except the Delivery Status bit is removed since it is not needed in x2APIC mode. The destination ID field is expanded to 32 bits in x2APIC mode.
PAGE 259
Documentation Changes 10.12.10 Determining IPI Destination in x2APIC Mode 10.12.10.1 Logical Destination Mode in x2APIC Mode In x2APIC mode, the Logical Destination Register (LDR) is increased to 32 bits wide. It is a read-only register to system software. This 32-bit value is referred to as “logical x2APIC ID”. System software accesses this register via the RDMSR instruction reading the MSR at address 80DH. Figure 10-30 provides the layout of the Logical Destination Register in x2APIC mode.
PAGE 260
10.12.10.2 Deriving Logical x2APIC ID from the Local x2APIC ID
In x2APIC mode, the 32-bit logical x2APIC ID, which can be read from LDR, is derived from the 32-bit local x2APIC ID. Specifically, the 16-bit logical ID sub-field is derived by shifting 1 by the lowest 4 bits of the x2APIC ID, i.e. Logical ID = 1 << x2APIC ID[3:0].
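The derivation can be sketched in a few lines of Python. The logical ID sub-field follows the formula in the text; the cluster ID part (x2APIC ID bits 19:4 placed in the upper 16 bits) is not visible in this excerpt and is included here as an assumption based on the full manual. Function names are hypothetical.

```python
def logical_id_subfield(x2apic_id: int) -> int:
    """16-bit logical ID: 1 shifted left by x2APIC ID[3:0]."""
    return 1 << (x2apic_id & 0xF)


def logical_x2apic_id(x2apic_id: int) -> int:
    """32-bit logical x2APIC ID (cluster ID in bits 31:16 is an assumption)."""
    cluster_id = (x2apic_id >> 4) & 0xFFFF  # x2APIC ID[19:4]
    return (cluster_id << 16) | logical_id_subfield(x2apic_id)
```

Because the logical ID is a one-hot value, the IDs of up to 16 processors in one cluster can be ORed together to address several of them at once.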
PAGE 261
7. Updates to Chapter 15, Volume 3A
Change bars show changes to Chapter 15 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1.
-----------------------------------------------------------------------------------------
...
Table 15-7 lists overwrite rules for uncorrected errors, corrected errors, and uncorrected recoverable errors.
PAGE 262
8. Updates to Chapter 21, Volume 3B
Change bars show changes to Chapter 21 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
21.1 OVERVIEW
A logical processor uses virtual-machine control data structures (VMCSs) while it is in VMX operation.
PAGE 263
Documentation Changes The VMPTRST instruction stores the address of the logical processor’s current VMCS into a specified memory location (it stores the value FFFFFFFF_FFFFFFFFH if there is no current VMCS). The launch state of a VMCS determines which VM-entry instruction should be used with that VMCS: the VMLAUNCH instruction requires a VMCS whose launch state is “clear”; the VMRESUME instruction requires a VMCS whose launch state is “launched”.
PAGE 264
Figure 21-1 States of VMCS X [state diagram; states: Inactive, Not Current, Clear; Active, Not Current, Clear; Active, Not Current, Launched; Active, Current, Clear; Active, Current, Launched; transition labels: VMCLEAR X, VMPTRLD X, VMPTRLD Y, VMLAUNCH, Anything Else]
...
21.10 SOFTWARE USE OF THE VMCS AND RELATED STRUCTURES
This section details guidelines that software should observe when using a VMCS and related structures.
PAGE 265
Documentation Changes data of an active VMCS on the processor and not in the VMCS region. The following items detail some of the hazards of accessing VMCS data using ordinary memory operations: • Any data read from a VMCS with an ordinary memory read does not reliably reflect the state of the VMCS. Results may vary from time to time or from logical processor to logical processor. • Writing to a VMCS with an ordinary memory write is not guaranteed to have a deterministic effect on the VMCS.
PAGE 266
Documentation Changes The following software usage is consistent with these limitations: • VMCLEAR should be executed for a VMCS before it is used for VM entry for the first time. • VMLAUNCH should be used for the first VM entry using a VMCS after VMCLEAR has been executed for that VMCS. • VMRESUME should be used for any subsequent VM entry using a VMCS (until the next execution of VMCLEAR for the VMCS). It is expected that, in general, VMRESUME will have lower latency than VMLAUNCH.
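The usage rules above amount to a small state machine over the VMCS launch state. The Python sketch below is a toy model of that ordering only: it does not execute any VMX instruction, it signals misuse with an exception rather than modeling the processor's actual failure behavior, and the class and method names are hypothetical.

```python
class VmcsLaunchModel:
    """Toy model of the launch state of one VMCS."""

    def __init__(self):
        self.launch_state = None  # undefined until VMCLEAR is executed

    def vmclear(self):
        self.launch_state = "clear"

    def vmlaunch(self):
        if self.launch_state != "clear":
            raise RuntimeError("VMLAUNCH requires launch state 'clear'")
        self.launch_state = "launched"

    def vmresume(self):
        if self.launch_state != "launched":
            raise RuntimeError("VMRESUME requires launch state 'launched'")


vmcs = VmcsLaunchModel()
vmcs.vmclear()   # before the first VM entry
vmcs.vmlaunch()  # first VM entry after VMCLEAR
vmcs.vmresume()  # any subsequent VM entry
```

Calling `vmresume()` right after `vmclear()` raises in this model, mirroring the rule that VMRESUME is only for entries after a VMLAUNCH.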
PAGE 267
9. Updates to Chapter 22, Volume 3B
Change bars show changes to Chapter 22 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
22.1.1 Relative Priority of Faults and VM Exits
The following principles describe the ordering between existing faults and VM exits:
• Certain exceptions have priority over VM exits.
PAGE 268
10. Updates to Chapter 25, Volume 3B
Change bars show changes to Chapter 25 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
25.2.2 EPT Translation Mechanism
...
Because a PDPTE is identified using bits 47:30 of the guest-physical address, it controls access to a 1-GByte region of the guest-physical-address space.
PAGE 269
Documentation Changes — Bits 63:52 are all 0. — Bits 51:30 are from the EPT PDPTE. — Bits 29:0 are from the original guest-physical address. • If bit 7 of the EPT PDPTE is 0, a 4-KByte naturally aligned EPT page directory is located at the physical address specified in bits 51:12 of the EPT PDPTE (see Table 25-3). An EPT page-directory comprises 512 64-bit entries (PDEs). An EPT PDE is selected using the physical address defined as follows: — Bits 63:52 are all 0. — Bits 51:12 are from the EPT PDPTE.
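The first bit list on this page (bits 51:30 from the EPT PDPTE, bits 29:0 from the original guest-physical address) describes how an address is assembled for a 1-GByte page. The Python sketch below shows that composition; reading the list as the 1-GByte-page case (bit 7 of the PDPTE set) is an assumption drawn from the full manual, since the preceding sentence is cut off in this excerpt. The function name is hypothetical.

```python
FRAME_MASK_1GB = ((1 << 52) - 1) & ~((1 << 30) - 1)  # bits 51:30
OFFSET_MASK_1GB = (1 << 30) - 1                      # bits 29:0


def ept_1gb_physical_address(ept_pdpte: int, guest_physical: int) -> int:
    """Combine an EPT PDPTE and a guest-physical address for a 1-GByte page."""
    # Bits 63:52 of the result are 0 because both masks exclude them.
    return (ept_pdpte & FRAME_MASK_1GB) | (guest_physical & OFFSET_MASK_1GB)
```

Masking the entry discards its low flag bits and its upper bits, so only the frame address contributes to the result.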
PAGE 270
11. Updates to Chapter 27, Volume 3B
Change bars show changes to Chapter 27 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
27.3 MANAGING VMCS REGIONS AND POINTERS
A VMM must observe necessary procedures when working with a VMCS, the associated VMCS pointer, and the VMCS region.
PAGE 271
[Figure, two panels. (a) VMX Operation and VMX Transitions; labels: Processor Operation, VMXON, VM Entry, VM Exit, VMXOFF; legend: VMX Root Operation, Outside VMX Operation, VMX Non-Root Operation. (b) State of VMCS and VMX Operation; labels: VMCS A, VMCS B, VMPTRLD A, VMPTRLD B, VMLAUNCH, VMRESUME, VM Exit, VMCLEAR A, VMCLEAR B; legend: Inactive VMCS, Current VMCS (working), Active VMCS (not current), Current VM
PAGE 272
12. Updates to Chapter 30, Volume 3B
Change bars show changes to Chapter 30 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
30.2.3 Pre-defined Architectural Performance Events
...
A processor that supports architectural performance monitoring may not support all the predefined architectural performance events (Table 30-1).
PAGE 273
Documentation Changes the IA32_PEBS_ENABLE register for the respective counter, the software must also initialize the DS_BUFFER_MANAGEMENT_AREA data structure in memory to support capturing PEBS records for precise events. ... 30.14.1 Overview of Performance Monitoring with L3/Caching Bus Controller The facility for monitoring events consists of a set of dedicated model-specific registers (MSRs).
PAGE 274
13. Updates to Appendix A, Volume 3B
Change bars show changes to Appendix A of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
A.
PAGE 275
Table A-2 Non-Architectural Performance Events In the Processor Core for Intel Core i7 Processor and Intel Xeon Processor 5500 Series
Event Num.   Umask Value   Event Mask Mnemonic
04H          07H           SB_DRAIN.ANY
             01H           MEM_UNCORE_RETIRED.L3_DATA_MISS_UNKNOWN   Counts number of memory load instructions retired where the memory reference missed L3 and data source is unknown.   Available only for CPUID signature 06_2EH
             80H           MEM_UNCORE_RETIRED.
PAGE 276
Documentation Changes Non-architectural Performance monitoring events that are located in the uncore subsystem are implementation specific between different platforms using processors based on Intel microarchitecture (Nehalem). Processors with CPUID signature of DisplayFamily_DisplayModel 06_1AH, 06_1EH, and 06_1FH support performance events listed in Table A-3.
PAGE 277
Documentation Changes Non-Architectural Performance Events In Next Generation Processor Core (Codenamed Westmere) (Continued) B1H 3FH UOPS_EXECUTED.CO Counts number of cycles there are RE_ACTIVE_CYCLES one or more uops being executed on any ports. This is a core count only and can not be collected per thread. 01H OFF_CORE_RESPONS see Section 30.6.1.3, “Off-core E_0 Response Performance Monitoring in the Processor Core” 01H THREAD_ACTIVE ... B7H Requires programming MSR 01A6H ...
PAGE 278
0CH   04H   UNC_GQ_SNOOP.GOTO_S_HIT_M   Counts the number of remote snoops that have requested a cache line be set to the S state from M state.   Requires writing MSR 301H with mask = 1H
0CH   04H   UNC_GQ_SNOOP.GOTO_S_HIT_S   Counts the number of remote snoops that have requested a cache line be set to the S state from S state.   Requires writing MSR 301H with mask = 4H
0CH   08H   UNC_GQ_SNOOP.
PAGE 279
33H   07H   UNC_QHL_FRC_ACK_CNFLTS.ANY   Counts number of Force Acknowledge Conflict messages sent by the Quickpath Home Logic.
34H   01H   UNC_QHL_SLEEPS.IOH_ORDER     Counts number of occurrences a request was put to sleep due to IOH ordering (write after read) conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC
34H   02H   UNC_QHL_SLEEPS.
PAGE 280
35H   02H   UNC_ADDR_OPCODE_MATCH.REMOTE   Counts number of requests from the remote socket, address/opcode of request is qualified by mask value written to MSR 396H. The following mask values are supported:
            0: NONE
            40000000_00000000H: RSPFWDI
            40001A00_00000000H: RSPFWDS
            40001D00_00000000H: RSPIWB
            Comment: Match opcode/address by writing MSR 396H with a supported mask value
35H   04H   UNC_ADDR_OPCODE_MATCH.   Counts number of requests from the
PAGE 281
81H   02H   UNC_THERMAL_THROTTLED_TEMP.CORE_1   Cycles that the PCU records that core 1 is in the power throttled state due to core’s temperature being above the thermal throttling threshold.
81H   04H   UNC_THERMAL_THROTTLED_TEMP.CORE_2   Cycles that the PCU records that core 2 is in the power throttled state due to core’s temperature being above the thermal throttling threshold.
81H   08H   UNC_THERMAL_THROTTLED_TEMP.         Cycles that the PCU records that core
PAGE 282
86H   01H   UNC_CYCLES_UNHALTED_L3_FLL_DISABLE   Uncore cycles that at least one core is unhalted and all L3 ways are disabled.
...
...
Table A-7 Fixed-Function Performance Counter and Pre-defined Performance Events
Fixed-Function Performance Counter          Address   Event Mask Mnemonic   Description
MSR_PERF_FIXED_CTR0/IA32_PERF_FIXED_CTR0    309H      Inst_Retired.Any      This event counts the number of instructions that retire execution.
PAGE 283
14. Updates to Appendix B, Volume 3B
Change bars show changes to Appendix B of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
Table B-1.
PAGE 284
Documentation Changes Table B-2. IA-32 Architectural MSRs Register Address Hex Decimal Architectural MSR Name and bit fields (Former MSR Name) MSR/Bit Description Introduced as Architectural MSR ...
PAGE 285
Register Address   Register Address   Architectural MSR Name and    MSR/Bit Description   Introduced as
(Hex)              (Decimal)          bit fields (Former MSR Name)                        Architectural MSR
417H               1047               IA32_MC5_MISC                 MC5_MISC              06_0FH
418H               1048               IA32_MC6_CTL                  MC6_CTL               06_1DH
419H               1049               IA32_MC6_STATUS               MC6_STATUS            06_1DH
41AH               1050               IA32_MC6_ADDR                 MC6_ADDR              06_1DH
41BH               1051               IA32_MC6_MISC                 MC6_MISC              06_1DH
41CH               1052               IA32_MC7_CTL                  MC7_CTL               06_1AH
41DH               1053               IA32_MC7_STATUS               MC7_STATUS            06_1AH
41EH               1054               IA32_MC7_ADDR
PAGE 286
Register Address   Register Address   Architectural MSR Name and    MSR/Bit Description   Introduced as
(Hex)              (Decimal)          bit fields (Former MSR Name)                        Architectural MSR
43AH               1082               IA32_MC14_ADDR                MC14_ADDR             06_2EH
43BH               1083               IA32_MC14_MISC                MC14_MISC             06_2EH
43CH               1084               IA32_MC15_CTL                 MC15_CTL              06_2EH
43DH               1085               IA32_MC15_STATUS              MC15_STATUS           06_2EH
43EH               1086               IA32_MC15_ADDR                MC15_ADDR             06_2EH
43FH               1087               IA32_MC15_MISC                MC15_MISC             06_2EH
440H               1088               IA32_MC16_CTL                 MC16_CTL              06_2EH
441H               1089               IA3
PAGE 287
Table B-5 MSRs in Processors Based on Intel Microarchitecture (Nehalem) (Continued)
Register Address   Register Address   Register Name     Scope     Bit Description
(Hex)              (Dec)
...
1C8H               456                MSR_LBR_SELECT    Core      Last Branch Record Filtering Select Register (R/W) see Section 16.6.2, “Filtering of Last Branch Records.”
3B0H               960                MSR_UNCORE_PMC0   Package   See Section 30.6.2.2, “Uncore Performance Event Configuration Facility.”
3B1H               961                MSR_UNCORE_PMC1   Package   See Section 30.6.2.
PAGE 288
Register Address   Register Address   Register Name    Scope     Bit Description
(Hex)              (Dec)
407H               1031               MSR_MC1_MISC     Package   See Section 15.3.2.4, “IA32_MCi_MISC MSRs.”
...
40BH               1035               MSR_MC2_MISC     Core      See Section 15.3.2.4, “IA32_MCi_MISC MSRs.”
40CH               1036               MSR_MC3_CTL      Core      See Section 15.3.2.1, “IA32_MCi_CTL MSRs.”
40DH               1037               MSR_MC3_STATUS   Core      See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.”
40EH               1038               MSR_MC3_ADDR     Core      See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.
PAGE 289
Register Address   Register Address   Register Name    Scope     Bit Description
(Hex)              (Dec)
41EH               1054               MSR_MC7_ADDR     Package   See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.”
41FH               1055               MSR_MC7_MISC     Package   See Section 15.3.2.4, “IA32_MCi_MISC MSRs.”
420H               1056               MSR_MC8_CTL      Package   See Section 15.3.2.1, “IA32_MCi_CTL MSRs.”
421H               1057               MSR_MC8_STATUS   Package   See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” and Appendix E.
422H               1058               MSR_MC8_ADDR     Package   See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.
PAGE 290
Register Address   Register Address   Register Name     Scope     Bit Description
(Hex)              (Dec)
43CH               1084               MSR_MC15_CTL      Package   See Section 15.3.2.1, “IA32_MCi_CTL MSRs.”
43DH               1085               MSR_MC15_STATUS   Package   See Section 15.3.2.2, “IA32_MCi_STATUS MSRS.” and Appendix E.
43EH               1086               MSR_MC15_ADDR     Package   See Section 15.3.2.3, “IA32_MCi_ADDR MSRs.”
43FH               1087               MSR_MC15_MISC     Package   See Section 15.3.2.4, “IA32_MCi_MISC MSRs.”
440H               1088               MSR_MC16_CTL      Package   See Section 15.3.2.1, “IA32_MCi_CTL MSRs.
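The machine-check MSR addresses in these rows follow a regular layout: each bank i occupies four consecutive addresses starting at 400H + 4*i, in the order CTL, STATUS, ADDR, MISC. The Python sketch below reproduces the addresses shown in the rows above; it is only an aid for reading the table, it does not imply that every bank or register exists on a given processor, and the names are hypothetical.

```python
MC_BANK_BASE = 0x400
MC_REGISTER_INDEX = {"CTL": 0, "STATUS": 1, "ADDR": 2, "MISC": 3}


def mc_bank_msr(bank: int, register: str) -> int:
    """Return the MSR address of a machine-check bank register."""
    if bank < 0:
        raise ValueError("bank must be non-negative")
    return MC_BANK_BASE + 4 * bank + MC_REGISTER_INDEX[register]
```

For example, `mc_bank_msr(15, "CTL")` gives 43CH (1084 decimal), matching the MSR_MC15_CTL row.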
PAGE 291
B-5 MSRS IN THE NEXT GENERATION INTEL PROCESSOR (CODENAMED WESTMERE)
Next Generation Intel 64 processors (codenamed Westmere) support the MSR interfaces listed in Table B-5, plus additional MSRs listed in Table B-6.
PAGE 292
15. Updates to Appendix G, Volume 3B
Change bars show changes to Appendix G of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2.
-----------------------------------------------------------------------------------------
...
G.