User Guide

24592—Rev. 3.15—November 2009 AMD64 Technology

The 128-bit and 64-bit media instructions are designed to accelerate these applications. The instructions use a form of vector (or packed) parallel processing known as single-instruction, multiple-data (SIMD) processing. This vector technology has the following characteristics:

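• A single register can hold multiple independent pieces of data. For example, a single 128-bit XMM register can hold 16 8-bit integer data elements, or four 32-bit single-precision floating-point data elements.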

• The vector instructions can operate on all data elements in a register, independently and simultaneously. For example, a PADDB instruction operating on byte elements of two vector operands in 128-bit XMM registers performs 16 simultaneous additions and returns 16 independent results in a single operation.
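
The lane-by-lane behavior described above can be sketched in portable C. This is only a scalar model of the semantics, not how the hardware performs the operation; the type `xmm_bytes` and the function `paddb_model` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: 128 bits viewed as 16 independent byte lanes. */
typedef struct { uint8_t lane[16]; } xmm_bytes;

static_assert(sizeof(xmm_bytes) == 16, "an XMM register holds 128 bits");

/* Scalar model of a packed byte add such as PADDB: each byte lane is
   added independently, wrapping modulo 256, with no carry into the
   neighboring lane. The hardware does all 16 additions at once; this
   loop only reproduces the result. */
xmm_bytes paddb_model(xmm_bytes a, xmm_bytes b)
{
    xmm_bytes r;
    for (int i = 0; i < 16; i++)
        r.lane[i] = (uint8_t)(a.lane[i] + b.lane[i]);
    return r;
}
```

In this model, a lane holding 250 added to a lane holding 10 wraps to 4, and the lanes on either side are unaffected.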

128-bit and 64-bit media instructions take SIMD vector technology a step further by including special instructions that perform operations commonly found in media applications. For example, a graphics application that adds the brightness values of two pixels must prevent the add operation from wrapping around to a small value if the result overflows the destination register, because an overflow result can produce unexpected effects such as a dark pixel where a bright one is expected. The 128-bit and 64-bit media instructions include saturating-arithmetic instructions to simplify this type of operation. A result that otherwise would wrap around due to overflow or underflow is instead forced to saturate at the largest or smallest value that can be represented in the destination register.
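
The difference between wrapping and saturating addition can be illustrated with a minimal C sketch operating on single 8-bit values. The function names are hypothetical; the saturating versions model the per-element behavior of instructions such as PADDUSB (unsigned) and PADDSB (signed), which apply the same rule to every element of a vector.

```c
#include <stdint.h>

/* Wrapping add of two 8-bit brightness values: the result is taken
   modulo 256, so a bright sum can come out dark. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Unsigned saturating add: a sum above 255 is clamped to 255. */
uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Signed saturating add: the sum is clamped to the range -128..127,
   covering both overflow and underflow. */
int8_t add_sat_s8(int8_t a, int8_t b)
{
    int sum = a + b;
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return (int8_t)sum;
}
```

With brightness values 200 and 100, the wrapping add yields 44 (a dark pixel), while the saturating add yields 255 (full brightness).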

1.1.5 Floating-Point Instructions

The AMD64 architecture provides three floating-point instruction subsets, using three distinct register sets:

• 128-Bit Media Instructions support 32-bit single-precision and 64-bit double-precision floating-point operations, in addition to integer operations. Operations on both vector data and scalar data are supported, with a dedicated floating-point exception-reporting mechanism. These floating-point operations comply with the IEEE-754 standard.

• 64-Bit Media Instructions (the subset of 3DNow! technology instructions) support single-precision floating-point operations. Operations on both vector data and scalar data are supported, but these instructions do not support floating-point exception reporting.

• x87 Floating-Point Instructions support single-precision, double-precision, and 80-bit extended-precision floating-point operations. Only scalar data are supported, with a dedicated floating-point exception-reporting mechanism. The x87 floating-point instructions contain special instructions for performing trigonometric and logarithmic transcendental operations. The single-precision and double-precision floating-point operations comply with the IEEE-754 standard.

Maximum floating-point performance can be achieved using the 128-bit media instructions. One of these vector instructions can support up to four single-precision (or two double-precision) operations in parallel. In 64-bit mode, the AMD64 architecture doubles the number of legacy XMM registers from 8 to 16.
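
The "four single-precision or two double-precision" figure follows directly from the 128-bit register width, as the following C sketch suggests. It is a scalar model only, assuming the common case where `float` is 32 bits and `double` is 64 bits; the types `xmm_ps` and `xmm_pd` and the two functions are hypothetical names that mirror packed adds such as ADDPS and ADDPD.

```c
#include <assert.h>

/* Hypothetical views of one 128-bit register. */
typedef struct { float  lane[4]; } xmm_ps;  /* four single-precision lanes */
typedef struct { double lane[2]; } xmm_pd;  /* two double-precision lanes  */

static_assert(sizeof(xmm_ps) == 16, "4 x 32 bits = 128 bits");
static_assert(sizeof(xmm_pd) == 16, "2 x 64 bits = 128 bits");

/* Model of a packed single-precision add: four independent additions. */
xmm_ps addps_model(xmm_ps a, xmm_ps b)
{
    xmm_ps r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Model of a packed double-precision add: two independent additions. */
xmm_pd addpd_model(xmm_pd a, xmm_pd b)
{
    xmm_pd r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}
```

Halving the element width doubles the number of operations a single instruction can carry, which is why single-precision code has the higher peak throughput per instruction.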

Applications gain additional benefits using the 64-bit media and x87 instructions. The separate register sets supported by these instructions relieve pressure on the XMM registers available to the 128-bit