128-Bit Media and Scientific Programming 127

24592—Rev. 3.15—November 2009 AMD64 Technology

• Single-Precision Format. This format includes a 1-bit sign, an 8-bit biased exponent with a bias
of 127, and a 23-bit significand. The integer bit is implied, making a total of 24 bits in the
significand.

• Double-Precision Format. This format includes a 1-bit sign, an 11-bit biased exponent with a bias
of 1023, and a 52-bit significand. The integer bit is implied, making a total of 53 bits in the
significand.
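The field layout described above can be sketched in Python using the standard `struct` module. This is only an illustration: the helper names `decode_single`, `decode_double`, and `normal_value` are hypothetical, and "fraction" is used here for the stored significand bits (without the implied integer bit).

```python
import struct

def decode_single(x):
    """Split x, rounded to single precision, into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                  # 1-bit sign
    biased_exp = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF         # 23 stored significand bits
    return sign, biased_exp, fraction

def decode_double(x):
    """Split a double-precision x into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63                      # 1-bit sign
    biased_exp = (bits >> 52) & 0x7FF      # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)      # 52 stored significand bits
    return sign, biased_exp, fraction

def normal_value(sign, biased_exp, fraction, bias, frac_bits):
    """Rebuild a normal number: (-1)^sign * 1.fraction * 2^(biased_exp - bias)."""
    significand = 1 + fraction / (1 << frac_bits)   # restore the implied integer bit
    return (-1) ** sign * significand * 2.0 ** (biased_exp - bias)

print(decode_single(1.0))     # (0, 127, 0)
print(decode_double(-2.5))    # (1, 1024, 1125899906842624)
print(normal_value(*decode_double(-2.5), bias=1023, frac_bits=52))  # -2.5
```

The reconstruction in `normal_value` applies only to normal numbers, where the implied integer bit is 1.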

Table 4-3 on page 127 shows the range of finite values representable by the two floating-point data
types.

For example, in the single-precision format, the largest normal number representable has an exponent
of FEh and a significand of 7FFFFFh, with a numerical value of 2^127 * (2 - 2^-23). Results that
overflow above the maximum representable value return either the maximum representable normalized
number (see “Normalized Numbers” on page 128) or infinity, with the sign of the true result,
depending on the rounding mode specified in the rounding control (RC) field of the MXCSR register.
Results that underflow below the minimum representable value return either the minimum representable
normalized number or a denormalized number (see “Denormalized (Tiny) Numbers” on page 128), with the
sign of the true result, or a result determined by the SIMD floating-point exception handler,
depending on the rounding mode and the underflow-exception mask (UM) in the MXCSR register (see
“Unmasked Responses” on page 187).
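The following Python sketch checks the largest single-precision normal number from its bit pattern and shows one overflow and one underflow result. It covers only the default case, assuming round-to-nearest with exceptions masked. Python offers no access to the MXCSR register, so the other rounding modes and the unmasked response are not shown, and Python floats are double precision, so the overflow and underflow lines use the double-precision limits.

```python
import math
import struct
import sys

# Largest single-precision normal: exponent field FEh, significand field 7FFFFFh.
bits = (0xFE << 23) | 0x7FFFFF
(largest_single,) = struct.unpack("<f", struct.pack("<I", bits))
print(largest_single == 2.0**127 * (2 - 2.0**-23))   # True

# Double-precision results outside the normalized range (default environment assumed).
overflowed = sys.float_info.max * 2.0     # above the largest normal
negative_overflow = -sys.float_info.max * 2.0
underflowed = sys.float_info.min / 2.0    # below the smallest normal

print(math.isinf(overflowed))                     # True
print(negative_overflow)                          # -inf, sign of the true result
print(0.0 < underflowed < sys.float_info.min)     # True: a denormalized number
```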

Compatibility with x87 Floating-Point Data Types. The results produced by 128-bit media
floating-point instructions comply fully with the IEEE Standard for Binary Floating-Point Arithmetic
(ANSI/IEEE Std 754), because these instructions represent data in the single-precision or
double-precision data types throughout their operations. The x87 floating-point instructions,
however, by default perform operations in the double-extended-precision format. Because of this, x87
instructions operating on the same source operands as 128-bit media floating-point instructions may
return results that are slightly different in their least-significant bits.
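The effect of carrying intermediate results in a wider format can be illustrated by analogy. Python cannot select x87 or 128-bit media instructions, so the sketch below substitutes single precision for the narrow format and double precision for the wide one; it is not a reproduction of x87 behavior. The helper `to_single` is hypothetical and rounds a value to single precision through the `struct` module.

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

a, b, c = 1.0, 2.0**-24, 2.0**-24

# Narrow path: every intermediate result is rounded to single precision.
narrow = to_single(to_single(a + b) + c)

# Wide path: intermediates stay in double precision, with one rounding at the end.
wide = to_single((a + b) + c)

print(narrow == 1.0)                  # True
print(wide == 1.0 + 2.0**-23)         # True
print(wide - narrow == 2.0**-23)      # True: one unit in the last single-precision place
```

In the narrow path each addition of 2^-24 is a tie that rounds back to 1.0, while the wide path keeps both additions and the final result differs in its least-significant bit.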

4.4.7 Floating-Point Number Representation

A 128-bit media floating-point value can be one of five types, as follows:

• Normal

• Denormal (Tiny)

• Zero

Table 4-3. Range of Values in Normalized Floating-Point Data Types

Data Type           Range of Normalized Values
                    Base 2 (exact)                     Base 10 (approximate)
Single Precision    2^-126 to 2^127 * (2 - 2^-23)      1.17 * 10^-38 to +3.40 * 10^38
Double Precision    2^-1022 to 2^1023 * (2 - 2^-52)    2.23 * 10^-308 to +1.79 * 10^308

Note:
1. See “Floating-Point Number Representation” on page 127 for a definition of “normalized”.
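As a rough cross-check of Table 4-3, the base-2 bounds can be evaluated in Python and printed in base 10. The names `FORMATS` and `normalized_range` are hypothetical, and the computation relies on every bound being exactly representable as a Python float (double precision).

```python
import math
import sys

# (minimum exponent, maximum exponent, stored significand bits) for each format.
FORMATS = {
    "single": (-126, 127, 23),
    "double": (-1022, 1023, 52),
}

def normalized_range(e_min, e_max, frac_bits):
    """Return (smallest, largest) positive normalized values for a format."""
    smallest = 2.0**e_min
    largest = 2.0**e_max * (2 - 2.0**-frac_bits)
    return smallest, largest

for name, params in FORMATS.items():
    lo, hi = normalized_range(*params)
    print(f"{name}: {lo:.4e} to {hi:.4e}")
```

The printed values should agree with the base-10 column to within the last digit shown in the table.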