128 128-Bit Media and Scientific Programming

AMD64 Technology 24592—Rev. 3.15—November 2009

• Infinity

• Not a Number (NaN)

In common engineering and scientific usage, floating-point numbers (also called real numbers) are represented in base (radix) 10. A non-zero number consists of a sign, a normalized significand, and a signed exponent, as in:

+2.71828 e0

Both large and small numbers are representable in this notation, subject to the limits of data-type precision. For example, a million in base-10 notation appears as +1.00000 e6 and -0.0000383 is represented as -3.83000 e-5. A non-zero number can always be written in normalized form, that is, with a leading non-zero digit immediately before the decimal point. Thus, a normalized significand in base-10 notation is a number in the range [1,10). The signed exponent specifies the number of positions that the decimal point is shifted.
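
The base-10 normalization described above can be sketched in a few lines of Python. This is only an illustration; the helper name `normalize_base10` is hypothetical and is not part of the architecture or of any library.

```python
from decimal import Decimal

def normalize_base10(value: str):
    """Split a non-zero decimal string into (sign, significand, exponent),
    with the significand in the range [1, 10).
    Hypothetical helper, for illustration only."""
    d = Decimal(value)
    if d.is_zero():
        raise ValueError("zero has no normalized form")
    sign = "-" if d.is_signed() else "+"
    # adjusted() gives the power of ten of the leading non-zero digit,
    # i.e. the number of positions the decimal point is shifted.
    exponent = d.adjusted()
    # Shift the decimal point so one non-zero digit precedes it.
    significand = abs(d).scaleb(-exponent)
    return sign, significand, exponent

for text in ("2.71828", "1000000", "-0.0000383"):
    sign, significand, exponent = normalize_base10(text)
    print(f"{text} -> {sign}{significand} e{exponent}")
```

The significand always carries exactly one non-zero digit before the decimal point, and the exponent records how far the point was moved.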

Unlike the common engineering and scientific usage described above, 128-bit media floating-point numbers are represented in base (radix) 2. Like its base-10 counterpart, a normalized base-2 significand is written with its leading non-zero digit immediately to the left of the radix point. In base-2 arithmetic, a non-zero digit is always a one, so the range of a binary significand is [1,2):

+1.fraction ±exponent

The leading non-zero digit is called the integer bit. As shown in Figure 4-15 on page 126, the integer bit is omitted (and called the hidden integer bit) in the single-precision and the double-precision floating-point formats, because its implied value is always 1 in a normalized significand (0 in a denormalized significand), and the omission allows an extra bit of precision.
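
To make the hidden integer bit concrete, the following sketch splits a single-precision value into its fields and rebuilds a normalized value from them. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 biased-exponent bits, 23 fraction bits, bias 127); the field widths are not restated in this section, and the function names are hypothetical.

```python
import struct

FRACTION_BITS = 23   # assumed IEEE 754 single-precision fraction width
EXPONENT_BIAS = 127  # assumed IEEE 754 single-precision exponent bias

def decode_single(x: float):
    """Return (sign, biased_exponent, fraction) of x stored as single precision."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    biased_exponent = (bits >> FRACTION_BITS) & 0xFF
    fraction = bits & ((1 << FRACTION_BITS) - 1)
    return sign, biased_exponent, fraction

def rebuild_normalized(sign: int, biased_exponent: int, fraction: int) -> float:
    """Rebuild a normalized value; the hidden integer bit is supplied as 1."""
    significand = 1 + fraction / 2**FRACTION_BITS  # 1.fraction, in [1, 2)
    return (-1) ** sign * significand * 2.0 ** (biased_exponent - EXPONENT_BIAS)

fields = decode_single(-2.5)      # -2.5 = -1.25 x 2^1
print(fields)                     # sign, biased exponent, fraction
print(rebuild_normalized(*fields))
```

The stored fraction never contains the leading 1; `rebuild_normalized` adds it back, which is why the format gains one bit of precision. The rebuild step is valid only for normalized values.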

The following sections describe the number representations.

Normalized Numbers. Normalized floating-point numbers are the most frequent operands for 128-bit media instructions. These are finite, non-zero, positive or negative numbers in which the integer bit is 1, the biased exponent is non-zero and non-maximum, and the fraction is any representable value. Thus, the significand is within the range of [1, 2). Whenever possible, the processor represents a floating-point result as a normalized number.

Denormalized (Tiny) Numbers. Denormalized numbers (also called tiny numbers) are smaller than the smallest representable normalized numbers. They arise through an underflow condition, when the exponent of a result lies below the representable minimum exponent. These are finite, non-zero, positive or negative numbers in which the integer bit is 0, the biased exponent is 0, and the fraction is non-zero.
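
The field tests that distinguish normalized from denormalized numbers can be sketched as a small classifier. It assumes the IEEE 754 single-precision layout (8-bit biased exponent, 23-bit fraction); the zero, infinity, and NaN branches follow the same standard rather than the text of this section, and `classify_single` is a hypothetical name.

```python
import struct

def classify_single(x: float) -> str:
    """Classify x, stored as single precision, by its exponent and fraction fields."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased_exponent == 0:
        # Biased exponent 0: integer bit is 0, so the value is zero or denormalized.
        return "zero" if fraction == 0 else "denormalized"
    if biased_exponent == 0xFF:
        # Maximum biased exponent is reserved for infinity and NaN.
        return "infinity" if fraction == 0 else "NaN"
    # Non-zero, non-maximum biased exponent: integer bit is 1.
    return "normalized"

for value in (1.0, 1e-40, 0.0, float("inf")):
    print(value, classify_single(value))
```

The value 1e-40 lies below the smallest normalized single-precision number (about 1.18e-38), so it is stored with a biased exponent of 0 and a non-zero fraction.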

The processor generates a denormalized-operand exception (DE) when an instruction uses a denormalized source operand. The processor may generate an underflow exception (UE) when an instruction produces a rounded, non-zero result that is too small to be represented as a normalized floating-point number in the destination format, and thus is represented as a denormalized number. If a result, after rounding, is too small to be represented as the minimum denormalized number, it is represented as zero. (See “Exceptions” on page 177 for specific details.)
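
The two underflow outcomes described above (a denormalized result, or zero) can be imitated in software by rounding a double-precision product to single precision. This sketch only reproduces the resulting values under the default round-to-nearest behavior; it does not model the DE or UE exception flags, which Python cannot observe. The helper `round_to_single` and the constants are hypothetical names, and the limits assume the IEEE 754 single-precision format.

```python
import struct

MIN_NORMAL = 2.0 ** -126    # smallest normalized single-precision value (assumed format)
MIN_DENORMAL = 2.0 ** -149  # smallest denormalized single-precision value

def round_to_single(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 2^-100 * 2^-30 = 2^-130: below MIN_NORMAL, so it is kept as a denormalized number.
tiny = round_to_single(2.0 ** -100 * 2.0 ** -30)

# 2^-100 * 2^-60 = 2^-160: far below MIN_DENORMAL, so it rounds to zero.
vanished = round_to_single(2.0 ** -100 * 2.0 ** -60)

print(tiny, MIN_DENORMAL <= tiny < MIN_NORMAL)
print(vanished)
```

The first product survives with reduced precision as a denormalized number; the second is smaller than anything the destination format can hold and becomes zero.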