HP-UX Floating-Point Guide

42 Chapter 2

Floating-Point Principles and the IEEE Standard for Binary Floating-Point Arithmetic

Floating-Point Formats

A denormalized value is represented by a zero exponent field and a

nonzero fraction (if the fraction were also zero, the floating-point value

would be zero). You can compute the value of a denormalized number by

interpreting the fraction as an integer and then multiplying this integer

by 2^−149 for single-precision numbers, by 2^−1074 for double-precision

numbers, and by 2^−16494 for quad-precision numbers. The maximum

fraction is always 2^k − 1, where k is the number of bits in the fraction.

(Alternatively, you can compute the value by regarding the implicit bit as

0 and the exponent as 1 minus the bias.)
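
The rule above can be checked mechanically. The following C sketch applies it to single precision. It assumes that float is the 32-bit IEEE single-precision format; the function names are illustrative and are not part of any HP-UX library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if the 32-bit pattern encodes a single-precision denormalized
   value: exponent field (bits 30-23) zero, fraction (bits 22-0) nonzero. */
int is_single_denormal(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x007FFFFFu;
    return exponent == 0 && fraction != 0;
}

/* Value of a denormalized pattern: the fraction, read as an integer,
   times 2^-149. The product is exact in double precision. */
double single_denormal_value(uint32_t bits)
{
    uint32_t fraction = bits & 0x007FFFFFu;
    double magnitude = (double)fraction * 0x1p-149; /* C99 hex constant */
    return (bits & 0x80000000u) ? -magnitude : magnitude;
}

/* Reinterpret the same pattern as a float, for comparison.
   Assumption: float is 32-bit IEEE single precision. */
float single_from_bits(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

On a system that meets the stated assumption, single_denormal_value(0x00000001u) and (double)single_from_bits(0x00000001u) should compare equal, both being 2^−149.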

Denormalized values subdivide the gap between zero and the

smallest-magnitude normalized values, so that as values become smaller

they underflow gradually, with a progressively increasing loss of

accuracy, rather than dropping abruptly to zero.
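
One way to observe gradual underflow is to halve the smallest normalized value repeatedly. The sketch below does this for single precision. It assumes an IEEE single-precision float, round-to-nearest mode, and that the platform has not been switched into a flush-to-zero mode; both function names are hypothetical.

```c
#include <float.h>

/* Count how many times FLT_MIN (2^-126) can be halved before the result
   becomes zero. volatile keeps each intermediate in float precision and
   stops the compiler from folding the loop away. */
int halvings_until_zero(void)
{
    volatile float x = FLT_MIN;
    int steps = 0;
    while (x != 0.0f) {
        x = x / 2.0f;
        steps++;
    }
    return steps;
}

/* Nonzero if halving x and then doubling the result reproduces x exactly.
   A low-order bit lost in the denormalized range cannot be recovered. */
int survives_halve_and_double(float x)
{
    volatile float half = x / 2.0f;
    volatile float back = half * 2.0f;
    return back == x;
}
```

Under those assumptions, halvings_until_zero() should return 24: the 23 nonzero denormalized steps from 2^−127 down to 2^−149, plus the final step that rounds to zero. If denormalized results were flushed to zero, it would return 1 instead. The second function illustrates the loss of accuracy: a value such as FLT_MIN * (1.0f + FLT_EPSILON) does not survive the round trip, because halving it pushes its lowest fraction bit out of the format.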

In the range of representable values, normalized values flow smoothly

into denormalized values, but there is an increasing loss of accuracy as

denormalized values become smaller and smaller. Table 2-2 shows the

range of positive denormalized values. (The hexadecimal representation

of the equivalent negative values begins with the digit 8; for example, the

minimum negative denormalized value in single-precision is 8000 0001.)
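
The endpoints of the single-precision denormalized range, and the effect of the sign bit on the hexadecimal representation, can be inspected with a short C sketch. The values here are derived from the encoding described above rather than copied from the table. The sketch assumes a 32-bit IEEE single-precision float, and the function names are illustrative.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float.
   Assumption: float is 32-bit IEEE single precision. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Smallest positive denormalized value: fraction = 1, that is, 2^-149. */
float min_positive_denormal(void)
{
    return 0x1p-149f; /* C99 hexadecimal floating constant */
}

/* Largest positive denormalized value: one step of 2^-149 below FLT_MIN,
   so its fraction field is all ones (2^23 - 1). */
float max_positive_denormal(void)
{
    return FLT_MIN - 0x1p-149f;
}
```

Under that assumption, float_to_bits should yield 0000 0001 and 007F FFFF for the two positive endpoints, and 8000 0001 and 807F FFFF for their negations, since negation changes only the sign bit.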

When used as operands, denormalized values are treated exactly like

normalized values in most instances. When a denormalized value is the

result of an arithmetic operation, however, an underflow exception

condition may occur. See “Underflow Conditions” on page 57 for more

information about underflow exceptions. Also, you should be aware that

denormalized values can significantly degrade performance. This issue is

addressed in “Denormalized Operands” on page 182.