HP-UX Floating-Point Guide

Chapter 2
Floating-Point Principles and the IEEE Standard for Binary Floating-Point Arithmetic
Floating-Point Formats
Table 2-2 Minimum and Maximum Positive Denormalized Values
Infinity
Values that are larger in magnitude than the maximum-magnitude
normalized values are approximated by special bit patterns that
represent positive and negative infinity.
According to the IEEE standard, infinities are represented by setting all
the bits in the exponent field to 1 (value 255 for single-precision, 2047 for
double-precision, 32767 for quad-precision) and setting all the fraction bits
to 0. There are two infinity values: negative infinity if the sign
bit is 1, and positive infinity if the sign bit is 0.
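The encoding described above can be inspected directly. The following is a minimal sketch, assuming a C99 compiler (for the INFINITY macro and <stdint.h>) and a platform where float and double use the IEEE single- and double-precision formats. The function names are illustrative only and are not part of any HP-UX library.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a single-precision value.
   Assumes float is the IEEE single-precision format. */
uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* copy the bytes without aliasing problems */
    return bits;
}

/* Return the raw 64-bit pattern of a double-precision value.
   Assumes double is the IEEE double-precision format. */
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Nonzero if the exponent field is all ones (255) and the fraction is 0. */
int is_infinity_pattern32(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x7FFFFFu;
    return exponent == 0xFFu && fraction == 0;
}

/* Nonzero if the exponent field is all ones (2047) and the fraction is 0. */
int is_infinity_pattern64(uint64_t bits)
{
    uint64_t exponent = (bits >> 52) & 0x7FFu;
    uint64_t fraction = bits & 0xFFFFFFFFFFFFFULL;
    return exponent == 0x7FFu && fraction == 0;
}
```

Under these assumptions, float_bits(INFINITY) yields 7F80 0000 and float_bits(-INFINITY) yields FF80 0000: the two patterns differ only in the sign bit.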
The IEEE standard defines the properties of infinities. For example, it
defines what happens when you add a number to an infinity or subtract
one infinity from another. Table 2-3 shows some of these properties. The
term finite value in the table refers to any floating-point value other
than infinity or NaN (see “Not-a-Number (NaN)” on page 45 for
information about NaN values). For the multiplication and division
operators, the sign of the result is determined by the usual arithmetic
rules.
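A few of these properties can be checked in code. The sketch below is not a transcription of Table 2-3; it tests a handful of well-known IEEE results (infinity plus a finite value, infinity minus infinity, and the sign rules for multiplication and division). It assumes a C99 compiler, IEEE-conforming double arithmetic, and default compiler options (no fast-math style optimizations). The function name is hypothetical.

```c
#include <math.h>

/* Returns 1 if the sampled IEEE infinity properties hold, 0 otherwise. */
int infinity_properties_hold(void)
{
    /* volatile discourages the compiler from folding these at compile time */
    volatile double inf = INFINITY;
    volatile double x = 42.0;        /* an arbitrary finite value */

    double sum  = inf + x;           /* infinity + finite  -> +infinity */
    double diff = inf - inf;         /* infinity - infinity -> NaN      */
    double prod = inf * -x;          /* signs differ       -> -infinity */
    double quot = x / -inf;          /* finite / -infinity -> -0        */

    return isinf(sum) && sum > 0.0
        && isnan(diff)
        && isinf(prod) && prod < 0.0
        && quot == 0.0 && signbit(quot) != 0;
}
```

Subtracting one infinity from another has no meaningful value, which is why the result is a NaN rather than zero or an infinity.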
Precision  Values                Hexadecimal Representation    Value
Single     Minimum denormalized  0000 0001                     2^-149
           Maximum denormalized  007F FFFF                     2^-149 * (2^23 - 1)
           Minimum normalized    0080 0000                     2^-126
Double     Minimum denormalized  0000 0000 0000 0001           2^-1074
           Maximum denormalized  000F FFFF FFFF FFFF           2^-1074 * (2^52 - 1)
           Minimum normalized    0010 0000 0000 0000           2^-1022
Quad       Minimum denormalized  (24 zeros)…0000 0001          2^-16494
           Maximum denormalized  0000 FFFF…(24 more F’s)       2^-16494 * (2^112 - 1)
           Minimum normalized    0001 0000…(24 more zeros)     2^-16382
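The single- and double-precision entries of Table 2-2 can be reproduced by computing each value from its formula and comparing it with the value obtained from the hexadecimal pattern. This is a minimal sketch, assuming C99, IEEE single and double formats for float and double, and denormalized numbers that are not flushed to zero. Quad precision is omitted because long double is not the IEEE quad format on every platform. All function names are illustrative.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a single-precision value. */
float float_from_bits(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Reinterpret a 64-bit pattern as a double-precision value. */
double double_from_bits(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Single precision: ldexpf(m, e) computes m * 2^e exactly here. */
float single_min_denormal(void) { return ldexpf(1.0f, -149); }
float single_max_denormal(void) { return ldexpf(ldexpf(1.0f, 23) - 1.0f, -149); }
float single_min_normal(void)   { return ldexpf(1.0f, -126); }

/* Double precision: same formulas with the double-precision exponents. */
double double_min_denormal(void) { return ldexp(1.0, -1074); }
double double_max_denormal(void) { return ldexp(ldexp(1.0, 52) - 1.0, -1074); }
double double_min_normal(void)   { return ldexp(1.0, -1022); }
```

For example, float_from_bits(0x007FFFFF) compares equal to single_max_denormal(), and adding one to that bit pattern gives 0080 0000, the minimum normalized value: the denormalized range ends exactly where the normalized range begins.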