HP-UX Floating-Point Guide
Chapter 2
Floating-Point Principles and the IEEE Standard for Binary Floating-Point Arithmetic
Floating-Point Formats
In our example, this would be
(-1)^0 * 1.5 * 2^2 = 1.5 * 4.0 = 6.0
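The same arithmetic can be traced in a few lines of Python. This is only a minimal sketch: the helper name decode_single is hypothetical, the field widths (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits) are the single-precision ones, and only normalized values are handled.

```python
import struct

def decode_single(bits: int):
    """Split a 32-bit single-precision pattern into its fields and rebuild
    the value of a normalized number (hypothetical helper, sketch only)."""
    sign = bits >> 31                      # 1 sign bit
    biased_exp = (bits >> 23) & 0xFF       # 8 exponent bits, bias 127
    frac_bits = bits & 0x7FFFFF            # 23 fraction bits
    significand = 1.0 + frac_bits / 2**23  # implicit leading 1
    value = (-1) ** sign * significand * 2.0 ** (biased_exp - 127)
    return sign, biased_exp - 127, significand, value

# Encode 6.0 as a big-endian single-precision pattern, then decode it again.
bits = int.from_bytes(struct.pack(">f", 6.0), "big")
print(f"{bits:08X}", decode_single(bits))  # 40C00000 (0, 2, 1.5, 6.0)
```

The tuple holds the sign, the unbiased exponent, the significand, and the reconstructed value, in that order.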
Table 2-1 shows some additional examples.
Table 2-1 IEEE Representations of Floating-Point Values
Floating-Point Formats and the Limits of IEEE Representation
Because floating-point numbers have a finite number of bits in the
fraction, only a finite subset of the continuum of real numbers can be
represented exactly in IEEE format. The unit of granularity of the
representable numbers is the ULP (Unit in the Last Place). ULPs
measure the distance between two numbers in terms of their
representation in binary. One ULP is the distance from one value to the
next representable value in the direction away from 0.
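As an illustration, the size of one ULP can be measured directly. The sketch below assumes Python 3.9 or later, where math.ulp and math.nextafter are available and Python floats are IEEE double-precision values. The helper ulp_single is hypothetical: it finds the next single-precision value by adding 1 to the bit pattern, which is valid only for positive, finite values below the largest finite one.

```python
import math
import struct

def ulp_single(x: float) -> float:
    """Distance from the single-precision value nearest x to the next one
    away from 0 (hypothetical helper; positive finite inputs only)."""
    bits = int.from_bytes(struct.pack(">f", x), "big")
    this_val = struct.unpack(">f", bits.to_bytes(4, "big"))[0]
    next_val = struct.unpack(">f", (bits + 1).to_bytes(4, "big"))[0]
    return next_val - this_val

# Double precision: 52 fraction bits, so one ULP at 1.0 is 2^-52.
print(math.ulp(1.0) == 2.0 ** -52)                             # True
print(math.nextafter(1.0, math.inf) - 1.0 == math.ulp(1.0))    # True
# The ULP grows with the exponent: 6.0 = 1.5 * 2^2, so its ULP is 2^(2-52).
print(math.ulp(6.0) == 2.0 ** -50)                             # True
# Single precision: 23 fraction bits, so one ULP at 1.0 is 2^-23.
print(ulp_single(1.0) == 2.0 ** -23)                           # True
```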
One ULP is about 1 part in 17 million for single-precision values, 1 part
in 10^16 for double-precision values, and 1 part in 10^34 for
quad-precision values. As a rule of thumb, single-precision values
therefore carry at most about 9 significant decimal digits,
double-precision values about 17, and quad-precision values about 36. If
you try to read or write a value with more decimal digits than this, the
extra digits will probably not contain useful information.
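One way to see where these digit counts come from is to write values out as decimal text and read them back. The following sketch assumes that Python floats are IEEE double-precision values; the helpers to_single and round_trips are hypothetical names introduced for this illustration, and the test values are random numbers between 1 and 10 rather than anything from this guide.

```python
import random
import struct

def to_single(x: float) -> float:
    """Round a double to the nearest single-precision value (sketch)."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def round_trips(x: float, digits: int, single: bool = False) -> bool:
    """True if writing x with `digits` significant decimal digits and
    reading the text back recovers exactly the same value."""
    back = float(format(x, f".{digits}g"))
    if single:
        back = to_single(back)
    return back == x

random.seed(0)  # fixed seed so the run is repeatable
doubles = [random.uniform(1.0, 10.0) for _ in range(10_000)]
singles = [to_single(x) for x in doubles]

# 17 digits are enough to recover every double, and 9 every single.
print(all(round_trips(x, 17) for x in doubles))
print(all(round_trips(s, 9, single=True) for s in singles))
# With fewer digits, some values are no longer recovered exactly.
lost = sum(not round_trips(x, 15) for x in doubles)
print(lost, "of", len(doubles), "doubles change when written with 15 digits")
```

The first two lines print True. The third count depends on the sample, so no particular figure is claimed for it.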
Hexadecimal Representation  Sign  Exponent                Fraction          Value
--------------------------  ----  ----------------------  ----------------  --------------------------------
SP: 40C0 0000               +     129 – 127 = 2           1.0 + 0.5 = 1.5   +1.5 * 2^2 = 6.0
DP: 4018 0000 0000 0000           1025 – 1023 = 2
QP: 4001 8000 0000 0000           16385 – 16383 = 2
    0000 0000 0000 0000

SP: BF00 0000               –     126 – 127 = –1          1.0 + 0.0 = 1.0   –1.0 * 2^–1 = –0.5
DP: BFE0 0000 0000 0000           1022 – 1023 = –1
QP: BFFE 0000 0000 0000           16382 – 16383 = –1
    0000 0000 0000 0000

SP: 7F00 0001               +     254 – 127 = 127         1.0 + 2^–23       +1.00000011921 * 2^127
DP: 7FE0 0000 0000 0001           2046 – 1023 = 1023      1.0 + 2^–52       +1.000…001 (51 zeros) * 2^1023
QP: 7FFE 0000 0000 0000           32766 – 16383 = 16383   1.0 + 2^–112      +1.000…001 (111 zeros) * 2^16383
    0000 0000 0000 0001
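The entries in Table 2-1 can be checked mechanically. The sketch below is one possible decoder, not part of any HP-UX library: the FORMATS table and the decode function are hypothetical names, the field widths are those implied by the biases and fraction sizes in the table (8/23, 11/52, and 15/112 exponent/fraction bits), and only normalized values are handled. Exact rational arithmetic is used so that the quad-precision rows can be checked without quad-precision hardware.

```python
from fractions import Fraction

# (exponent bits, fraction bits) per format; the bias is 2^(exp_bits-1) - 1.
FORMATS = {"SP": (8, 23), "DP": (11, 52), "QP": (15, 112)}

def decode(fmt: str, hex_digits: str) -> Fraction:
    """Exact value of a normalized IEEE number written as hex digits
    (hypothetical helper; zeros, denormals, infinities, NaNs not handled)."""
    exp_bits, frac_bits = FORMATS[fmt]
    bits = int(hex_digits.replace(" ", ""), 16)
    sign = bits >> (exp_bits + frac_bits)
    biased = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    significand = 1 + Fraction(fraction, 1 << frac_bits)  # implicit leading 1
    return (-1) ** sign * significand * Fraction(2) ** (biased - bias)

print(decode("SP", "40C0 0000"))                                # 6
print(decode("DP", "BFE0 0000 0000 0000"))                      # -1/2
print(decode("QP", "4001 8000 0000 0000 0000 0000 0000 0000"))  # 6
# Third row, single precision: (1 + 2^-23) * 2^127
print(decode("SP", "7F00 0001") == (1 + Fraction(1, 2**23)) * 2**127)  # True
```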