HP-UX Floating-Point Guide
Chapter 1
Introduction
Overview of Floating-Point Principles
In the context of computer programming, the term floating-point refers
to the ways in which modern computer systems represent real numbers
and perform real arithmetic. Computers use special representations for
floating-point numbers, and they follow rules for floating-point
arithmetic that differ from the rules for integer arithmetic. Usually,
a computer also has special hardware that performs floating-point
calculations at a higher speed than would be possible using the
computer’s integer-oriented hardware.
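To make the difference in rules concrete, here is a minimal C sketch
contrasting integer arithmetic with floating-point arithmetic. The
printed values assume IEEE 754 default behavior, which is typical of
modern systems:

    /* Integer and floating-point arithmetic follow different rules. */
    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        /* Integer division discards the fractional part. */
        printf("7 / 2 as int:    %d\n", 7 / 2);       /* prints 3 */
        printf("7 / 2 as double: %g\n", 7.0 / 2.0);   /* prints 3.5 */

        /* Integers have no notion of infinity; floating-point
           overflow produces the special value "inf" under IEEE 754
           default rounding. */
        printf("DBL_MAX * 2.0:   %g\n", DBL_MAX * 2.0);
        return 0;
    }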
In all modern computer systems, representations of real numbers are
inherently inexact. There are infinitely many real numbers, but a
digital computer can represent only a finite subset of them.
When you write a program that attempts to generate an unrepresentable
value, the computer approximates that value by choosing a representable
value close to the one you intended. Data entered into a computer in
floating-point format is therefore almost always approximate, and the
calculations performed by the computer are usually approximations of
the intended mathematical operations. Consequently, the results you
receive from a mathematical computation are usually approximations as
well.
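For example, the decimal constant 0.1 has no exact binary
floating-point representation, so the compiler stores the nearest
representable double instead. The following minimal C sketch makes the
stored approximation visible (the exact digits printed may vary
slightly between systems):

    #include <stdio.h>

    int main(void)
    {
        double x = 0.1;       /* nearest representable value is stored */

        /* Requesting 17 significant digits exposes the approximation. */
        printf("%.17g\n", x); /* typically prints 0.10000000000000001 */
        return 0;
    }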
The approximate nature of floating-point arithmetic has several
important ramifications:
• Results from floating-point calculations are almost never exactly
equal to the corresponding mathematical value (see the sketch after
this list).
• Results from a particular calculation may vary slightly from one
computer system to another, and all may be valid. However, when the
computer systems conform to the same floating-point standard, such as
IEEE 754, the amount of variation is drastically reduced.
• Incorrect results are not necessarily caused by programming errors in
the traditional sense. Correcting the problems may require an
understanding of the floating-point approximation techniques used
by the computer system executing the program.
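The following C sketch illustrates the first point above. Because 0.1,
0.2, and 0.3 are all approximated, the computed sum of 0.1 and 0.2
does not compare exactly equal to 0.3; programs therefore usually
compare floating-point results against a tolerance rather than with
the == operator. The tolerance of 1e-9 here is an arbitrary choice for
illustration; an appropriate tolerance depends on the computation:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double sum = 0.1 + 0.2;

        /* Exact comparison fails: both sides are approximations. */
        printf("0.1 + 0.2 == 0.3?  %s\n", sum == 0.3 ? "yes" : "no");

        /* Comparing within a tolerance succeeds. */
        printf("within 1e-9 of 0.3? %s\n",
               fabs(sum - 0.3) < 1e-9 ? "yes" : "no");
        return 0;
    }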
The types of incorrect results and unexpected errors that
floating-point applications sometimes generate can be very difficult to
interpret if you do not understand how your computer performs
floating-point arithmetic.