HP-UX Floating-Point Guide

Chapter 2

Floating-Point Principles and the IEEE Standard for Binary Floating-Point Arithmetic

Floating-Point Operations

Infinity
To the comparison operators, infinity is just another signed numeric value whose magnitude is greater than the largest normalized magnitude. Infinities with the same sign compare as equal to each other.

NaN
A NaN compares as unequal to all other operands, including other NaNs and itself. The rules above are used to evaluate assertions involving NaNs as TRUE or FALSE. If the assertion is non-aware, an invalid operation exception is also signaled for any comparison involving a <, <=, >, or >= assertion.
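The comparison rules for infinity and NaN can be observed with a short sketch. It is written in Python, whose float type is assumed here to be an IEEE double; the helper name compare_all is hypothetical. Python does not expose the invalid operation flag, so the sketch shows only the TRUE or FALSE result of each assertion, not whether an exception is signaled.

```python
import math
import sys

def compare_all(a, b):
    """Return the truth value of each comparison assertion for a and b."""
    return {
        "==": a == b,
        "!=": a != b,
        "<": a < b,
        "<=": a <= b,
        ">": a > b,
        ">=": a >= b,
    }

largest = sys.float_info.max  # largest finite double

# Infinity is ordered beyond the largest finite magnitude, on either side.
inf_vs_largest = compare_all(math.inf, largest)
neg_inf_vs_largest = compare_all(-math.inf, -largest)

# Infinities with the same sign compare as equal to each other.
inf_vs_inf = compare_all(math.inf, math.inf)

# A NaN is unequal to everything, including itself, so only != is TRUE.
nan_vs_nan = compare_all(math.nan, math.nan)
nan_vs_one = compare_all(math.nan, 1.0)

for name, result in [
    ("inf vs largest", inf_vs_largest),
    ("inf vs inf", inf_vs_inf),
    ("NaN vs NaN", nan_vs_nan),
    ("NaN vs 1.0", nan_vs_one),
]:
    print(name, result)
```

Because a NaN is the only value that is unequal to itself, the test x != x is a common way to detect a NaN when no library routine is available.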

Conversion Between Operand Formats

The standard requires that it be possible to convert between decimal and binary floating-point, and between binary floating-point and integer formats. This section describes some of the properties of various conversions. The operand type integer refers to either signed or unsigned integers.

Single-Precision to Double-Precision or Quad-Precision
These conversions can never overflow, underflow, or be inexact. The only possible type of exception is an invalid operation if the operand is an SNaN.

Double-Precision to Quad-Precision
These conversions can never overflow, underflow, or be inexact. The only possible type of exception is an invalid operation if the operand is an SNaN.

Quad-Precision or Double-Precision to Single-Precision
These conversions can overflow or underflow and are usually inexact.

Quad-Precision to Double-Precision
These conversions can overflow or underflow and are usually inexact.
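The difference between widening and narrowing conversions can be illustrated with a minimal sketch. It uses Python's standard struct module as a stand-in for hardware conversions and covers only single and double precision, since Python has no quad-precision type. The helper names are hypothetical, and the sketch assumes that Python floats are IEEE doubles. Note that struct reports overflow by raising OverflowError rather than returning infinity, and that SNaN operands are not covered.

```python
import struct

def double_to_single_bits(x):
    """Round a double to single precision and return the 32-bit pattern.

    struct raises OverflowError if a finite x is too large for a single.
    """
    return struct.unpack("<I", struct.pack("<f", x))[0]

def single_bits_to_double(bits):
    """Widen the single with the given 32-bit pattern to a double."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def narrow(x):
    """Double -> single -> double, so the rounded value can be compared with x."""
    return single_bits_to_double(double_to_single_bits(x))

# Widening: each single below has an exact double representation, so
# converting back to single recovers the original bit pattern.
patterns = [
    0x00000001,  # smallest positive denormalized single
    0x00800000,  # smallest positive normalized single
    0x3DCCCCCD,  # single nearest to 0.1
    0x7F7FFFFF,  # largest finite single
    0x7F800000,  # +infinity
]
widening_exact = all(
    double_to_single_bits(single_bits_to_double(p)) == p for p in patterns
)

# Narrowing: usually inexact, and may underflow or overflow.
inexact = narrow(0.1) != 0.1    # the double 0.1 does not fit in 24 significant bits
exact = narrow(0.5) == 0.5      # a power of two within range is unchanged
underflowed = narrow(1.0e-50)   # far below the smallest denormalized single
try:
    narrow(1.0e39)              # above the largest finite single
    overflowed = False
except OverflowError:
    overflowed = True

print(widening_exact, inexact, exact, underflowed, overflowed)
```

The round trip through the bit pattern is what demonstrates exactness: if widening lost or altered any information, narrowing the result again could not reproduce the original 32 bits.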