Technical data

Cray Standard C/C++ Reference Manual

10.1.2.6.1 Cray Floating-point Representation

Types float and double represent Cray single-precision (64-bit) floating-point values; long double represents Cray double-precision (128-bit) floating-point values.
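The storage widths can be checked on whatever compiler is at hand. The following is a minimal sketch; the macro `BITS_OF` and the function `report_fp_widths` are hypothetical names introduced for illustration. It reports storage size, which can include padding on some platforms, so it may overstate the number of bits actually used.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Storage width of a type in bits, as seen by the compiler building this file. */
#define BITS_OF(type) ((int)(sizeof(type) * CHAR_BIT))

/* Print the storage width of each floating-point type.
   Returns 1 if float and double have the same width, as the text
   says they do on Cray systems, and 0 otherwise. */
int report_fp_widths(void)
{
    /* Sanity check that holds on mainstream implementations. */
    assert(BITS_OF(long double) >= BITS_OF(double));

    printf("float:       %d bits\n", BITS_OF(float));
    printf("double:      %d bits\n", BITS_OF(double));
    printf("long double: %d bits\n", BITS_OF(long double));

    return BITS_OF(float) == BITS_OF(double);
}
```

On the systems described here the three widths would be expected to read 64, 64, and 128; most other platforms report a narrower float.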

When an integral value is converted to a floating-point type that cannot represent it exactly, the result is truncated toward 0. A floating-point value that is converted to a narrower floating-point type is also truncated toward 0.
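Truncation toward 0 can be illustrated with integer arithmetic alone. The sketch below is not the conversion routine used by the compiler; it only models the effect of keeping a fixed number of significant bits. The function name `truncate_toward_zero` is hypothetical, and the 48-bit width used in the usage note is an assumption about the single-precision coefficient that this section does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the top `bits` significant bits of `value` and discard the rest.
   The discarded part is removed from the magnitude, so the result moves
   toward 0 for both positive and negative inputs. */
int64_t truncate_toward_zero(int64_t value, int bits)
{
    assert(bits > 0 && bits < 64);

    /* Magnitude computed in unsigned arithmetic so INT64_MIN is safe. */
    uint64_t mag = value < 0 ? 0 - (uint64_t)value : (uint64_t)value;

    /* Count the significant bits in the magnitude. */
    int width = 0;
    for (uint64_t m = mag; m != 0; m >>= 1)
        width++;

    /* Low-order bits that do not fit in `bits` significant bits. */
    uint64_t rem = 0;
    if (width > bits)
        rem = mag & ((UINT64_C(1) << (width - bits)) - 1);

    /* Remove the remainder from the magnitude, preserving the sign. */
    return value < 0 ? value + (int64_t)rem : value - (int64_t)rem;
}
```

With `bits` set to 48, the value 2^48 + 1 becomes 2^48 and -(2^48 + 1) becomes -2^48, while any value that already fits in 48 significant bits is returned unchanged.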

Floating-point arithmetic depends on implementation-defined ranges for each data type. The minimum and maximum values of these ranges are defined by macros in the standard header file float.h. Every floating-point operation on operands within the defined range yields a result that is also in this range, provided the true mathematical result is in the range. The results are accurate to within the ability of the hardware to represent the true value.
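The range macros can be read directly from float.h. The sketch below uses only standard macros (`FLT_MIN`, `FLT_MAX`, `DBL_MIN`, `DBL_MAX`, `LDBL_MIN`, `LDBL_MAX`); the function name `print_fp_limits` is hypothetical. The values printed depend on the platform that compiles it.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Print the smallest normalized positive value and the largest finite
   value of each floating-point type, as defined by float.h. */
void print_fp_limits(void)
{
    /* The C standard requires every implementation to reach at least 1e37. */
    assert(FLT_MAX >= 1e37 && DBL_MAX >= 1e37 && LDBL_MAX >= 1e37);

    /* float arguments are promoted to double when passed to printf. */
    printf("float:       min %e  max %e\n", FLT_MIN, FLT_MAX);
    printf("double:      min %e  max %e\n", DBL_MIN, DBL_MAX);
    printf("long double: min %Le  max %Le\n", LDBL_MIN, LDBL_MAX);
}
```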

The maximum positive value for types float, double, and long double is approximately as follows:

$2.7 \times 10^{2456}$

Several math functions return this upper limit if the true value equals or exceeds it.

The minimum positive value for types float, double, and long double is approximately as follows:

$3.67 \times 10^{-2466}$

These numbers define a range that is slightly smaller than the range of values that can be represented on a UNICOS or UNICOS/mk system, but use of numbers outside this range may not yield predictable results. For exact values, use the values defined in the header file float.h.
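Because values outside the documented range may behave unpredictably, a program can test a result against the float.h limits before relying on it. The following is a minimal sketch for type double; the helper names `in_documented_range` and `checked_product` are hypothetical, and aborting through `assert` is only one possible policy, suited to debugging builds.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if x is zero or its magnitude lies in [DBL_MIN, DBL_MAX],
   and 0 otherwise. A NaN fails every comparison and so returns 0. */
int in_documented_range(double x)
{
    double mag = x < 0 ? -x : x;
    if (mag == 0.0)
        return 1;
    return mag >= DBL_MIN && mag <= DBL_MAX;
}

/* Multiply two values and stop the program in a debugging build
   if the product falls outside the documented range. */
double checked_product(double a, double b)
{
    double r = a * b;
    assert(in_documented_range(r));
    return r;
}
```

For example, `checked_product(2.0, 3.0)` returns 6.0, whereas a product that overflows past `DBL_MAX` trips the assertion.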

A floating-point value, when rounded off, can be accurately represented to approximately 14 decimal places for types float and double, and to approximately 28 decimal places for type long double, as determined by the following equation:

(10.1)
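The 14- and 28-place figures can be checked with a common rule of thumb: a coefficient of n bits carries about n × log10(2) decimal digits. The sketch below assumes this rule and assumes coefficient widths of 48 bits for single precision and 96 bits for double precision; neither the rule nor the widths are stated in this section, and the function name `decimal_digits` is hypothetical.

```c
#include <assert.h>

/* log10(2), written as a constant so that no math library is needed. */
#define LOG10_2 0.30102999566398120

/* Whole decimal digits that `bits` binary digits can carry:
   floor(bits * log10(2)). */
int decimal_digits(int bits)
{
    assert(bits >= 0);
    /* The cast truncates, which equals floor for non-negative values. */
    return (int)(bits * LOG10_2);
}
```

Under these assumptions, `decimal_digits(48)` gives 14 and `decimal_digits(96)` gives 28, which agrees with the figures quoted above.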

146 S–2179–36