Technical data

Implementation-defined Behavior [10]
Digits beyond these precisions may not be accurate. It is safest to assume only
14 or 28 decimal places of accuracy.
Epsilon, the difference between 1.0 and the smallest value greater than 1.0 that is
representable in the given floating-point type, is approximately 7.1 × 10^-15 for
types float and double, and approximately 2.5 × 10^-29 for type long double.
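The epsilon values for a given compiler can be read from float.h instead of being
assumed. The following is a minimal sketch, not taken from this manual: the
function names measured_double_epsilon and print_epsilons are hypothetical, and
the halving loop assumes that adding a sufficiently small value to 1.0 rounds
back to 1.0, which holds under IEEE round-to-nearest but may give a slightly
different result on hardware that rounds differently.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: finds epsilon for double by repeated halving.
   Returns the smallest power of two e for which 1.0 + e still
   compares greater than 1.0. */
double measured_double_epsilon(void)
{
    volatile double sum;   /* volatile forces the sum to be stored as a double */
    double eps = 1.0;

    for (;;) {
        sum = 1.0 + eps / 2.0;
        if (sum == 1.0)
            break;         /* half of eps no longer changes 1.0 */
        eps /= 2.0;
    }
    return eps;
}

/* Hypothetical helper: prints the epsilon macros defined by float.h. */
void print_epsilons(void)
{
    /* A wider type is never less precise than a narrower one. */
    assert(FLT_EPSILON >= DBL_EPSILON && DBL_EPSILON >= LDBL_EPSILON);

    printf("FLT_EPSILON  = %g\n", (double)FLT_EPSILON);
    printf("DBL_EPSILON  = %g\n", DBL_EPSILON);
    printf("LDBL_EPSILON = %Lg\n", LDBL_EPSILON);
}
```

Calling print_epsilons() from main shows the values for the current target. On
a system where float and double share a representation, the first two lines
print the same value.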
10.1.2.6.2 IEEE Floating-point Representation
On UNICOS/mk systems, float represents IEEE single-precision (32-bit)
floating-point values; double and long double represent double-precision
(64-bit) floating-point values. IEEE extended double precision (128-bit) is not
available on UNICOS/mk systems.
On UNICOS systems with IEEE floating-point hardware, float and
double represent IEEE double-precision (64-bit) floating-point values. Type
long double represents IEEE extended double-precision (128-bit) floating-point
values. IEEE single-precision (32-bit) is not available on UNICOS systems.
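Because the mapping from C types to IEEE formats differs between systems,
portable code can check the layout rather than assume it. The sketch below is
illustrative only: BITS_OF, double_looks_ieee_binary64, and print_float_layout
are hypothetical names, and the test for IEEE double precision (radix 2, 53
significand digits, 64 bits of storage) is a heuristic, not a guarantee of full
IEEE conformance.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Storage width of a type in bits, from its size in bytes. */
#define BITS_OF(type) ((int)(sizeof(type) * CHAR_BIT))

/* Hypothetical helper: returns 1 if double has the shape of IEEE
   double precision, otherwise 0. */
int double_looks_ieee_binary64(void)
{
    return FLT_RADIX == 2 && DBL_MANT_DIG == 53 && BITS_OF(double) == 64;
}

/* Hypothetical helper: prints the width and significand size of each
   floating-point type. */
void print_float_layout(void)
{
    /* double carries at least as many significand digits as float. */
    assert(FLT_MANT_DIG <= DBL_MANT_DIG);

    printf("float:       %3d bits, %3d significand digits\n",
           BITS_OF(float), FLT_MANT_DIG);
    printf("double:      %3d bits, %3d significand digits\n",
           BITS_OF(double), DBL_MANT_DIG);
    printf("long double: %3d bits, %3d significand digits\n",
           BITS_OF(long double), LDBL_MANT_DIG);
    printf("radix:       %d\n", FLT_RADIX);
}
```

On a system where float and double are both 64 bits wide, the first two lines
of output would be expected to match.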
An integral number that is converted to a floating-point number that cannot
exactly represent the original value is rounded according to the current rounding
mode. A floating-point number that is converted to a floating-point number with
fewer significant digits also is rounded according to the current rounding mode
on UNICOS/mk systems; on UNICOS systems, the number is rounded to the closest
value, but not in an IEEE round-to-nearest fashion.
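The effect of rounding on conversion can be observed directly. The sketch below
is an illustration under stated assumptions, not code from this manual: the
function names int_to_float_error and narrowing_is_inexact are hypothetical, and
the comments about specific values assume a 32-bit IEEE float with the default
round-to-nearest mode. On a system where float is 64 bits wide, the same inputs
may convert exactly.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: converts n to float and back, and returns the
   error introduced by the integer-to-float conversion. */
long int_to_float_error(long n)
{
    volatile float f;

    /* Keep n small enough that converting f back to long is defined. */
    assert(n > -1000000000L && n < 1000000000L);

    f = (float)n;          /* rounded if n needs more bits than float holds */
    return (long)f - n;
}

/* Hypothetical helper: returns 1 if narrowing d to float changes its
   value, otherwise 0. */
int narrowing_is_inexact(double d)
{
    volatile float f = (float)d;

    return (double)f != d;
}

/* Prints the conversion error for two adjacent integers. With a 24-bit
   significand, 16777216 (2 to the power 24) is exact but 16777217 is not. */
void print_conversion_errors(void)
{
    printf("error for 16777216: %ld\n", int_to_float_error(16777216L));
    printf("error for 16777217: %ld\n", int_to_float_error(16777217L));
    printf("0.1 narrows inexactly: %d\n", narrowing_is_inexact(0.1));
}
```

A nonzero error from int_to_float_error indicates that the conversion had to
round; the sign and size of the error depend on the rounding mode in effect.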
Floating-point arithmetic depends on implementation-defined ranges for types of
data. The values of the minimums and maximums for these ranges are defined
by macros in the standard header file, float.h. All floating-point operations on
operands that are within the defined range yield results that are also in this range
if the true mathematical result is in the range. The results are accurate to within
the ability of the hardware to represent the true value.
The maximum positive values are approximately as follows:
    3.4 × 10^38      Single (32 bits)
    1.8 × 10^308     Double (64 bits)
    1.2 × 10^4932    Extended double (128 bits)
The minimum positive values are approximately as follows:
    1.2 × 10^-38     Single (32 bits)
    2.2 × 10^-308    Double (64 bits)
    3.4 × 10^-4932   Extended double (128 bits)
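The exact limits for a given target are available from the float.h macros, so
a program need not hard-code the approximate values listed above. The following
is a minimal sketch; the function names in_double_range and print_float_ranges
are hypothetical, and the range check covers only normalized positive values.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if x lies between the minimum
   normalized positive double and the maximum double, otherwise 0. */
int in_double_range(double x)
{
    return x >= DBL_MIN && x <= DBL_MAX;
}

/* Hypothetical helper: prints the minimum and maximum positive values
   of each floating-point type as defined by float.h. */
void print_float_ranges(void)
{
    /* A wider type can represent every maximum of a narrower one. */
    assert(FLT_MAX <= DBL_MAX && DBL_MAX <= LDBL_MAX);

    printf("float:       min %g, max %g\n", (double)FLT_MIN, (double)FLT_MAX);
    printf("double:      min %g, max %g\n", DBL_MIN, DBL_MAX);
    printf("long double: min %Lg, max %Lg\n", LDBL_MIN, LDBL_MAX);
}
```

Calling print_float_ranges() from main reports the limits for the compiler and
system in use, which shows which of the listed formats each type maps to.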