Cray XMT™ Programming Environment User’s Guide

Subnormal numbers are less precise than normalized numbers. The smallest

subnormal number, min_denorm, has only one significant bit while the largest has

52 significant bits. However, whenever 0.5 <= x/y <= 2.0, the difference

x - y is exact, even though it may have less precision than x and y. This is not true for

machines that flush underflow to zero.
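
The exactness property can be checked with a short C sketch. It assumes IEEE 754 double precision with gradual underflow in effect (no flush-to-zero mode); the helper names min_denorm and exact_difference are illustrative and are not part of any Cray library.

```c
#include <float.h>

/* Smallest positive subnormal double, 2^-1074 (called min_denorm in the text).
   DBL_MIN is 2^-1022 and DBL_EPSILON is 2^-52, so the product is exact
   when underflow is gradual. */
double min_denorm(void)
{
    return DBL_MIN * DBL_EPSILON;
}

/* Returns x - y. When 0.5 <= x/y <= 2.0 and underflow is gradual, the
   result is exact even if it lands in the subnormal range. */
double exact_difference(double x, double y)
{
    return x - y;
}
```

For example, with x = 1.5 * DBL_MIN and y = DBL_MIN, the difference is the subnormal value 0.5 * DBL_MIN, and adding it back to y recovers x exactly. On a machine that flushes underflow to zero, the same difference would be 0 even though x and y are not equal.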

The Cray XMT floating-point hardware handles gradual underflow transparently.

Unlike many systems, the Cray XMT is not slowed by the presence (or possibility) of

subnormal numbers and gradual underflow in a computation.

3.4.1 Differences from IEEE Floating-point Arithmetic

The Cray XMT processors do not have 32-bit floating-point instructions. If you

are performing an operation on 32-bit floating-point numbers, you must first use

the MTA_FLOAT_REAL intrinsic function to convert each 32-bit number in the

operation to a 64-bit number. After the operation is complete, you can use the

MTA_REAL_FLOAT intrinsic function to round the results to 32-bit numbers. This

double rounding (first to 64 bits and then to 32 bits) is not the same as a single

rounding to 32 bits. For more information about how to use MTA_FLOAT_REAL and

MTA_REAL_FLOAT, see the mta_intrinsics(3) man page.
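
The widen, operate, and round pattern described above can be sketched in portable C. This sketch uses ordinary C casts as stand-ins for the two conversions; on the Cray XMT the text indicates that MTA_FLOAT_REAL and MTA_REAL_FLOAT perform them, and their exact signatures are not reproduced here (see the man page). The function name add_via_double is hypothetical.

```c
/* Adds two 32-bit values by way of 64-bit arithmetic.
   The casts below are placeholders for the XMT intrinsics named in the
   text; this is an assumption made for portability, not the Cray API. */
float add_via_double(float a, float b)
{
    double wide_a = (double)a;      /* widen: 32-bit to 64-bit (exact)   */
    double wide_b = (double)b;      /* widen: 32-bit to 64-bit (exact)   */
    double wide_sum = wide_a + wide_b;  /* operation rounds to 64 bits   */
    return (float)wide_sum;         /* second rounding: 64-bit to 32-bit */
}
```

The result passes through two roundings: one when the 64-bit operation completes and one when the value is narrowed back to 32 bits.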

The Cray XMT does not provide you with control over rounding precision for

floating-point operations. The level of rounding precision is set on the processor

during the manufacturing process.

Traps on the Cray XMT are precise, but operands can be overwritten by the results of

an operation performed on the same or a different functional unit. This can make the

implementation of post-substitution difficult.

There is no exponent wrapping when an operation enables or takes an overflow or

underflow trap. The intent of wrapping is to provide for automatic rescaling when

products or quotients are used in subsequent operations. On the Cray XMT, you

must use care when rescaling.
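
One common way to rescale by hand, shown here as a minimal sketch rather than a Cray-recommended method, is to carry the exponent of a running product in a separate integer so that the floating-point part never leaves a safe range. The sketch uses the standard C functions frexp and ldexp; the type scaled_t and the function scaled_product are hypothetical names introduced for illustration.

```c
#include <math.h>
#include <stddef.h>

/* A value represented as mant * 2^exp, with 0.5 <= |mant| < 1 (or mant == 0). */
typedef struct {
    double mant;
    long exp;
} scaled_t;

/* Multiplies n finite values without intermediate overflow or underflow
   by splitting every factor into a fraction and a power of two. */
scaled_t scaled_product(const double *v, size_t n)
{
    scaled_t r = { 0.5, 1 };        /* represents 1.0 as 0.5 * 2^1 */
    for (size_t i = 0; i < n; i++) {
        int e_factor, e_norm;
        double m = frexp(v[i], &e_factor);      /* v[i] = m * 2^e_factor */
        r.mant = frexp(r.mant * m, &e_norm);    /* renormalize fraction  */
        r.exp += (long)e_factor + e_norm;
    }
    return r;
}
```

If the final exponent is within range, ldexp(r.mant, (int)r.exp) converts the result back to an ordinary double. A product such as 1e200 * 1e200 * 1e-200 * 1e-200, which overflows when evaluated left to right, comes out close to 1.0 with this approach.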

The hardware supports fused multiply-add operations that require only a single

instruction issue. This operation facilitates certain computations by making it easy to

extract the lower half of the product of two 64-bit doubles. However, the

compiler can evaluate statements such as the following in several different ways, each

of which may produce a different result:

x = a*b + c*d;

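The previous statement can be evaluated as either of the following sequences.

With a*b rounded first:

temp = a*b;

x = temp + c*d; // Multiply-add operation: c*d is not rounded before the add

With c*d rounded first:

temp = c*d;

x = a*b + temp; // Multiply-add operation: a*b is not rounded before the add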

S–2479–20