User manual
MPLAB® XC8 C Compiler User’s Guide
DS52053B-page 146 2012 Microchip Technology Inc.
5.4.3 Floating-Point Data Types
The MPLAB XC8 compiler supports 24- and 32-bit floating-point types. Floating point
is implemented using either the IEEE 754 32-bit format or a modified (truncated) 24-bit
form of it. Table 5-3 shows the data types and their corresponding size and arithmetic
type.
For both float and double values, the 24-bit format is the default. The options
--FLOAT=24 and --DOUBLE=24 can also be used to specify this explicitly. The 32-bit
format is used for double values if the --DOUBLE=32 option is used and for float
values if --FLOAT=32 is used.
Variables may be declared using the float and double keywords to hold values of
these types. Floating-point types are always signed, and the unsigned keyword is
illegal when specifying a floating-point type. Types declared as long double use
the same format as types declared as double. All floating-point values are
represented in little-endian format, with the least significant byte at the lower address.
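As an illustration of the little-endian layout described above, the following sketch writes a 32-bit floating-point bit pattern into a byte array with the least significant byte at the lowest address. The helper name store_le32 is hypothetical and is not part of the compiler's library. The pattern 0x3F800000 mentioned below is the IEEE 754 32-bit encoding of 1.0.

```c
#include <stdint.h>

/* Hypothetical helper (illustration only): store a 32-bit floating-point
 * bit pattern in little-endian order, i.e. with the least significant
 * byte at the lowest address. */
void store_le32(uint32_t bits, uint8_t out[4])
{
    out[0] = (uint8_t)(bits & 0xFFu);          /* lowest address: least significant byte */
    out[1] = (uint8_t)((bits >> 8) & 0xFFu);
    out[2] = (uint8_t)((bits >> 16) & 0xFFu);
    out[3] = (uint8_t)((bits >> 24) & 0xFFu);  /* highest address: sign and upper exponent bits */
}
```

For the pattern 0x3F800000, the bytes at increasing addresses would be 0x00, 0x00, 0x80, 0x3F.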
These formats are described in Table 5-4, where:
• Sign is the sign bit, which indicates whether the number is positive or negative.
• Exponent is the 8-bit exponent, which is stored as excess 127 (i.e., an exponent
of 0 is stored as 127).
• Mantissa is the mantissa, which is to the right of the radix point. There is an
implied bit to the left of the radix point, which is always 1 except for a zero value,
where the implied bit is zero. A zero value is indicated by a zero exponent.
The value of this number is $(-1)^{\text{sign}} \times 2^{(\text{exponent} - 127)} \times 1.\text{mantissa}$.
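The formula above can be sketched in C for the 32-bit format as follows. This is only an illustrative decoder, not the compiler's own implementation: the function name decode_float32 is hypothetical, and the sketch covers just the cases described in this section (a zero exponent meaning zero, and otherwise an implied leading 1). It makes no attempt to treat special encodings such as infinities or NaNs.

```c
#include <stdint.h>

/* Hypothetical decoder (illustration only) for the 32-bit format:
 * 1 sign bit, 8-bit excess-127 exponent, 23-bit mantissa. */
double decode_float32(uint32_t bits)
{
    uint32_t sign     = bits >> 31;            /* 1 bit  */
    uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8 bits, biased by 127 */
    uint32_t mantissa = bits & 0x7FFFFFu;      /* 23 bits right of the radix point */
    double value;
    int e;

    /* A zero exponent indicates a zero value (the implied bit is 0). */
    if (exponent == 0u) {
        return 0.0;
    }

    /* 1.mantissa: implied leading 1 plus mantissa / 2^23. */
    value = 1.0 + (double)mantissa / 8388608.0;

    /* Scale by 2^(exponent - 127) using exact multiplications by 2. */
    e = (int)exponent - 127;
    while (e > 0) { value *= 2.0; e--; }
    while (e < 0) { value /= 2.0; e++; }

    /* Apply (-1)^sign. */
    return sign ? -value : value;
}
```

For example, the pattern 0xC0200000 has sign 1, a stored exponent of 128 and a mantissa fraction of 0.25, so it decodes as -1 x 2^1 x 1.25 = -2.5.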
TABLE 5-3: FLOATING-POINT DATA TYPES
Type          Size (bits)      Arithmetic Type
float         24 or 32         Real
double        24 or 32         Real
long double   same as double   Real
TABLE 5-4: FLOATING-POINT FORMATS
Format                     Sign   Biased exponent   Mantissa
IEEE 754 32-bit            x      xxxx xxxx         xxx xxxx xxxx xxxx xxxx xxxx
modified IEEE 754 24-bit   x      xxxx xxxx         xxx xxxx xxxx xxxx
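Table 5-4 shows that the 24-bit format keeps the sign and the 8-bit exponent but only 15 of the 23 mantissa bits. The relationship between the two bit patterns can be sketched as below. The function names are hypothetical, and the sketch assumes plain truncation (the low 8 mantissa bits are simply dropped); this section does not say whether the compiler's own conversions round.

```c
#include <stdint.h>

/* Hypothetical helpers (illustration only). Assumption: the 24-bit form is
 * the upper 24 bits of the 32-bit pattern, with no rounding applied. */

/* Keep sign (1 bit), exponent (8 bits) and the top 15 mantissa bits. */
uint32_t truncate_to_24(uint32_t bits32)
{
    return bits32 >> 8;
}

/* Rebuild a 32-bit pattern from a 24-bit one; the dropped mantissa bits
 * come back as zeros. */
uint32_t expand_to_32(uint32_t bits24)
{
    return (bits24 & 0xFFFFFFu) << 8;
}
```

Under this assumption, 1.0 (0x3F800000) becomes 0x3F8000 in the 24-bit form and converts back without loss, while a value that uses the low mantissa bits, such as 0x40490FDB, returns as 0x40490F00.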