PIC32MZ Embedded Connectivity with Floating Point Unit (EF) Family

DS60001320B-page 48 Preliminary © 2015 Microchip Technology Inc.

3.1.4 FLOATING POINT UNIT (FPU)

The Floating Point Unit (FPU), Coprocessor 1 (CP1), implements the MIPS Instruction Set Architecture for floating point computation. The implementation supports the ANSI/IEEE Standard 754 (IEEE Standard for Binary Floating-Point Arithmetic) for single- and double-precision data formats. The FPU can be programmed to have thirty-two 32-bit or 64-bit floating point registers used for floating point operations.
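As a host-side illustration of the two data formats named above, the following sketch splits a value into its IEEE 754 sign, biased exponent, and fraction fields (1/8/23 bits for single precision, 1/11/52 bits for double precision). The helper names `fields_single` and `fields_double` are hypothetical; the code runs on a PC and does not access the device.

```python
import struct

def fields_single(x):
    """Return (sign, biased_exponent, fraction) of x encoded as an IEEE 754 single."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fields_double(x):
    """Return (sign, biased_exponent, fraction) of x encoded as an IEEE 754 double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

# 1.0 has a zero fraction and an exponent equal to the format bias.
print(fields_single(1.0))   # (0, 127, 0)
print(fields_double(1.0))   # (0, 1023, 0)
```

A single-precision value occupies one 32-bit register, while a double-precision value needs 64 bits of register storage.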

The performance is optimized for single-precision formats. Most instructions have a throughput of one FPU cycle and a latency of four FPU cycles. The FPU implements the multiply-add (MADD) and multiply-sub (MSUB) instructions with intermediate rounding after the multiply function. The result is guaranteed to be the same as executing a MUL and an ADD instruction separately, but the instruction latency, instruction fetch, dispatch bandwidth, and the total number of register accesses are improved.
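The effect of intermediate rounding can be sketched on a host PC. The model below is an assumption-laden illustration, not device code: `f32`, `madd_s`, and `single_rounding_reference` are hypothetical helper names, round-to-nearest is assumed, and single-precision rounding is emulated with Python's `struct` module. The single-rounding function is included only for contrast, and it is exact for the inputs shown rather than a general fused multiply-add model.

```python
import struct

def f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def madd_s(a, b, c):
    """Model of a multiply-add with intermediate rounding: round(round(a*b) + c)."""
    product = f32(a * b)       # rounding after the multiply
    return f32(product + c)    # rounding after the add

def single_rounding_reference(a, b, c):
    """Round a*b + c once. Exact for the inputs below; for contrast only."""
    return f32(a * b + c)

# a*b = 1 - 2**-46, which rounds to 1.0 in single precision.
a, b, c = 1.0 + 2.0**-23, 1.0 - 2.0**-23, -1.0

print(madd_s(a, b, c))                         # 0.0
print(madd_s(a, b, c) == f32(f32(a * b) + c))  # True: same as MUL followed by ADD
print(single_rounding_reference(a, b, c) == -2.0**-46)  # True
```

In this model the multiply-add result matches a separate multiply and add bit for bit, which is the property the paragraph above describes.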

Hardware supports IEEE denormalized input operands and results for some instructions only. In general, IEEE denormalized results are not supported by hardware, but a fast flush-to-zero mode is provided to optimize performance. The fast flush-to-zero mode is enabled through the FCCR register, and use of this mode is recommended for best performance when denormalized results are generated.
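The numerical effect of flushing denormalized results to zero can be illustrated with a simplified host-side model. The sketch assumes round-to-nearest and that a flushed result becomes a zero of the same sign; it does not reproduce the device's register interface or every hardware corner case. The names `f32`, `is_denormal_f32`, `flush_to_zero`, and `mul_s` are hypothetical.

```python
import math
import struct

MIN_NORMAL_F32 = 2.0 ** -126  # smallest positive normal single-precision value

def f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def is_denormal_f32(x):
    """True if x is nonzero and smaller in magnitude than the smallest normal single."""
    return x != 0.0 and abs(x) < MIN_NORMAL_F32

def flush_to_zero(x):
    """Replace a denormalized single-precision value with a zero of the same sign."""
    return math.copysign(0.0, x) if is_denormal_f32(x) else x

def mul_s(a, b, flush=False):
    """Single-precision multiply model with optional flush-to-zero of the result."""
    result = f32(a * b)
    return flush_to_zero(result) if flush else result

tiny = mul_s(2.0**-100, 2.0**-30)                  # 2**-130, a denormalized single
flushed = mul_s(2.0**-100, 2.0**-30, flush=True)   # flushed to 0.0
print(is_denormal_f32(tiny), flushed)              # True 0.0
```

Results that are normal numbers pass through unchanged, so the mode only alters values below the normal range.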

The FPU has a separate pipeline for floating point instruction execution. This pipeline operates in parallel with the integer core pipeline and does not stall when the integer pipeline stalls. This allows long-running FPU operations, such as divide or square root, to be partially masked by system stalls and/or other integer unit instructions. Arithmetic instructions are always dispatched and completed in order, but loads and stores can complete out of order. The exception model is “precise” at all times.
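How much of a long-running FPU operation is masked by independent integer work can be estimated with a deliberately idealized model. The sketch assumes that the FPU and the integer core run at the same clock, that one independent integer instruction issues per cycle, and that a dependent instruction can issue `fpu_latency` cycles after the FPU operation. The function name `visible_stall_cycles` is hypothetical, and real code will also be affected by memory stalls and other hazards.

```python
def visible_stall_cycles(fpu_latency, independent_instructions):
    """Cycles the consumer of an FPU result still waits after independent work.

    The FPU operation issues at cycle 0 and its consumer may issue at cycle
    fpu_latency. Independent instructions fill cycles 1..n, so the consumer
    would otherwise be ready at cycle n + 1.
    """
    return max(0, fpu_latency - 1 - independent_instructions)

# Using the DIV.S latency of 17 FPU cycles listed in Table 3-4:
print(visible_stall_cycles(17, 0))    # 16: nothing masks the divide
print(visible_stall_cycles(17, 10))   # 6: ten integer instructions hide part of it
print(visible_stall_cycles(17, 16))   # 0: the divide is fully masked
```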

Table 3-4 contains the floating point instruction latencies and repeat rates for the processor core. In this table, “Latency” refers to the number of FPU cycles necessary for the first instruction to produce the result needed by the second instruction. “Repeat Rate” refers to the maximum rate at which an instruction can be issued, expressed as the number of FPU cycles between successive instructions of that type.

TABLE 3-4: FPU INSTRUCTION LATENCIES AND REPEAT RATES

Op code                                         Latency       Repeat Rate
                                                (FPU Cycles)  (FPU Cycles)
--------------------------------------------------------------------------
ABS.[S,D], NEG.[S,D], ADD.[S,D], SUB.[S,D],     4             1
  C.cond.[S,D], MUL.S
MADD.S, MSUB.S, NMADD.S, NMSUB.S,               4             1
  CABS.cond.[S,D]
CVT.D.S, CVT.PS.PW, CVT.[S,D].[W,L]             4             1
CVT.S.D, CVT.[W,L].[S,D], CEIL.[W,L].[S,D],     4             1
  FLOOR.[W,L].[S,D], ROUND.[W,L].[S,D],
  TRUNC.[W,L].[S,D]
MOV.[S,D], MOVF.[S,D], MOVN.[S,D],              4             1
  MOVT.[S,D], MOVZ.[S,D]
MUL.D                                           5             2
MADD.D, MSUB.D, NMADD.D, NMSUB.D                5             2
RECIP.S                                         13            10
RECIP.D                                         26            21
RSQRT.S                                         17            14
RSQRT.D                                         36            31
DIV.S, SQRT.S                                   17            14
DIV.D, SQRT.D                                   32            29
MTC1, DMTC1, LWC1, LDC1, LDXC1, LUXC1,          4             1
  LWXC1
MFC1, DMFC1, SWC1, SDC1, SDXC1, SUXC1,          1             1
  SWXC1

Legend: S = Single, D = Double, W = Word, L = Long word
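The latency and repeat rate figures can be combined into a rough cycle estimate. The sketch below copies a few rows from Table 3-4 and compares a chain of dependent instructions with a stream of independent ones. It is an idealized model that ignores instruction mixing, memory stalls, and integer pipeline effects; the names `FPU_TIMING`, `dependent_chain_cycles`, and `independent_stream_cycles` are hypothetical.

```python
# (latency, repeat_rate) in FPU cycles, copied from selected rows of Table 3-4
FPU_TIMING = {
    "ADD.S": (4, 1),
    "MUL.S": (4, 1),
    "MADD.S": (4, 1),
    "MUL.D": (5, 2),
    "RECIP.S": (13, 10),
    "DIV.S": (17, 14),
    "SQRT.D": (32, 29),
}

def dependent_chain_cycles(opcode, count):
    """Each instruction consumes the previous result, so latency dominates."""
    latency, _ = FPU_TIMING[opcode]
    return count * latency

def independent_stream_cycles(opcode, count):
    """No data dependencies: a new instruction starts every repeat-rate cycles."""
    if count == 0:
        return 0
    latency, repeat = FPU_TIMING[opcode]
    return (count - 1) * repeat + latency

print(dependent_chain_cycles("ADD.S", 8))      # 32
print(independent_stream_cycles("ADD.S", 8))   # 11
print(independent_stream_cycles("DIV.S", 4))   # 59
```

Under these assumptions, eight dependent ADD.S instructions take about three times as long as eight independent ones, which is why interleaving independent operations pays off for the instructions with a repeat rate of 1.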