User Guide

64-Bit Media Programming 227
24592—Rev. 3.15—November 2009 AMD64 Technology
vectors (one element is the real part, the other element is the imaginary part), there is a need to swap
the elements of one source operand to perform the multiplication, and there is a need for mixed
positive-negative accumulation to complete the parallel computation of real and imaginary results. The
PSWAPD instruction can swap elements of one source operand and the PFPNACC instruction can
perform the mixed positive-negative accumulation to complete the computation.
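The data flow described above can be sketched with a small scalar model. This is an illustration, not a definitive implementation: each 64-bit operand is represented as a Python tuple (low, high), the real part is assumed to sit in the low element, and the helper names are hypothetical stand-ins for the instructions. PFPNACC is modeled as it is understood here: the low result is the difference of the first operand's elements and the high result is the sum of the second operand's elements.

```python
def pswapd(src):
    """Model of PSWAPD: swap the two elements of the source."""
    low, high = src
    return (high, low)

def pfmul(a, b):
    """Model of PFMUL: element-wise multiply of two (low, high) pairs."""
    return (a[0] * b[0], a[1] * b[1])

def pfpnacc(first, second):
    """Model of PFPNACC (assumed semantics):
    low  = first.low  - first.high   (negative accumulation)
    high = second.low + second.high  (positive accumulation)
    """
    return (first[0] - first[1], second[0] + second[1])

def complex_mul(x, y):
    """Multiply x = a + bi by y = c + di, each stored as (real, imag)."""
    straight = pfmul(x, y)             # (a*c, b*d)
    crossed = pfmul(x, pswapd(y))      # (a*d, b*c)
    return pfpnacc(straight, crossed)  # (a*c - b*d, a*d + b*c)

print(complex_mul((1.0, 2.0), (3.0, 4.0)))  # (-5.0, 10.0)
```

The swap is what lines up the cross terms a*d and b*c in one product, so a single mixed accumulate can produce the real and imaginary results together.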
Reciprocal Estimation
PFRCP—Packed Floating-Point Reciprocal Approximation
PFRCPIT1—Packed Floating-Point Reciprocal, Iteration 1
PFRCPIT2—Packed Floating-Point Reciprocal or Reciprocal Square Root, Iteration 2
The PFRCP instruction computes the approximate reciprocal of the single-precision floating-point
value in the low-order 32 bits of the second operand and writes the result into both doublewords of the
first operand.
The PFRCPIT1 instruction performs the first intermediate step in the Newton-Raphson iteration to
refine the reciprocal approximation produced by the PFRCP instruction. The first operand contains the
input to a previous PFRCP instruction, and the second operand contains the result of the same PFRCP
instruction.
The PFRCPIT2 instruction performs the second and final step in the Newton-Raphson iteration to
refine the reciprocal approximation produced by the PFRCP instruction or the reciprocal square-root
approximation produced by the PFRSQRT instruction. The first operand contains the result of a
previous PFRCPIT1 or PFRSQIT1 instruction, and the second operand contains the result of a PFRCP
or PFRSQRT instruction.
The PFRCP instruction can be used together with the PFRCPIT1 and PFRCPIT2 instructions to
refine the approximate reciprocal toward the full accuracy of a single-precision significand.
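The arithmetic behind this three-instruction sequence can be sketched as one Newton-Raphson step on a rough starting estimate. The sketch below is a scalar model under stated assumptions: `rough_reciprocal` is a hypothetical stand-in for PFRCP, its 14-bit accuracy is an illustrative value, and the way the step is split into a correction term and a final multiply is only assumed to mirror the roles of PFRCPIT1 and PFRCPIT2. The hardware's exact intermediate values are not specified here.

```python
import math

def rough_reciprocal(b, bits=14):
    """Hypothetical stand-in for PFRCP: 1/b kept to about `bits` significand bits."""
    m, e = math.frexp(1.0 / b)   # 1/b == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)

def refine_reciprocal(b, x0):
    """One Newton-Raphson step for 1/b: x1 = x0 * (2 - b * x0)."""
    correction = 2.0 - b * x0    # built from the input b and the estimate x0
    return x0 * correction       # combined with the original estimate x0

b = 3.0
x0 = rough_reciprocal(b)         # coarse estimate
x1 = refine_reciprocal(b, x0)    # refined estimate
print(abs(x0 - 1.0 / b), abs(x1 - 1.0 / b))
```

Each Newton-Raphson step roughly squares the relative error, which is why a single refinement is enough to take a coarse estimate to single-precision accuracy.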
Reciprocal Square Root
PFRSQRT—Packed Floating-Point Reciprocal Square Root Approximation
PFRSQIT1—Packed Floating-Point Reciprocal Square Root, Iteration 1
The PFRSQRT instruction computes the approximate reciprocal square root of the single-precision
floating-point value in the low-order 32 bits of the second operand and writes the result into each
doubleword of the first operand. The second operand is a single-precision floating-point value with a
24-bit significand. The result written to the first operand is accurate to 15 bits. Negative operands are
treated as positive operands for purposes of reciprocal square-root computation, with the sign of the
result the same as the sign of the source operand.
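The operand handling described above can be sketched with a scalar model. This is an illustration under stated assumptions: the operand is a Python tuple (low, high), `pfrsqrt_model` is a hypothetical name, and rounding the significand to 15 bits only imitates the stated accuracy rather than reproducing the hardware's actual estimate. Zero, infinity, and NaN inputs are not modeled.

```python
import math

def pfrsqrt_model(src, bits=15):
    """Model of PFRSQRT on a (low, high) pair of floats.

    Only the low element of `src` is read. The estimate is written to both
    elements of the result, and its sign follows the sign of the source.
    """
    low = src[0]
    exact = 1.0 / math.sqrt(abs(low))    # magnitude is computed from |low|
    m, e = math.frexp(exact)             # exact == m * 2**e with 0.5 <= m < 1
    scale = 1 << bits
    estimate = math.ldexp(round(m * scale) / scale, e)  # keep about `bits` bits
    estimate = math.copysign(estimate, low)             # carry the source sign
    return (estimate, estimate)

print(pfrsqrt_model((4.0, 100.0)))   # (0.5, 0.5); the high element is ignored
print(pfrsqrt_model((-4.0, 100.0)))  # (-0.5, -0.5)
```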
The PFRSQIT1 instruction performs the first step in the Newton-Raphson iteration to refine the
reciprocal square-root approximation produced by the PFRSQRT instruction. The first operand contains
the input to a previous PFRSQRT instruction, and the second operand contains the square of the result
of the same PFRSQRT instruction.
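The refinement can be sketched as one Newton-Raphson step for the reciprocal square root. The sketch is a scalar model under stated assumptions: `rough_rsqrt` is a hypothetical stand-in for PFRSQRT, and the split into a squaring step, a correction term built from the input and the squared estimate, and a final multiply by the estimate is assumed to mirror the operand roles described for PFRSQIT1 and PFRCPIT2. The hardware's exact intermediate values are not specified here.

```python
import math

def rough_rsqrt(b, bits=15):
    """Hypothetical stand-in for PFRSQRT: 1/sqrt(b) kept to about `bits` bits."""
    m, e = math.frexp(1.0 / math.sqrt(b))
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)

def refine_rsqrt(b, x0):
    """One Newton-Raphson step: x1 = x0 * (3 - b * x0**2) / 2."""
    x0_squared = x0 * x0                       # separate multiply of the estimate
    correction = (3.0 - b * x0_squared) / 2.0  # uses the input b and x0 squared
    return x0 * correction                     # combined with the original estimate

b = 2.0
x0 = rough_rsqrt(b)        # coarse estimate
x1 = refine_rsqrt(b, x0)   # refined estimate
exact = 1.0 / math.sqrt(b)
print(abs(x0 - exact), abs(x1 - exact))
```

Multiplying the refined value by b gives an approximation of the square root of b, since b * (1/sqrt(b)) = sqrt(b).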