64-Bit Media Programming 227

24592—Rev. 3.15—November 2009 AMD64 Technology

vectors (one element is the real part, the other element is the imaginary part), there is a need to swap the elements of one source operand to perform the multiplication, and there is a need for mixed positive-negative accumulation to complete the parallel computation of real and imaginary results. The PSWAPD instruction can swap elements of one source operand and the PFPNACC instruction can perform the mixed positive-negative accumulation to complete the computation.
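
The swap-and-accumulate pattern described above can be sketched in software. The following Python model is only an illustration: each register is represented as a (low, high) pair, the helper names `pswapd`, `pfmul`, `pfpnacc`, and `complex_mul` are hypothetical stand-ins for the instructions, and Python floats are double precision rather than the single-precision elements the hardware operates on.

```python
# Software model of a complex multiply built from a swap, two packed
# multiplies, and one mixed positive-negative accumulation.
# Each "register" is a (low, high) pair: low = real part, high = imaginary part.

def pswapd(src):
    """Model of the swap: exchange the low and high elements."""
    low, high = src
    return (high, low)

def pfmul(a, b):
    """Model of a packed multiply: element-wise product."""
    return (a[0] * b[0], a[1] * b[1])

def pfpnacc(dest, src):
    """Model of the mixed accumulation:
    low  = dest.low - dest.high  (negative accumulate)
    high = src.low  + src.high   (positive accumulate)
    """
    return (dest[0] - dest[1], src[0] + src[1])

def complex_mul(x, y):
    """Multiply two complex numbers stored as (real, imag) pairs."""
    straight = pfmul(x, y)             # (xr*yr, xi*yi)
    crossed = pfmul(x, pswapd(y))      # (xr*yi, xi*yr)
    return pfpnacc(straight, crossed)  # (xr*yr - xi*yi, xr*yi + xi*yr)

print(complex_mul((1.0, 2.0), (3.0, 4.0)))  # (-5.0, 10.0)
```

The real result comes from the subtraction in the low element and the imaginary result from the addition in the high element, which is why a single mixed accumulation can finish both halves of the product.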

Reciprocal Estimation

• PFRCP—Packed Floating-Point Reciprocal Approximation

• PFRCPIT1—Packed Floating-Point Reciprocal, Iteration 1

• PFRCPIT2—Packed Floating-Point Reciprocal or Reciprocal Square Root, Iteration 2

The PFRCP instruction computes the approximate reciprocal of the single-precision floating-point value in the low-order 32 bits of the second operand and writes the result into both doublewords of the first operand.

The PFRCPIT1 instruction performs the first intermediate step in the Newton-Raphson iteration to refine the reciprocal approximation produced by the PFRCP instruction. The first operand contains the input to a previous PFRCP instruction, and the second operand contains the result of the same PFRCP instruction.

The PFRCPIT2 instruction performs the second and final step in the Newton-Raphson iteration to refine the reciprocal approximation produced by the PFRCP instruction or the reciprocal square-root approximation produced by the PFRSQRT instruction. The first operand contains the result of a previous PFRCPIT1 or PFRSQIT1 instruction, and the second operand contains the result of a PFRCP or PFRSQRT instruction.

The PFRCP instruction can be used together with the PFRCPIT1 and PFRCPIT2 instructions to increase the accuracy of a single-precision significand.
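
The three-instruction sequence follows the shape of one Newton-Raphson step for a reciprocal, x1 = x0 * (2 - b * x0). The Python sketch below models only that arithmetic. It makes several assumptions: the 14-bit precision of the starting estimate is chosen for illustration, the way the step is divided between `rcp_step1` and `rcp_step2` is a guess that merely mirrors the operand roles described above, and all helper names are hypothetical. It does not reproduce the instructions' actual intermediate formats.

```python
import math

def truncate_significand(value, bits):
    """Keep only `bits` significand bits of value, rounding toward zero."""
    if value == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    scale = 1 << bits
    return math.ldexp(math.trunc(mantissa * scale) / scale, exponent)

def approx_reciprocal(b, bits=14):
    """Stand-in for the initial estimate: 1/b at reduced precision (assumed 14 bits)."""
    return truncate_significand(1.0 / b, bits)

def rcp_step1(b, x0):
    """First step (assumed split): takes the original input and the estimate."""
    return 2.0 - b * x0

def rcp_step2(t, x0):
    """Second step (assumed split): takes the step-1 result and the estimate."""
    return x0 * t

b = 3.0
x0 = approx_reciprocal(b)             # coarse estimate of 1/b
x1 = rcp_step2(rcp_step1(b, x0), x0)  # refined estimate

# The relative error shrinks roughly quadratically after one iteration.
print(abs(x0 * b - 1.0), abs(x1 * b - 1.0))
```

To divide a by b with this model, multiply a by the refined reciprocal `x1`.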

Reciprocal Square Root

• PFRSQRT—Packed Floating-Point Reciprocal Square Root Approximation

• PFRSQIT1—Packed Floating-Point Reciprocal Square Root, Iteration 1

The PFRSQRT instruction computes the approximate reciprocal square root of the single-precision floating-point value in the low-order 32 bits of the second operand and writes the result into each doubleword of the first operand. The second operand is a single-precision floating-point value with a 24-bit significand. The result written to the first operand is accurate to 15 bits. Negative operands are treated as positive operands for purposes of reciprocal square-root computation, with the sign of the result the same as the sign of the source operand.
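
The sign rule and the write to both doublewords can be illustrated with a small Python model. This is a sketch under stated assumptions: the register is a (low, high) pair, the names `approx_rsqrt` and `pfrsqrt_model` are hypothetical, and the model computes the result at full double precision instead of the 15-bit accuracy stated above. Special inputs such as zero are not modelled.

```python
import math

def approx_rsqrt(value):
    """Reciprocal square root of |value|, carrying the sign of value."""
    magnitude = 1.0 / math.sqrt(abs(value))
    return math.copysign(magnitude, value)

def pfrsqrt_model(src):
    """Read only the low element of src and write the result to both elements."""
    result = approx_rsqrt(src[0])
    return (result, result)

print(pfrsqrt_model((4.0, 9.0)))   # (0.5, 0.5)
print(pfrsqrt_model((-4.0, 9.0)))  # (-0.5, -0.5)
```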

The PFRSQIT1 instruction performs the first step in the Newton-Raphson iteration to refine the reciprocal square-root approximation produced by the PFRSQRT instruction. The first operand contains the input to a previous PFRSQRT instruction, and the second operand contains the square of the result of the same PFRSQRT instruction.
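
One Newton-Raphson step for a reciprocal square root has the form x1 = x0 * (3 - b * x0^2) / 2, which is why the first step takes the square of the estimate as an operand. The Python sketch below models that arithmetic only. The split between `rsqrt_step1` and `final_step` is an assumption that mirrors the operand roles described in this section, the helper names are hypothetical, and the starting estimate is truncated to 15 significand bits to match the accuracy stated for PFRSQRT.

```python
import math

def truncate_significand(value, bits):
    """Keep only `bits` significand bits of value, rounding toward zero."""
    if value == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    scale = 1 << bits
    return math.ldexp(math.trunc(mantissa * scale) / scale, exponent)

def approx_rsqrt(b, bits=15):
    """Stand-in for the initial estimate: 1/sqrt(b) at reduced precision."""
    return truncate_significand(1.0 / math.sqrt(b), bits)

def rsqrt_step1(b, x0_squared):
    """First step (assumed split): takes the original input and the squared estimate."""
    return (3.0 - b * x0_squared) / 2.0

def final_step(t, x0):
    """Final step (assumed split): takes the step-1 result and the estimate."""
    return x0 * t

b = 2.0
x0 = approx_rsqrt(b)                         # coarse estimate of 1/sqrt(b)
x1 = final_step(rsqrt_step1(b, x0 * x0), x0) # refined estimate

# Multiplying the refined estimate by b gives an approximation of sqrt(b).
print(x1, b * x1)
```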