User Guide

170 128-Bit Media and Scientific Programming

AMD64 Technology 24592—Rev. 3.15—November 2009

Division

• DIVPS—Divide Packed Single-Precision Floating-Point

• DIVPD—Divide Packed Double-Precision Floating-Point

• DIVSS—Divide Scalar Single-Precision Floating-Point

• DIVSD—Divide Scalar Double-Precision Floating-Point

The DIVPS instruction divides each of the four single-precision floating-point values in the first

operand by the corresponding single-precision floating-point value in the second operand and writes

the result in the corresponding doubleword of the destination. The DIVPD instruction performs an

analogous operation for two double-precision floating-point values. For vectors of n elements,

the operations are:

operand1[i] = operand1[i] ÷ operand2[i]

where: i = 0 to n – 1
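
The element-wise formula above can be sketched in C using compiler intrinsics. This is a minimal illustration, assuming an x86-64 compiler that provides the <xmmintrin.h> and <emmintrin.h> headers. The function names div4_f32 and div2_f64 are hypothetical, and a compiler will typically, though not necessarily, translate the _mm_div_ps and _mm_div_pd intrinsics into DIVPS and DIVPD.

```c
#include <xmmintrin.h>  /* SSE intrinsics:  __m128,  _mm_div_ps */
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_div_pd */

/* Hypothetical helper: out[i] = a[i] / b[i] for i = 0..3 (single precision). */
void div4_f32(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);              /* unaligned load of a[0..3] */
    __m128 vb = _mm_loadu_ps(b);              /* unaligned load of b[0..3] */
    _mm_storeu_ps(out, _mm_div_ps(va, vb));   /* four divides in one operation */
}

/* Hypothetical helper: out[i] = a[i] / b[i] for i = 0..1 (double precision). */
void div2_f64(const double *a, const double *b, double *out)
{
    __m128d va = _mm_loadu_pd(a);             /* unaligned load of a[0..1] */
    __m128d vb = _mm_loadu_pd(b);             /* unaligned load of b[0..1] */
    _mm_storeu_pd(out, _mm_div_pd(va, vb));   /* two divides in one operation */
}
```

The unaligned load and store intrinsics are used here so that the sketch works with any array address; code that guarantees 16-byte alignment could use the aligned forms instead.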

The DIVSS instruction divides the single-precision floating-point value in the low-order doubleword

of the first operand by the single-precision floating-point value in the low-order doubleword of the

second operand and writes the result in the low-order doubleword of the destination. The three high-

order doublewords of the destination are not modified.

The DIVSD instruction divides the double-precision floating-point value in the low-order quadword of

the first operand by the double-precision floating-point value in the low-order quadword of the second

operand and writes the result in the low-order quadword of the destination. The high-order quadword

of the destination is not modified.
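
The scalar forms can be illustrated the same way. The sketch below assumes the same x86-64 intrinsics headers, and the names div_low_f32 and div_low_f64 are hypothetical. It shows that only the low-order element of the destination changes, while the remaining elements pass through unmodified, as described for DIVSS and DIVSD.

```c
#include <xmmintrin.h>  /* SSE intrinsics:  _mm_div_ss */
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_div_sd */

/* Hypothetical helper: dst[0] = dst[0] / src[0]; dst[1..3] are left unchanged. */
void div_low_f32(float dst[4], const float src[4])
{
    __m128 vd = _mm_loadu_ps(dst);
    __m128 vs = _mm_loadu_ps(src);
    /* _mm_div_ss divides only the low element; the upper three come from vd. */
    _mm_storeu_ps(dst, _mm_div_ss(vd, vs));
}

/* Hypothetical helper: dst[0] = dst[0] / src[0]; dst[1] is left unchanged. */
void div_low_f64(double dst[2], const double src[2])
{
    __m128d vd = _mm_loadu_pd(dst);
    __m128d vs = _mm_loadu_pd(src);
    /* _mm_div_sd divides only the low element; the upper one comes from vd. */
    _mm_storeu_pd(dst, _mm_div_sd(vd, vs));
}
```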

If accuracy requirements allow, convert floating-point division by a constant to a multiply by the

reciprocal. Divisors that are powers of two and their reciprocals are exactly representable, and

therefore do not cause an accuracy issue, except for the rare cases in which the reciprocal overflows or

underflows.
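
As a rough illustration of this advice, the sketch below replaces a packed division by the constant 8.0 with a multiplication by 0.125. It assumes the x86-64 <xmmintrin.h> header, and both function names are hypothetical. Because 8.0 and 0.125 are both exactly representable, the two versions should produce identical results; for a divisor such as 3.0 the reciprocal is rounded, so the multiply form may differ from the true quotient in the last bit.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_set1_ps, _mm_mul_ps, _mm_div_ps */

/* Hypothetical helper: out[i] = in[i] / 8.0f, computed with a packed divide. */
void divide_by_eight(const float *in, float *out)
{
    const __m128 divisor = _mm_set1_ps(8.0f);   /* broadcast 8.0 to all four lanes */
    _mm_storeu_ps(out, _mm_div_ps(_mm_loadu_ps(in), divisor));
}

/* Hypothetical helper: same result, computed with a packed multiply by 1/8. */
void scale_by_eighth(const float *in, float *out)
{
    const __m128 recip = _mm_set1_ps(0.125f);   /* 1/8 is a power of two: exact */
    _mm_storeu_ps(out, _mm_mul_ps(_mm_loadu_ps(in), recip));
}
```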

Square Root

• SQRTPS—Square Root Packed Single-Precision Floating-Point

• SQRTPD—Square Root Packed Double-Precision Floating-Point

• SQRTSS—Square Root Scalar Single-Precision Floating-Point

• SQRTSD—Square Root Scalar Double-Precision Floating-Point

The SQRTPS instruction computes the square root of each of four single-precision floating-point

values in the second operand (an XMM register or 128-bit memory location) and writes the result in

the corresponding doubleword of the destination. The SQRTPD instruction performs an analogous

operation for two double-precision floating-point values.
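
The packed square-root operations can be sketched with intrinsics as well. This example assumes the same x86-64 headers as before; the names sqrt4_f32 and sqrt2_f64 are hypothetical, and the _mm_sqrt_ps and _mm_sqrt_pd intrinsics are typically, though not necessarily, compiled to SQRTPS and SQRTPD.

```c
#include <xmmintrin.h>  /* SSE intrinsics:  _mm_sqrt_ps */
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_sqrt_pd */

/* Hypothetical helper: out[i] = sqrt(in[i]) for i = 0..3 (single precision). */
void sqrt4_f32(const float *in, float *out)
{
    _mm_storeu_ps(out, _mm_sqrt_ps(_mm_loadu_ps(in)));
}

/* Hypothetical helper: out[i] = sqrt(in[i]) for i = 0..1 (double precision). */
void sqrt2_f64(const double *in, double *out)
{
    _mm_storeu_pd(out, _mm_sqrt_pd(_mm_loadu_pd(in)));
}
```

Unlike the division examples, these take a single source: each result element depends only on the corresponding element of the one input vector.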

The SQRTSS instruction computes the square root of the low-order single-precision floating-point

value in the second operand (an XMM register or 32-bit memory location) and writes the result in the