User Guide

170 128-Bit Media and Scientific Programming
AMD64 Technology 24592—Rev. 3.15—November 2009
Division
DIVPS—Divide Packed Single-Precision Floating-Point
DIVPD—Divide Packed Double-Precision Floating-Point
DIVSS—Divide Scalar Single-Precision Floating-Point
DIVSD—Divide Scalar Double-Precision Floating-Point
The DIVPS instruction divides each of the four single-precision floating-point values in the first
operand by the corresponding single-precision floating-point value in the second operand and writes
each result in the corresponding doubleword of the destination. The DIVPD instruction performs an
analogous operation for two double-precision floating-point values, writing each result in the
corresponding quadword of the destination. For vectors of n elements, the operations are:
operand1[i] = operand1[i] ÷ operand2[i]
where: i = 0 to n - 1
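The element-wise operation above can be sketched in C with the SSE and SSE2 compiler intrinsics. This is a minimal illustration, not part of the instruction definition: the function names div4_f32 and div2_f64 are hypothetical, and although compilers typically translate _mm_div_ps and _mm_div_pd into DIVPS and DIVPD, the instruction actually emitted depends on the compiler and its options.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */

/* Hypothetical helper: out[i] = a[i] / b[i] for i = 0 to 3.
   _mm_div_ps is the intrinsic that usually maps to DIVPS. */
void div4_f32(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);
    __m128 va = _mm_loadu_ps(a);   /* unaligned load of four floats */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_div_ps(va, vb));
}

/* Hypothetical helper: out[i] = a[i] / b[i] for i = 0 to 1.
   _mm_div_pd is the intrinsic that usually maps to DIVPD. */
void div2_f64(const double a[2], const double b[2], double out[2])
{
    assert(a && b && out);
    __m128d va = _mm_loadu_pd(a);  /* unaligned load of two doubles */
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_div_pd(va, vb));
}
```

Unaligned loads and stores are used so that the caller's arrays need no particular alignment; code that can guarantee 16-byte alignment could use the aligned forms instead.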
The DIVSS instruction divides the single-precision floating-point value in the low-order doubleword
of the first operand by the single-precision floating-point value in the low-order doubleword of the
second operand and writes the result in the low-order doubleword of the destination. The three high-
order doublewords of the destination are not modified.
The DIVSD instruction divides the double-precision floating-point value in the low-order quadword of
the first operand by the double-precision floating-point value in the low-order quadword of the second
operand and writes the result in the low-order quadword of the destination. The high-order quadword
of the destination is not modified.
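The scalar forms differ from the packed forms only in which elements are written. The following sketch, assuming the SSE and SSE2 intrinsics _mm_div_ss and _mm_div_sd (which compilers typically translate into DIVSS and DIVSD), shows that only the low-order element receives a quotient while the remaining elements keep the first operand's values. The function names divss_demo and divsd_demo are hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */

/* Hypothetical helper modeling DIVSS:
   out[0] = a[0] / b[0]; out[1..3] = a[1..3] (unchanged). */
void divss_demo(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_div_ss(va, vb));
}

/* Hypothetical helper modeling DIVSD:
   out[0] = a[0] / b[0]; out[1] = a[1] (unchanged). */
void divsd_demo(const double a[2], const double b[2], double out[2])
{
    assert(a && b && out);
    __m128d va = _mm_loadu_pd(a);
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_div_sd(va, vb));
}
```

Here the first argument plays the role of both the first operand and the destination, matching the two-operand form described above.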
If accuracy requirements allow, convert floating-point division by a constant to multiplication by the
reciprocal of that constant. Divisors that are powers of two, and their reciprocals, are exactly
representable and therefore do not cause an accuracy issue, except in the rare cases in which the
reciprocal overflows or underflows.
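As a rough illustration of this advice, the sketch below computes the same quotients two ways: by dividing by a broadcast constant, and by multiplying by a reciprocal that is computed once. The function names div_by_const and mul_by_recip are hypothetical, and the mapping of _mm_div_ps and _mm_mul_ps to DIVPS and MULPS is typical but compiler-dependent. For a power-of-two divisor such as 8.0 the two results are identical; for other divisors, such as 3.0, the reciprocal is rounded and the results may differ in the least-significant bits.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */

/* Hypothetical helper: out[i] = x[i] / d, using a packed divide. */
void div_by_const(const float x[4], float d, float out[4])
{
    assert(x && out);
    __m128 vd = _mm_set1_ps(d);            /* broadcast d to all four elements */
    _mm_storeu_ps(out, _mm_div_ps(_mm_loadu_ps(x), vd));
}

/* Hypothetical helper: out[i] = x[i] * (1 / d).
   The reciprocal is computed once, then a packed multiply replaces
   the packed divide. */
void mul_by_recip(const float x[4], float d, float out[4])
{
    assert(x && out);
    const float r = 1.0f / d;              /* rounded unless d is a power of two */
    __m128 vr = _mm_set1_ps(r);
    _mm_storeu_ps(out, _mm_mul_ps(_mm_loadu_ps(x), vr));
}
```

When the divisor is a compile-time constant, the reciprocal can be written as a constant as well, so that no division remains at run time.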
Square Root
SQRTPS—Square Root Packed Single-Precision Floating-Point
SQRTPD—Square Root Packed Double-Precision Floating-Point
SQRTSS—Square Root Scalar Single-Precision Floating-Point
SQRTSD—Square Root Scalar Double-Precision Floating-Point
The SQRTPS instruction computes the square root of each of four single-precision floating-point
values in the second operand (an XMM register or 128-bit memory location) and writes the result in
the corresponding doubleword of the destination. The SQRTPD instruction performs an analogous
operation for two double-precision floating-point values.
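The packed square-root operations can be sketched in the same way. This example assumes the SSE and SSE2 intrinsics _mm_sqrt_ps and _mm_sqrt_pd, which compilers typically translate into SQRTPS and SQRTPD; the function names sqrt4_f32 and sqrt2_f64 are hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */

/* Hypothetical helper: out[i] = sqrt(in[i]) for i = 0 to 3. */
void sqrt4_f32(const float in[4], float out[4])
{
    assert(in && out);
    _mm_storeu_ps(out, _mm_sqrt_ps(_mm_loadu_ps(in)));
}

/* Hypothetical helper: out[i] = sqrt(in[i]) for i = 0 to 1. */
void sqrt2_f64(const double in[2], double out[2])
{
    assert(in && out);
    _mm_storeu_pd(out, _mm_sqrt_pd(_mm_loadu_pd(in)));
}
```

Unlike the division forms, these operations read only the source operand; the destination is written but does not contribute an input value.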
The SQRTSS instruction computes the square root of the low-order single-precision floating-point
value in the second operand (an XMM register or 32-bit memory location) and writes the result in the