User Guide

128-Bit Media and Scientific Programming 167
24592—Rev. 3.15—November 2009 AMD64 Technology
The ADDPS instruction adds each of four single-precision floating-point values in the first operand to
the corresponding single-precision floating-point value in the second operand and writes each result in
the corresponding doubleword of the destination. The ADDPD instruction performs an analogous
operation for two double-precision floating-point values, writing each result in the corresponding
quadword of the destination.
Figure 4-35 on page 167 shows a typical arithmetic operation on vectors of single-precision
floating-point elements, in this case an ADDPS instruction. The instruction performs four arithmetic
operations in parallel, one for each pair of corresponding elements.
Figure 4-35. ADDPS Arithmetic Operation
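The element-wise behavior of ADDPS and ADDPD can be sketched with a small software model. This is an illustration only, not the processor's implementation: the function names addps, addpd, and to_f32 are hypothetical, a 128-bit operand is represented as a Python list with element 0 standing for the low-order bits, and rounding control, exception reporting, and out-of-range values are ignored.

```python
import struct

def to_f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def addps(op1, op2):
    """Model of ADDPS: four independent single-precision additions.

    Each operand is a list of four values; index 0 stands for bits 31-0.
    """
    assert len(op1) == len(op2) == 4
    return [to_f32(to_f32(a) + to_f32(b)) for a, b in zip(op1, op2)]

def addpd(op1, op2):
    """Model of ADDPD: two independent double-precision additions.

    Each operand is a list of two values; index 0 stands for bits 63-0.
    """
    assert len(op1) == len(op2) == 2
    return [a + b for a, b in zip(op1, op2)]

print(addps([1.0, 2.0, 3.0, 4.0], [0.5, 0.25, 0.125, 8.0]))  # [1.5, 2.25, 3.125, 12.0]
print(addpd([1.0, 2.0], [10.0, 20.0]))                        # [11.0, 22.0]
```

No element of the result depends on any other element, which is why the four (or two) additions can be performed in parallel.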
The ADDSS instruction adds the single-precision floating-point value in the low-order doubleword of
the first operand to the single-precision floating-point value in the low-order doubleword of the second
operand and writes the result in the low-order doubleword of the destination. The three high-order
doublewords of the destination are not modified.
The ADDSD instruction adds the double-precision floating-point value in the low-order quadword of
the first operand to the double-precision floating-point value in the low-order quadword of the second
operand and writes the result in the low-order quadword of the destination. The high-order quadword
of the destination is not modified.
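The scalar forms differ from the packed forms only in which elements are written. The sketch below models that difference under stated assumptions: the names addss, addsd, and to_f32 are hypothetical, the first operand is assumed to also be the destination, a 128-bit register is a Python list with element 0 standing for the low-order bits, and rounding control and exceptions are ignored.

```python
import struct

def to_f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def addss(dest, src):
    """Model of ADDSS: add the low-order doublewords only.

    Elements 1-3 of dest (the three high-order doublewords) pass through unchanged.
    """
    assert len(dest) == len(src) == 4
    low = to_f32(to_f32(dest[0]) + to_f32(src[0]))
    return [low] + list(dest[1:])

def addsd(dest, src):
    """Model of ADDSD: add the low-order quadwords only.

    Element 1 of dest (the high-order quadword) passes through unchanged.
    """
    assert len(dest) == len(src) == 2
    return [dest[0] + src[0], dest[1]]

print(addss([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))  # [11.0, 2.0, 3.0, 4.0]
print(addsd([1.0, 2.0], [10.0, 20.0]))                         # [11.0, 2.0]
```

In both calls the high-order elements of the source are ignored and the high-order elements of the destination keep their previous values.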
Horizontal Addition
HADDPS—Horizontal Add Packed Single-Precision Floating-Point
HADDPD—Horizontal Add Packed Double-Precision Floating-Point
The HADDPS instruction adds the single-precision floating-point values in the first and second
doublewords of the destination operand and stores the sum in the first doubleword of the destination
operand. It adds the single-precision floating-point values in the third and fourth doublewords of the
destination operand and stores the sum in the second doubleword of the destination operand. It adds
the single-precision floating-point values in the first and second doublewords of the source operand
and stores the sum in the third doubleword of the destination operand. It adds single-precision floating