User Guide
128-Bit Media and Scientific Programming 149
24592—Rev. 3.15—November 2009 AMD64 Technology
Figure 4-27. PMULUDQ Multiply Operation
See “Shift” on page 152 for shift instructions that can be used to perform multiplication and division
by powers of 2.
Multiply-Add. This instruction multiplies the elements of two source vectors and adds their
intermediate results in a single operation.
• PMADDWD—Packed Multiply Words and Add Doublewords
The PMADDWD instruction multiplies each 16-bit signed value in the first operand by the
corresponding 16-bit signed value in the second operand. The instruction then adds the adjacent 32-bit
intermediate results of each multiplication, and writes the 32-bit result of each addition into the
corresponding doubleword of the destination. For vectors of n source elements (src) and m
destination elements (dst), where n = 2m, the operation is:
dst[j] = ((src1[i] * src2[i]) + (src1[i+1] * src2[i+1]))
where: j = 0 to m – 1
i = 2j
PMADDWD thus performs four signed multiply-adds in parallel. Figure 4-28 on page 150 shows the
operation.
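The formula above can be modeled in scalar C as follows. This is a minimal sketch for illustration, not reference code from this manual: the function name pmaddwd_model and the macros N_SRC and M_DST are hypothetical, and the sketch assumes 128-bit operands (n = 8 words, m = 4 doublewords).

```c
#include <stdint.h>

#define N_SRC 8             /* n: 16-bit signed source elements per 128-bit operand */
#define M_DST (N_SRC / 2)   /* m: 32-bit destination elements, so n = 2m */

/* Hypothetical scalar model of the PMADDWD formula:
 * dst[j] = (src1[i] * src2[i]) + (src1[i+1] * src2[i+1]), with i = 2j. */
static void pmaddwd_model(const int16_t src1[N_SRC],
                          const int16_t src2[N_SRC],
                          int32_t dst[M_DST])
{
    for (int j = 0; j < M_DST; j++) {
        int i = 2 * j;

        /* Each 16-bit x 16-bit signed product fits in 32 bits. */
        int32_t even = (int32_t)src1[i]     * (int32_t)src2[i];
        int32_t odd  = (int32_t)src1[i + 1] * (int32_t)src2[i + 1];

        /* Add as unsigned so the one overflowing case (all four inputs
         * equal to -32768) wraps instead of invoking undefined behavior.
         * The conversion back to int32_t is assumed to wrap, as it does
         * on common two's-complement compilers. */
        dst[j] = (int32_t)((uint32_t)even + (uint32_t)odd);
    }
}
```

For example, with src1 = {1, 2, 3, 4, ...} and src2 = {10, 20, 30, 40, ...}, the first two doublewords of the result are (1*10 + 2*20) = 50 and (3*30 + 4*40) = 250. The loop runs over j rather than i so that every source index i and i+1 stays within the n source elements.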
[Figure (513-153.eps): operand 1 and operand 2, each bits 127–0, are multiplied to produce the result, bits 127–0.]