Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2B, Instruction Set Reference, N-Z
Operation
PMADDWD instruction with 64-bit operands:
DEST[31:0] ← (DEST[15:0] ∗ SRC[15:0]) + (DEST[31:16] ∗ SRC[31:16]);
DEST[63:32] ← (DEST[47:32] ∗ SRC[47:32]) + (DEST[63:48] ∗ SRC[63:48]);
PMADDWD instruction with 128-bit operands:
DEST[31:0] ← (DEST[15:0] ∗ SRC[15:0]) + (DEST[31:16] ∗ SRC[31:16]);
DEST[63:32] ← (DEST[47:32] ∗ SRC[47:32]) + (DEST[63:48] ∗ SRC[63:48]);
DEST[95:64] ← (DEST[79:64] ∗ SRC[79:64]) + (DEST[95:80] ∗ SRC[95:80]);
DEST[127:96] ← (DEST[111:96] ∗ SRC[111:96]) + (DEST[127:112] ∗ SRC[127:112]);
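The operation above can be sketched as a scalar reference model in C. This is an illustrative sketch only, not text from the manual: the function name pmaddwd_model and its parameters are hypothetical, words are treated as signed 16-bit values, and the final conversion back to int32_t assumes a two's-complement compiler such as GCC or Clang. The sum is formed in unsigned arithmetic so that the single overflow case (all four words of a pair equal to 8000H) wraps to 80000000H instead of invoking undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the manual): scalar model of PMADDWD.
 * dest_in and src each hold n_words signed 16-bit words; dest_out receives
 * n_words / 2 signed 32-bit results. Use n_words = 4 for the 64-bit form
 * and n_words = 8 for the 128-bit form. */
static void pmaddwd_model(const int16_t *dest_in, const int16_t *src,
                          int32_t *dest_out, size_t n_words)
{
    assert(n_words % 2 == 0);  /* words are consumed in adjacent pairs */

    for (size_t i = 0; i < n_words / 2; i++) {
        /* Each 16 x 16 signed product fits in 32 bits. */
        int32_t lo = (int32_t)dest_in[2 * i]     * (int32_t)src[2 * i];
        int32_t hi = (int32_t)dest_in[2 * i + 1] * (int32_t)src[2 * i + 1];

        /* Add as unsigned so the sum wraps modulo 2^32; the conversion
         * back to int32_t assumes two's-complement behavior. */
        dest_out[i] = (int32_t)((uint32_t)lo + (uint32_t)hi);
    }
}
```

For example, with words {1, 2, -3, 4} and {5, 6, 7, -8}, the model yields the doublewords 17 and -53.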
Intel C/C++ Compiler Intrinsic Equivalent
PMADDWD __m64 _mm_madd_pi16(__m64 m1, __m64 m2)
PMADDWD __m128i _mm_madd_epi16(__m128i a, __m128i b)
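As a hedged illustration of the 128-bit intrinsic listed above, the following C sketch accumulates a dot product of signed 16-bit words. It assumes an x86 target with SSE2 and a compiler that provides <emmintrin.h>; the helper name dot_i16 is hypothetical. The unaligned load and store intrinsics are used so that the arrays need not sit on a 16-byte boundary. The 32-bit lane sums can overflow for large inputs, which this sketch does not guard against.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics, including _mm_madd_epi16 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the manual): dot product of two arrays
 * of n signed 16-bit words, where n must be a multiple of 8. */
static int32_t dot_i16(const int16_t *a, const int16_t *b, size_t n)
{
    assert(n % 8 == 0);  /* one 128-bit operand holds eight words */

    __m128i acc = _mm_setzero_si128();
    for (size_t i = 0; i < n; i += 8) {
        /* Unaligned loads: no 16-byte alignment requirement on a or b. */
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));

        /* PMADDWD: four 32-bit sums of adjacent word products. */
        acc = _mm_add_epi32(acc, _mm_madd_epi16(va, vb));
    }

    /* Horizontal sum of the four 32-bit lanes. */
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Calling dot_i16 on the words 1 through 8 against eight ones returns 36.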
Flags Affected
None.
Numeric Exceptions
None.
Protected Mode Exceptions
#GP(0)    If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
          (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)    If a memory operand effective address is outside the SS segment limit.
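Figure 4-2. PMADDWD Execution Model Using 64-bit Operands
[Figure: SRC holds words X3 X2 X1 X0 and DEST holds words Y3 Y2 Y1 Y0. TEMP holds the products X3 ∗ Y3, X2 ∗ Y2, X1 ∗ Y1, X0 ∗ Y0. The resulting DEST holds the doublewords (X3∗Y3) + (X2∗Y2) and (X1∗Y1) + (X0∗Y0).]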