Intel 64 and IA-32 Architectures Software Developers Manual Volume 2B, Instruction Set Reference, N-Z
4-76 Vol. 2B
INSTRUCTION SET REFERENCE, N-Z
PHADDSW — Packed Horizontal Add and Saturate
Description
PHADDSW adds two adjacent signed 16-bit integers horizontally from the source and
destination operands and saturates the signed results; packs the signed, saturated
16-bit results to the destination operand (first operand) Both operands can be MMX
or XMM registers. When the source operand is a 128-bit memory operand, the
operand must be aligned on a 16-byte boundary or a general-protection exception
(#GP) will be generated.
In 64-bit mode, use the REX prefix to access additional registers.
Operation
PHADDSW with 64-bit operands:
mm1[15-0] = SaturateToSignedWord((mm1[31-16] + mm1[15-0]);
mm1[31-16] = SaturateToSignedWord(mm1[63-48] + mm1[47-32]);
mm1[47-32] = SaturateToSignedWord(mm2/m64[31-16] + mm2/m64[15-0]);
mm1[63-48] = SaturateToSignedWord(mm2/m64[63-48] + mm2/m64[47-32]);
PHADDSW with 128-bit operands :
xmm1[15-0]= SaturateToSignedWord(xmm1[31-16] + xmm1[15-0]);
xmm1[31-16] = SaturateToSignedWord(xmm1[63-48] + xmm1[47-32]);
xmm1[47-32] = SaturateToSignedWord(xmm1[95-80] + xmm1[79-64]);
xmm1[63-48] = SaturateToSignedWord(xmm1[127-112] + xmm1[111-96]);
xmm1[79-64] = SaturateToSignedWord(xmm2/m128[31-16] + xmm2/m128[15-0]);
xmm1[95-80]
= SaturateToSignedWord(xmm2/m128[63-48] + xmm2/m128[47-32]);
xmm1[111-96] = SaturateToSignedWord(xmm2/m128[95-80] + xmm2/m128[79-64]);
xmm1[127-112] = SaturateToSignedWord(xmm2/m128[127-112] + xmm2/m128[111-96]);
Intel C/C++ Compiler Intrinsic Equivalent
PHADDSW __m64 _mm_hadds_pi16 (__m64 a, __m64 b)
PHADDSW __m128i _mm_hadds_epi16 (__m128i a, __m128i b)
Opcode Instruction
64-Bit
Mode
Compat/
Leg Mode Description
0F 38 03 /r PHADDSW mm1,
mm2/m64
Valid Valid Add 16-bit signed integers
horizontally, pack saturated
integers to MM1.
66 0F 38 03
/r
PHADDSW
xmm1,
xmm2/m128
Valid Valid Add 16-bit signed integers
horizontally, pack saturated
integers to XMM1.