User Guide
128-Bit Media and Scientific Programming 153
24592—Rev. 3.15—November 2009 AMD64 Technology
operand1[i] = operand1[i] ÷ 2^operand2
where: i = 0 to n – 1
The PSRLDQ instruction differs from the other three right-shift instructions because it operates on
bytes rather than bits. It right-shifts the 128-bit (double quadword) value in an XMM register by the
number of bytes specified in an immediate byte value. PSRLDQ can be used, for example, to move the
high 8 bytes of an XMM register to the low 8 bytes of the register. In some implementations, however,
PUNPCKHQDQ may be a better choice for this operation.
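The byte-granular behavior described above can be sketched with a small scalar model. This is an illustrative Python sketch, not processor code: the function name psrldq and the choice to represent the XMM register as a single 128-bit integer are assumptions made for the example.

```python
# Hypothetical scalar model of PSRLDQ. The 128-bit register is
# represented as one Python integer (an assumption of this sketch).
MASK128 = (1 << 128) - 1

def psrldq(value, byte_count):
    """Logically right-shift a 128-bit value by byte_count whole bytes.

    Vacated high-order bytes are filled with zeros. A count of 16 or
    more shifts every byte out, so the result is zero.
    """
    return (value & MASK128) >> (8 * byte_count)

# Move the high 8 bytes of the register into the low 8 bytes.
x = 0x0123456789ABCDEF_FEDCBA9876543210
low = psrldq(x, 8)
print(hex(low))
```

Shifting by 8 bytes leaves the original high quadword in the low quadword and zeros in the high quadword, which is the use case mentioned in the text.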
Right Arithmetic Shift.
• PSRAW—Packed Shift Right Arithmetic Words
• PSRAD—Packed Shift Right Arithmetic Doublewords
The PSRAx instructions right-shift each of the 16-bit (PSRAW) or 32-bit (PSRAD) values in the first
operand by the number of bits specified in the second operand. The instructions then write each shifted
value into the corresponding, same-sized element of the destination. The high-order bits that are
emptied by the shift operation are filled with the sign bit of the initial value.
In integer arithmetic, a right arithmetic shift effectively divides a signed operand by a positive power
of 2, rounding the result toward negative infinity. Thus, for a vector of n elements, the operation is:
operand1[i] = operand1[i] ÷ 2^operand2
where: i = 0 to n – 1
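The relationship between the arithmetic shift and division by a power of 2 can be checked with a scalar model. The following Python sketch is illustrative only: the function name psra, its width parameter, and the representation of the vector as a list of unsigned lane bit patterns are assumptions of the example, not part of the instruction definition.

```python
# Hypothetical scalar model of PSRAW (width=16) and PSRAD (width=32).
# Each lane is passed and returned as an unsigned bit pattern.
def psra(lanes, count, width=16):
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    result = []
    for lane in lanes:
        # Decode the two's-complement bit pattern into a signed integer.
        signed = (lane & mask) - ((lane & sign) << 1)
        # Python's >> on a negative integer replicates the sign bit,
        # so it behaves as an arithmetic shift. Counts of width or more
        # leave only copies of the sign bit (all 0s or all 1s).
        result.append((signed >> count) & mask)
    return result

# -7, 7, -8, and 1 as 16-bit patterns, shifted right by 1 bit.
words = [0xFFF9, 0x0007, 0xFFF8, 0x0001]
shifted = psra(words, 1)
print([hex(w) for w in shifted])
```

In this sketch, -7 shifted by 1 yields -4 (bit pattern FFFCh) rather than -3, because the shift rounds toward negative infinity instead of toward zero.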
4.5.7 Compare
The integer vector-compare instructions compare two operands and write either a mask or the
maximum or minimum value.
Compare and Write Mask.
• PCMPEQB—Packed Compare Equal Bytes
• PCMPEQW—Packed Compare Equal Words
• PCMPEQD—Packed Compare Equal Doublewords
• PCMPGTB—Packed Compare Greater Than Signed Bytes
• PCMPGTW—Packed Compare Greater Than Signed Words
• PCMPGTD—Packed Compare Greater Than Signed Doublewords
The PCMPEQx and PCMPGTx instructions compare corresponding bytes, words, or doublewords in
the two source operands. For each compare, the instructions then write a mask of all 1s (if the
comparison is true) or all 0s (if it is false) into the corresponding, same-sized element of the
destination. Figure 4-30 on page 154 shows a PCMPEQB compare operation. It performs 16 compares
in parallel.
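The mask-writing behavior can be sketched with a scalar model. The following Python code is illustrative only: the function names pcmpeq, pcmpgt, and to_signed, the width parameter, and the list-of-lanes representation are assumptions of the example. The greater-than model tests whether each element of the first operand is greater than the corresponding element of the second operand, treating both as signed values.

```python
# Hypothetical scalar models of PCMPEQx and PCMPGTx. Lanes are
# unsigned bit patterns; width selects bytes (8), words (16),
# or doublewords (32).
def to_signed(lane, width):
    """Interpret a width-bit pattern as a two's-complement integer."""
    lane &= (1 << width) - 1
    return lane - (1 << width) if lane >> (width - 1) else lane

def pcmpeq(op1, op2, width=8):
    """Write all 1s where the lanes are equal, all 0s elsewhere."""
    ones = (1 << width) - 1
    return [ones if (a & ones) == (b & ones) else 0
            for a, b in zip(op1, op2)]

def pcmpgt(op1, op2, width=8):
    """Write all 1s where op1 > op2 as signed values, else all 0s."""
    ones = (1 << width) - 1
    return [ones if to_signed(a, width) > to_signed(b, width) else 0
            for a, b in zip(op1, op2)]

a = [0x41, 0x00, 0xFF, 0x7F]
b = [0x41, 0x01, 0x01, 0x7F]
eq_mask = pcmpeq(a, b)
gt_mask = pcmpgt(b, a)
print([hex(m) for m in eq_mask], [hex(m) for m in gt_mask])
```

Note that the byte FFh is -1 when interpreted as a signed value, so in this sketch 01h compares greater than FFh.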