User Guide

128-Bit Media and Scientific Programming 115
24592—Rev. 3.15—November 2009 AMD64 Technology
The sequence in Figure 4-10 begins with a vector compare instruction that compares the elements of
two source operands in parallel and produces a mask vector containing elements of all 1s or 0s. This
mask vector is ANDed with one source operand and ANDed-Not with the other source operand to
isolate the desired elements of both operands. These results are then ORed to select the relevant
elements from each operand. A similar branch-removal operation can be done using floating-point
source operands.
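As a rough illustration, the compare, AND, AND-NOT, and OR steps described above can be sketched in C with SSE2 intrinsics. This is a minimal sketch rather than the manual's own code: the function name select_greater_epi16, the choice of a signed greater-than compare (PCMPGTW), and the 16-bit element width are assumptions made for the example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: out[i] = (a[i] > b[i]) ? a[i] : b[i]
 * for eight signed 16-bit elements, computed without branches. */
void select_greater_epi16(const int16_t a[8], const int16_t b[8],
                          int16_t out[8])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* Compare and write mask (PCMPGTW):
     * 0xFFFF where a > b, 0x0000 elsewhere. */
    __m128i mask = _mm_cmpgt_epi16(va, vb);

    /* AND (PAND): keep elements of a where the mask is all 1s. */
    __m128i from_a = _mm_and_si128(mask, va);

    /* AND-NOT (PANDN): computes (~mask) & vb, keeping elements
     * of b where the mask is all 0s. */
    __m128i from_b = _mm_andnot_si128(mask, vb);

    /* OR (POR): merge the two partial results. */
    __m128i result = _mm_or_si128(from_a, from_b);

    _mm_storeu_si128((__m128i *)out, result);
}
```

With this particular compare the result is an element-wise signed maximum. Substituting a different compare instruction changes the selection rule while the AND, AND-NOT, and OR steps stay the same.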
Figure 4-10. Branch-Removal Sequence
The min/max compare instructions, for example, are useful for clamping, such as color clamping in 3D
graphics, without the need for branching. Figure 4-11 on page 116 illustrates a move-mask instruction
(PMOVMSKB) that copies sign bits to a general-purpose register (GPR). The instruction can extract
bits from mask patterns, zero values from quantized data, or sign bits, producing a bit mask (16 bits
for a 128-bit source operand) that can be used for data-dependent branching.
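The two ideas in this paragraph can be sketched in C with SSE2 intrinsics. The helper names clamp_epi16 and zero_byte_mask, the 16-bit element width for clamping, and the zero-byte test are assumptions chosen for illustration, not code from this manual.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: clamp eight signed 16-bit elements to
 * [lo, hi] in place, with no branches. */
void clamp_epi16(int16_t v[8], int16_t lo, int16_t hi)
{
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    x = _mm_max_epi16(x, _mm_set1_epi16(lo));  /* PMAXSW: raise to lo */
    x = _mm_min_epi16(x, _mm_set1_epi16(hi));  /* PMINSW: lower to hi */
    _mm_storeu_si128((__m128i *)v, x);
}

/* Hypothetical helper: return a 16-bit mask in which bit i is set
 * when bytes[i] is zero. */
int zero_byte_mask(const uint8_t bytes[16])
{
    __m128i x = _mm_loadu_si128((const __m128i *)bytes);

    /* PCMPEQB: 0xFF in each byte position that equals zero. */
    __m128i eq = _mm_cmpeq_epi8(x, _mm_setzero_si128());

    /* PMOVMSKB: copy the sign bit of each of the 16 bytes
     * into the low 16 bits of a general-purpose register. */
    return _mm_movemask_epi8(eq);
}
```

A caller might test the returned mask once, for example `if (zero_byte_mask(buf) != 0)`, so that a single data-dependent branch covers all 16 bytes.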
[Figure 4-10 content: each 128-bit operand holds eight 16-bit elements, shown from bit 127 down to bit 0]

operand 1                     a7   a6   a5   a4   a3   a2   a1   a0
operand 2                     b7   b6   b5   b4   b3   b2   b1   b0
Compare and Write Mask        FFFF 0000 0000 FFFF FFFF 0000 0000 FFFF
And (mask, operand 1)         a7   0000 0000 a4   a3   0000 0000 a0
And-Not (mask, operand 2)     0000 b6   b5   0000 0000 b2   b1   0000
Or                            a7   b6   b5   a4   a3   b2   b1   a0