User Guide
198 64-Bit Media Programming
AMD64 Technology 24592—Rev. 3.15—November 2009
5.3.5 Branch Removal
Branching is a time-consuming operation that, unlike most 64-bit media vector operations, does not
exhibit parallel behavior (there is only one branch target, not multiple targets, per branch instruction).
In many media applications, a branch involves selecting between only a few (often only two) cases.
Such branches can be replaced with 64-bit media vector compare and vector logical instructions that
simulate predicated execution or conditional moves.
Figure 5-5 shows an example of a non-branching sequence that implements a two-way multiplexer,
one that is equivalent to the ternary operator “?:” in C and C++. The comparable code sequence is
explained in “Compare and Write Mask” on page 220.
The sequence in Figure 5-5 begins with a vector compare instruction that compares the elements of
two source operands in parallel and produces a mask vector in which each element is either all 1s or
all 0s. This mask vector is ANDed with one source operand, and its complement is ANDed with the
other source operand (an AND-NOT operation), to isolate the desired elements of each operand. The
two results are then ORed to merge the selected elements into a single result. A similar
branch-removal operation can be done using floating-point source operands.
Figure 5-5. Branch-Removal Sequence

    operand 1 (bits 63-0):            a3    a2    a1    a0
    operand 2 (bits 63-0):            b3    b2    b1    b0
    Compare (mask):                   FFFF  0000  0000  FFFF
    And (mask, operand 1):            a3    0000  0000  a0
    And-Not (mask, operand 2):        0000  b2    b1    0000
    Or (result):                      a3    b2    b1    a0
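
The compare, AND, AND-NOT, and OR steps of this sequence can be sketched in portable C. This is an illustrative emulation only, not code from this manual: a uint64_t stands in for a 64-bit media register holding four 16-bit elements, and the function names (cmpgt_4x16, select_4x16, max_4x16) are hypothetical. The comments name the MMX instructions that each step is assumed to correspond to (PCMPGTW, PAND, PANDN, POR). The choice of a signed greater-than compare is also an assumption; any compare that produces an all-1s or all-0s element mask works the same way.

```c
#include <stdint.h>

/* Emulates a packed signed compare (in the style of PCMPGTW):
   each 16-bit lane of the result is 0xFFFF where a > b, else 0x0000. */
uint64_t cmpgt_4x16(uint64_t a, uint64_t b)
{
    uint64_t mask = 0;
    for (int lane = 0; lane < 4; lane++) {
        int shift = 16 * lane;
        int16_t ea = (int16_t)(a >> shift);   /* lane of a, viewed as signed */
        int16_t eb = (int16_t)(b >> shift);   /* lane of b, viewed as signed */
        /* -(ea > eb) is 0 or -1; truncating to 16 bits gives 0x0000 or 0xFFFF. */
        mask |= (uint64_t)(uint16_t)-(ea > eb) << shift;
    }
    return mask;
}

/* Two-way multiplexer: per lane, picks a where the mask is all 1s,
   and b where the mask is all 0s. No branch depends on the data. */
uint64_t select_4x16(uint64_t mask, uint64_t a, uint64_t b)
{
    uint64_t from_a = mask & a;     /* AND     (PAND)  */
    uint64_t from_b = ~mask & b;    /* AND-NOT (PANDN) */
    return from_a | from_b;         /* OR      (POR)   */
}

/* Per-lane equivalent of (a > b) ? a : b, built from the steps above. */
uint64_t max_4x16(uint64_t a, uint64_t b)
{
    return select_4x16(cmpgt_4x16(a, b), a, b);
}
```

With a = 0x0009000200030008 and b = 0x0001000700060004, the compare yields the mask FFFF 0000 0000 FFFF, the same pattern as in Figure 5-5, and max_4x16 returns 0x0009000700060008: elements 3 and 0 come from the first operand, elements 2 and 1 from the second.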