User Guide
142 128-Bit Media and Scientific Programming
AMD64 Technology 24592—Rev. 3.15—November 2009
into interleaved words, interleaved doublewords, and interleaved quadwords in the destination
operand.
The PUNPCKLBW, PUNPCKLWD, PUNPCKLDQ, and PUNPCKLQDQ instructions are analogous
to their high-element counterparts except that they take elements from the low quadword of each
source vector and ignore elements in the high quadword. Depending on the hardware implementation,
if the source operand for PUNPCKLx and PUNPCKHx instructions is in memory, only the low 64 bits
of the operand may be loaded.
Figure 4-21 shows an example of the PUNPCKLWD instruction. The elements are taken from the low
half of the source operands. In this register image, elements from operand 2 are placed to the left of
elements from operand 1.
Figure 4-21. PUNPCKLWD Unpack and Interleave Operation
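The interleave described above can be sketched in C. This is a minimal illustration, assuming a compiler that provides the SSE2 intrinsics header <emmintrin.h>; the function name interleave_low_words is hypothetical, and the _mm_unpacklo_epi16 intrinsic is the usual C-level counterpart of PUNPCKLWD, although the compiler is free to choose the actual encoding.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (assumed available) */
#include <stdint.h>

/* Hypothetical helper: interleave the four low words of op1 and op2.
   In increasing element order the result is
   op1[0], op2[0], op1[1], op2[1], op1[2], op2[2], op1[3], op2[3].
   The high four words of both inputs are ignored. */
void interleave_low_words(const uint16_t op1[8], const uint16_t op2[8],
                          uint16_t out[8])
{
    __m128i a = _mm_loadu_si128((const __m128i *)op1);
    __m128i b = _mm_loadu_si128((const __m128i *)op2);
    __m128i r = _mm_unpacklo_epi16(a, b);   /* PUNPCKLWD semantics */
    _mm_storeu_si128((__m128i *)out, r);
}
```

Within each pair, the element from operand 2 occupies the higher-numbered position, which corresponds to the left side in the register image.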
If operand 2 is a vector consisting of all zero-valued elements, the unpack instructions perform the
function of expanding vector elements of 1x size into vector elements of 2x size. Conversion from
lower-to-higher precision is often needed, for example, prior to multiplication operations in which the
higher-precision format is used for source operands in order to prevent possible overflow during
multiplication.
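As a sketch of this widening step, the following C fragment zero-extends unsigned bytes to words before multiplying them. It assumes the SSE2 intrinsics header <emmintrin.h> is available; the function name mul_low_bytes is hypothetical, and _mm_unpacklo_epi8 is used here as the C-level counterpart of PUNPCKLBW.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (assumed available) */
#include <stdint.h>

/* Hypothetical helper: multiply the eight low unsigned bytes of a and b.
   Each byte is first widened to a 16-bit word by interleaving it with a
   zero byte, so the largest product, 255 * 255 = 65025, still fits in
   an unsigned 16-bit result element. */
void mul_low_bytes(const uint8_t a[16], const uint8_t b[16],
                   uint16_t prod[8])
{
    __m128i zero = _mm_setzero_si128();
    __m128i va   = _mm_loadu_si128((const __m128i *)a);
    __m128i vb   = _mm_loadu_si128((const __m128i *)b);
    __m128i wa   = _mm_unpacklo_epi8(va, zero);  /* bytes -> words */
    __m128i wb   = _mm_unpacklo_epi8(vb, zero);
    __m128i p    = _mm_mullo_epi16(wa, wb);      /* low 16 bits of product */
    _mm_storeu_si128((__m128i *)prod, p);
}
```

Interleaving with zeros performs zero-extension, so this sketch is appropriate for unsigned elements only; signed elements would need sign-extension instead.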
If both source operands are of identical value, the unpack instructions can perform the function of
duplicating adjacent elements in a vector.
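The duplication case can be sketched the same way, by passing one vector as both source operands. This assumes the SSE2 intrinsics header <emmintrin.h>; the function name duplicate_low_bytes is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (assumed available) */
#include <stdint.h>

/* Hypothetical helper: duplicate each of the eight low bytes of src,
   producing src[0], src[0], src[1], src[1], ..., src[7], src[7].
   Both source operands of the unpack are the same vector. */
void duplicate_low_bytes(const uint8_t src[16], uint8_t dst[16])
{
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    __m128i d = _mm_unpacklo_epi8(v, v);   /* PUNPCKLBW with op1 == op2 */
    _mm_storeu_si128((__m128i *)dst, d);
}
```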
The PUNPCKx instructions can be used in a repeating sequence to transpose rows and columns of an
array. For example, such a sequence could begin with PUNPCKxWD and be followed by
PUNPCKxDQ. These instructions can also be used to convert pixel representation from RGB format
to color-plane format, or to interleave interpolation elements into a vector.
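A transposition sequence of this kind might look like the following sketch for a 4x4 array of 16-bit words stored in row-major order. It assumes the SSE2 intrinsics header <emmintrin.h>; the function name transpose_4x4_words and the flat array layout are illustrative choices, not taken from this manual. A word-level unpack is followed by a doubleword-level unpack.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (assumed available) */
#include <stdint.h>

/* Hypothetical helper: transpose a 4x4 matrix of 16-bit words.
   in and out are row-major, so element (r, c) is at index 4 * r + c. */
void transpose_4x4_words(const uint16_t in[16], uint16_t out[16])
{
    /* Load each 4-word row into the low quadword of a register. */
    __m128i r0 = _mm_loadl_epi64((const __m128i *)(in + 0));
    __m128i r1 = _mm_loadl_epi64((const __m128i *)(in + 4));
    __m128i r2 = _mm_loadl_epi64((const __m128i *)(in + 8));
    __m128i r3 = _mm_loadl_epi64((const __m128i *)(in + 12));

    /* Step 1, word interleave (PUNPCKLWD semantics):
       t0 = r0[0] r1[0] r0[1] r1[1] r0[2] r1[2] r0[3] r1[3]
       t1 = r2[0] r3[0] r2[1] r3[1] r2[2] r3[2] r2[3] r3[3] */
    __m128i t0 = _mm_unpacklo_epi16(r0, r1);
    __m128i t1 = _mm_unpacklo_epi16(r2, r3);

    /* Step 2, doubleword interleave (PUNPCKLDQ / PUNPCKHDQ semantics):
       c01 holds columns 0 and 1, c23 holds columns 2 and 3. */
    __m128i c01 = _mm_unpacklo_epi32(t0, t1);
    __m128i c23 = _mm_unpackhi_epi32(t0, t1);

    _mm_storeu_si128((__m128i *)(out + 0), c01);  /* output rows 0, 1 */
    _mm_storeu_si128((__m128i *)(out + 8), c23);  /* output rows 2, 3 */
}
```

Larger arrays, such as an 8x8 array of words, would need an additional quadword-level unpack step after the two steps shown here.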
As noted above, and depending on the hardware implementation, the width of the memory access
performed by the memory-operand forms of PUNPCKLBW, PUNPCKLWD, PUNPCKLDQ, and