UNPACKING PACKED DATA IN MULTIPLE LANES
Abstract
Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.
27 Claims
1. A method comprising:
receiving an instruction, the instruction indicating a first operand and a second operand, each of the first and second operands having a plurality of packed data elements that correspond in respective positions, a first subset of the packed data elements of the first operand and a first subset of the packed data elements of the second operand each corresponding to a first lane, and a second subset of the packed data elements of the first operand and a second subset of the packed data elements of the second operand each corresponding to a second lane; and
storing a result in response to the instruction, the result including:
(1) in the first lane, only lowest order data elements from the first subset of the first operand interleaved with corresponding lowest order data elements from the first subset of the second operand; and
(2) in the second lane, only highest order data elements from the second subset of the first operand interleaved with corresponding highest order data elements from the second subset of the second operand.
Dependent claims: 2, 3, 4, 5, 6, 7
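The operation recited in claim 1 can be illustrated with a small software model. This is a minimal sketch, not the patented implementation: it assumes two lanes of equal size, represents each operand as a Python list ordered from lowest to highest order element, and uses the hypothetical function name `unpack_low_then_high` and hypothetical element labels `A0`..`A7` and `B0`..`B7`.

```python
def unpack_low_then_high(src1, src2):
    """Model of the claimed result for two equal lanes (illustrative only).

    Lane 0 receives the lowest order half of each operand's lane-0 elements,
    interleaved. Lane 1 receives the highest order half of each operand's
    lane-1 elements, interleaved. Index 0 is the lowest order element.
    """
    n = len(src1)
    if len(src2) != n or n % 4 != 0:
        raise ValueError("operands must have equal length divisible by 4")
    per_lane = n // 2      # elements in each lane
    half = per_lane // 2   # elements taken from each operand per lane

    result = []
    # First lane: lowest order elements of the first subset of each operand.
    for i in range(0, half):
        result += [src1[i], src2[i]]
    # Second lane: highest order elements of the second subset of each operand.
    for i in range(per_lane + half, n):
        result += [src1[i], src2[i]]
    return result


src1 = [f"A{i}" for i in range(8)]  # hypothetical labels, A0 is lowest order
src2 = [f"B{i}" for i in range(8)]
result = unpack_low_then_high(src1, src2)
print(result)
# ['A0', 'B0', 'A1', 'B1', 'A6', 'B6', 'A7', 'B7']
```

Note that the result has the same number of elements as each operand: half of each source lane is discarded, and the two lanes draw from opposite ends of their respective subsets.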
8. An apparatus comprising:
an execution unit that is operable as a result of an instruction to store a result, in which the instruction has a first field to indicate a first operand and a second field to indicate a second operand, each of the first and second operands to have a plurality of packed data elements that are to correspond in respective positions, in which a first subset of the packed data elements of the first operand and a first subset of the packed data elements of the second operand are each to correspond to a first lane, and in which a second subset of the packed data elements of the first operand and a second subset of the packed data elements of the second operand are each to correspond to a second lane, in which the result that is to be stored is to include:
(1) in the first lane, only lowest order data elements from the first subset of the first operand interleaved with corresponding lowest order data elements from the first subset of the second operand; and
(2) in the second lane, only highest order data elements from the second subset of the first operand interleaved with corresponding highest order data elements from the second subset of the second operand.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
20. An apparatus comprising:
a first 256-bit register to store a first source operand including a first data element represented by bits 0 through 31, a second data element represented by bits 32 through 63, a third data element represented by bits 64 through 95, a fourth data element represented by bits 96 through 127, a fifth data element represented by bits 128 through 159, a sixth data element represented by bits 160 through 191, a seventh data element represented by bits 192 through 223, and an eighth data element represented by bits 224 through 255;
a second 256-bit register to store a second source operand including a ninth data element represented by bits 0 through 31, a tenth data element represented by bits 32 through 63, an eleventh data element represented by bits 64 through 95, a twelfth data element represented by bits 96 through 127, a thirteenth data element represented by bits 128 through 159, a fourteenth data element represented by bits 160 through 191, a fifteenth data element represented by bits 192 through 223, and a sixteenth data element represented by bits 224 through 255;
an execution unit as a result of an instruction to store a result, the result that is to be stored to include the first data element to be stored to bits 0 through 31 of a destination register, the ninth data element to be stored to bits 32 through 63 of the destination register, the second data element to be stored to bits 64 through 95 of the destination register, the tenth data element to be stored to bits 96 through 127 of the destination register, the seventh data element to be stored to bits 128 through 159 of the destination register, the fifteenth data element to be stored to bits 160 through 191 of the destination register, the eighth data element to be stored to bits 192 through 223 of the destination register, and the sixteenth data element to be stored to bits 224 through 255 of the destination register.
Dependent claims: 21
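The bit ranges in claim 20 can be checked with a bit-level model. The sketch below is only illustrative: it assumes each 256-bit register is modeled as a Python integer, and the helper names `get_elem`, `pack`, and `unpack_claim20` are hypothetical. Element values 1 through 16 are chosen so that each value matches the claim's ordinal name (the "first" data element holds 1, the "ninth" holds 9, and so on).

```python
ELEM_BITS = 32
MASK = (1 << ELEM_BITS) - 1


def get_elem(reg, k):
    """Return 32-bit element k of a register (k = 0 is bits 0 through 31)."""
    return (reg >> (k * ELEM_BITS)) & MASK


def pack(elems):
    """Pack eight 32-bit values into one 256-bit integer, lowest element first."""
    reg = 0
    for k, value in enumerate(elems):
        reg |= (value & MASK) << (k * ELEM_BITS)
    return reg


def unpack_claim20(src1, src2):
    """Build the destination register described in claim 20 (model only)."""
    # Source element index feeding each destination slot. Even destination
    # slots read from src1, odd destination slots read from src2.
    picks = [0, 0, 1, 1, 6, 6, 7, 7]
    dest = 0
    for slot, idx in enumerate(picks):
        src = src1 if slot % 2 == 0 else src2
        dest |= get_elem(src, idx) << (slot * ELEM_BITS)
    return dest


src1 = pack(range(1, 9))    # first through eighth data elements
src2 = pack(range(9, 17))   # ninth through sixteenth data elements
dest = unpack_claim20(src1, src2)
dest_elems = [get_elem(dest, k) for k in range(8)]
print(dest_elems)
# [1, 9, 2, 10, 7, 15, 8, 16]
```

Reading the output left to right gives destination bits 0 through 31, 32 through 63, and so on up to bits 224 through 255, in the same order the claim lists them.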
22. An apparatus comprising:
a first 256-bit register to store a first source operand including a first data element represented by bits 0 through 31, a second data element represented by bits 32 through 63, a third data element represented by bits 64 through 95, a fourth data element represented by bits 96 through 127, a fifth data element represented by bits 128 through 159, a sixth data element represented by bits 160 through 191, a seventh data element represented by bits 192 through 223, and an eighth data element represented by bits 224 through 255;
a second 256-bit register to store a second source operand including a ninth data element represented by bits 0 through 31, a tenth data element represented by bits 32 through 63, an eleventh data element represented by bits 64 through 95, a twelfth data element represented by bits 96 through 127, a thirteenth data element represented by bits 128 through 159, a fourteenth data element represented by bits 160 through 191, a fifteenth data element represented by bits 192 through 223, and a sixteenth data element represented by bits 224 through 255;
an execution unit as a result of an instruction to store a result, the result that is to be stored to include the third data element to be stored to bits 0 through 31 of a destination register, the eleventh data element to be stored to bits 32 through 63 of the destination register, the fourth data element to be stored to bits 64 through 95 of the destination register, the twelfth data element to be stored to bits 96 through 127 of the destination register, the fifth data element to be stored to bits 128 through 159 of the destination register, the thirteenth data element to be stored to bits 160 through 191 of the destination register, the sixth data element to be stored to bits 192 through 223 of the destination register, and the fourteenth data element to be stored to bits 224 through 255 of the destination register.
Dependent claims: 23
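Claim 22 recites the complementary arrangement to claim 20: the lower lane takes the high-order half of its source elements and the upper lane takes the low-order half. One way to see the relationship is a model parameterized by a per-lane choice. This is a sketch under assumptions: operands are lists ordered from lowest to highest order element, and the function name `unpack_select` and its `lane_halves` parameter are hypothetical, not taken from the patent.

```python
def unpack_select(src1, src2, lane_halves):
    """Interleave one half of each lane, chosen per lane (illustrative model).

    lane_halves holds one entry per lane, either "low" or "high", naming
    which half of that lane's source elements is interleaved into the result.
    """
    lanes = len(lane_halves)
    if len(src1) != len(src2) or len(src1) % (2 * lanes) != 0:
        raise ValueError("operand length must match and divide evenly into lanes")
    per_lane = len(src1) // lanes
    half = per_lane // 2

    out = []
    for lane, which in enumerate(lane_halves):
        if which not in ("low", "high"):
            raise ValueError("lane_halves entries must be 'low' or 'high'")
        start = lane * per_lane + (half if which == "high" else 0)
        for i in range(start, start + half):
            out += [src1[i], src2[i]]
    return out


# Values match the ordinal names used in the claims (first = 1, ninth = 9).
a = list(range(1, 9))
b = list(range(9, 17))

claim20_order = unpack_select(a, b, ["low", "high"])
claim22_order = unpack_select(a, b, ["high", "low"])
print(claim20_order)  # [1, 9, 2, 10, 7, 15, 8, 16]
print(claim22_order)  # [3, 11, 4, 12, 5, 13, 6, 14]
```

Between them, the two configurations cover every source element exactly once, which is why the two arrangements read as a complementary pair.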
24. A system comprising:
an interconnect;
a processor coupled with the interconnect, the processor including:
at least one of an instruction decoder, an instruction translator, and an instruction emulator, the at least one implemented in hardware, software, firmware, or a combination thereof, to receive an instruction, the instruction having a first field to indicate a first operand and a second field to indicate a second operand, each of the first and second operands to have a plurality of packed data elements that are to correspond in respective positions, in which a first subset of the packed data elements of the first operand and a first subset of the packed data elements of the second operand are each to correspond to a first lane, and in which a second subset of the packed data elements of the first operand and a second subset of the packed data elements of the second operand are each to correspond to a second lane; and
a circuit responsive to said at least one receiving the instruction to store a result, the result to include:
(1) in the first lane, only lowest order data elements from the first subset of the first operand interleaved with corresponding lowest order data elements from the first subset of the second operand; and
(2) in the second lane, only highest order data elements from the second subset of the first operand interleaved with corresponding highest order data elements from the second subset of the second operand; and
a dynamic random access memory (DRAM) coupled with the interconnect.
Dependent claims: 25
26. An article of manufacture comprising:
a machine-readable medium to provide an instruction, the instruction including a first field to indicate a first operand and a second field to indicate a second operand, each of the first and second operands to have a plurality of packed data elements that are to correspond in respective positions, a first subset of the packed data elements of the first operand and a first subset of the packed data elements of the second operand to each correspond to a first lane, and a second subset of the packed data elements of the first operand and a second subset of the packed data elements of the second operand to each correspond to a second lane, and the instruction if processed by a machine to cause the machine to perform operations comprising storing a result, the result including:
(1) in the first lane, only lowest order data elements from the first subset of the first operand interleaved with corresponding lowest order data elements from the first subset of the second operand; and
(2) in the second lane, only highest order data elements from the second subset of the first operand interleaved with corresponding highest order data elements from the second subset of the second operand.
Dependent claims: 27
Specification