Single Precision Vector Permute Immediate with "Word" Vector Write Mask
Abstract
The present invention is generally related to the field of image processing, and more specifically to an instruction set for processing images. Vector processing may involve performing a plurality of permute operations to arrange vector operands in desired locations of a register before performing a vector operation, for example, a cross product. The permute instructions may depend on one another and may require the use of temporary registers. Embodiments of the invention provide a permute instruction in which a mask field may be used to specify the particular locations of a target register to which data is transferred, thereby reducing the number of instructions needed to arrange data, reducing dependencies between instructions, and reducing the usage of temporary registers.
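The masked permute described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented encoding: the four-word register width, the per-word `select` indices, and the names `permute_masked`, `select`, and `write_mask` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence

WORDS = 4  # assumption: a vector register holds four words (x, y, z, w)


def permute_masked(
    target: Sequence[float],
    source: Sequence[float],
    select: Sequence[int],
    write_mask: Sequence[bool],
) -> List[float]:
    """Model of a permute with a per-word write mask (hypothetical helper).

    For each word position i of the target:
      - if write_mask[i] is set, the word is replaced by source[select[i]];
      - otherwise the existing target word is left unchanged.
    """
    operands = (target, source, select, write_mask)
    if any(len(op) != WORDS for op in operands):
        raise ValueError(f"every operand must have exactly {WORDS} entries")
    if any(not 0 <= s < WORDS for s in select):
        raise ValueError("select indices must address a word of the source")
    return [
        source[select[i]] if write_mask[i] else target[i]
        for i in range(WORDS)
    ]


# Assemble (a.y, a.z, b.x, b.w) in one target register from two sources.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
t = [0.0, 0.0, 0.0, 0.0]

# First permute writes only words 0 and 1 of the target.
t = permute_masked(t, a, select=[1, 2, 0, 0],
                   write_mask=[True, True, False, False])
# Second permute writes only words 2 and 3; words 0 and 1 are preserved.
t = permute_masked(t, b, select=[0, 0, 0, 3],
                   write_mask=[False, False, True, True])

print(t)  # [2.0, 3.0, 10.0, 40.0]
```

In this model the two permutes write disjoint words of the target, so neither needs a temporary register and the result does not depend on the order in which they run.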
18 Claims
1. A method for storing data in a target register, comprising:
- receiving a permute instruction specifying at least one source register, the target register, and a write mask, wherein the write mask identifies one or more locations of the target register for writing data; and
- in response to receiving the permute instruction, transferring data from at least one location of the at least one source register to the one or more locations of the target register identified by the write mask.
Dependent claims: 2, 3, 4, 5, 6
7. A method for assembling data in a target register, comprising:
- generating a plurality of permute instructions, each permute instruction specifying at least one source register and the target register;
- setting a mask field in each of the plurality of permute instructions, wherein the mask field identifies one or more locations of the target register for receiving data from the source register; and
- executing the permute instructions to assemble data from the source registers to the locations in the target register identified by the mask field.
Dependent claims: 8, 9, 10, 11, 12, 14, 15, 16, 17, 18
13. A system, comprising a plurality of processors communicably coupled with one another, wherein each processor comprises:
- a register file comprising a plurality of registers; and
- at least one vector unit, wherein the vector unit is configured to receive a permute instruction specifying at least one source register, a target register, and a write mask, the write mask identifying one or more locations in the target register, and execute the permute instruction by transferring data from at least one location of the at least one source register to the one or more locations of the target register identified by the write mask.
Specification