Method and apparatus for manipulating vectored data
Abstract
A method and apparatus are disclosed for manipulating vectored data. The method includes shifting bits of packed data comprising M N-bit elements using a bit-level shift step followed by a byte-level shift step. A mask is generated and applied to the intermediate shifted result to produce the final result. A method is also disclosed for conditionally transferring data from one general purpose register to another based on data in a third general purpose register.
107 Citations
23 Claims
1. In a RISC-based computer processing core having a general purpose register file, a method of shifting packed data of M N-bit elements comprising steps of:
receiving an instruction, the instruction specifying a shift amount and a shift direction; and
decoding the instruction to produce control signals;
in response to the control signals;
selecting one of the general purpose registers;
bit-level shifting the data contained therein by a first amount based on the shift amount and by at most seven bit positions to produce a bit-shifted datum;
re-ordering the bits of the bit-shifted datum to produce an intermediate result representative of a byte-level shifting of the bit-shifted datum by a second amount based on the shift amount;
producing a mask based on the shift amount; and
for each bit in the intermediate result, either producing the bit or producing a preselected bit value based on the mask to form a final result.
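The steps of claim 1 can be sketched in software as follows. This is only an illustrative model, not the patented circuit: the 64-bit register width, the function names `byte_shift` and `packed_shift_logical`, and the choice of 0 as the preselected bit value (a logical shift) are assumptions made for the example.

```python
REG_BITS = 64  # assumed register width; the claim does not fix one


def byte_shift(value, nbytes, left):
    """Re-order the bytes of a register so the result looks shifted by whole bytes."""
    src = value.to_bytes(REG_BITS // 8, "little")
    out = bytearray(len(src))
    for i in range(len(src)):
        j = i - nbytes if left else i + nbytes  # source byte feeding output byte i
        if 0 <= j < len(src):
            out[i] = src[j]
    return int.from_bytes(out, "little")


def packed_shift_logical(value, n, amount, left):
    """Shift each n-bit element of `value` by `amount` bits, filling with zeros."""
    assert REG_BITS % n == 0 and n % 8 == 0 and 0 <= amount < n
    full = (1 << REG_BITS) - 1
    bit_amt, byte_amt = amount % 8, amount // 8

    # Bit-level shift of the whole register by at most seven positions.
    shifted = (value << bit_amt) & full if left else (value & full) >> bit_amt
    # Byte-level shift expressed as a re-ordering of bytes.
    intermediate = byte_shift(shifted, byte_amt, left)

    # Mask: 1 where a bit still belongs to its own element, 0 where it crossed over.
    elem = (1 << n) - 1
    elem_mask = (elem << amount) & elem if left else elem >> amount
    mask = 0
    for k in range(REG_BITS // n):
        mask |= elem_mask << (k * n)

    # Per bit: keep the shifted bit, or substitute the preselected value (0 here).
    return intermediate & mask
```

For instance, `packed_shift_logical(0x123456789ABCDEF0, 16, 12, False)` splits the 12-bit shift into a 4-bit shift plus a 1-byte shift and yields `0x000100050009000D`. The mask is what keeps bits from one 16-bit element from leaking into its neighbor.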
9. A processing core having circuitry for shifting bits of packed N-bit data by a shift amount, the circuitry comprising:
an input for providing a packed N-bit datum, comprising M N-bit elements;
shift circuitry having an input to receive the packed datum and effective for shifting the M×N bits of the packed datum by the shift amount to produce a shifted output, the shift circuitry comprising a bit shifter for shifting the packed datum by an amount up to seven bit positions to produce a first result and a matrix operable to re-order at least some of the bits in the first result in any order to produce the shifted output;
mask generation logic to produce a mask, the bit-pattern of the mask based on the value of N and the shift amount;
an alternate logic value generator for producing M×N bit values; and
selector logic effective for producing, for each bit in the shifted output, either that bit or one of the M×N bit values based on the mask bits.
17. A processing core comprising:
bit shifting logic in data communication with the register file, the register file effective for providing the contents of two or three registers;
byte shifting logic in data communication with a first output of the bit shifting logic;
a plurality of 2:1 selectors, each having a first input coupled to an output of the byte shifting logic;
a sign generator in data communication with a second output of the bit shifting logic, the sign generator having an output coupled to a second input of the selector; and
a mask generator having outputs coupled to the select inputs of the selectors, the byte-shifting logic configured to receive the first output and to output at least some of the bits comprising the first output in any order, to produce the output of the byte-shifting logic.
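The roles of the sign generator, mask generator, and 2:1 selectors in claim 17 can be illustrated with an arithmetic right shift of packed data. The sketch below is a hypothetical software model: the function name `packed_shift_right_arith`, the 64-bit default width, and the use of a single whole-register shift as a stand-in for the bit and byte shifting logic are all assumptions made for the example.

```python
def packed_shift_right_arith(value, n, amount, reg_bits=64):
    """Arithmetic right shift of each n-bit element, modelled as shifter + selectors."""
    assert reg_bits % n == 0 and 0 <= amount < n
    full = (1 << reg_bits) - 1
    elem = (1 << n) - 1

    # Stand-in for the bit shifting and byte shifting logic: one whole-register shift.
    shifted = (value & full) >> amount

    mask = 0  # mask generator output: 1 selects the shifter, 0 selects the sign fill
    fill = 0  # sign generator output: each element's sign bit replicated across it
    for k in range(reg_bits // n):
        lane = (value >> (k * n)) & elem
        mask |= (elem >> amount) << (k * n)
        if lane >> (n - 1):
            fill |= elem << (k * n)

    # One 2:1 selector per bit position.
    return (shifted & mask) | (fill & ~mask & full)
```

With 16-bit elements and a shift of 2, `packed_shift_right_arith(0x80007FFFFFFE0004, 16, 2)` gives `0xE0001FFFFFFF0001`: negative elements are filled with ones from the sign generator, positive elements with zeros.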
wherein MMSHLLD.W is a logical left shift of a packed 16-bit datum, wherein MMSHLRD.W is a logical right shift of a packed 16-bit datum, wherein MMSHLLD.L is a logical left shift of a packed 32-bit datum, wherein MMSHLRD.L is a logical right shift of a packed 32-bit datum, wherein MMSHARD.W is an arithmetic right shift of a packed 16-bit datum, and wherein MMSHARD.L is an arithmetic right shift of a packed 32-bit datum.
20. The processing core of claim 18 wherein the byte shifting logic is further responsive to the decoding of one of the MCNVS.WB, MCNVS.WUB, and MCNVS.LW instructions,
wherein MCNVS.WB is a conversion of signed 16-bit data to signed 8-bit values, wherein MCNVS.WUB is a conversion of signed 16-bit data to unsigned 8-bit values, and wherein MCNVS.LW is a conversion of 32-bit data to signed 16-bit values.
21. The processing core of claim 20 wherein the byte shifting logic is further responsive to the decoding of one of the MSHFHI.B, MSHFLO.B, MSHFHI.W, MSHFLO.W, MSHFHI.L, MSHFLO.L instructions,
wherein MSHFHI.B is an interleave of 8-bit data stored in an upper portion of a first operand associated with the instruction with 8-bit data stored in an upper portion of a second operand associated with the instruction, wherein MSHFLO.B is an interleave of 8-bit data stored in a lower portion of a first operand associated with the instruction with 8-bit data stored in a lower portion of a second operand associated with the instruction, wherein MSHFHI.W is an interleave of 16-bit data stored in an upper portion of a first operand associated with the instruction with 16-bit data stored in an upper portion of a second operand associated with the instruction, wherein MSHFLO.W is an interleave of 16-bit data stored in a lower portion of a first operand associated with the instruction with 16-bit data stored in a lower portion of a second operand associated with the instruction, wherein MSHFHI.L is an interleave of 32-bit data stored in an upper portion of a first operand associated with the instruction with 32-bit data stored in an upper portion of a second operand associated with the instruction, and wherein MSHFLO.L is an interleave of 32-bit data stored in a lower portion of a first operand associated with the instruction with 32-bit data stored in a lower portion of a second operand associated with the instruction.
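The interleave operations recited above can be sketched with one generic helper. This is an illustrative model only: the helper names (`lanes`, `pack`, `interleave`), the 64-bit operand width, and the ordering in which the first operand's element lands in the lower position of each pair are assumptions, since the claim does not state them.

```python
def lanes(value, n, reg_bits=64):
    """Split a register into n-bit elements, least significant element first."""
    return [(value >> (i * n)) & ((1 << n) - 1) for i in range(reg_bits // n)]


def pack(elems, n):
    """Inverse of lanes(): element 0 becomes the least significant n bits."""
    out = 0
    for i, e in enumerate(elems):
        out |= e << (i * n)
    return out


def interleave(first, second, n, upper, reg_bits=64):
    """Interleave the upper or lower half of two operands, element by element."""
    half = reg_bits // n // 2
    a, b = lanes(first, n, reg_bits), lanes(second, n, reg_bits)
    a, b = (a[half:], b[half:]) if upper else (a[:half], b[:half])
    mixed = []
    for x, y in zip(a, b):
        mixed += [x, y]  # assumed ordering: first operand's element goes lower
    return pack(mixed, n)
```

Under these assumptions, `interleave(a, b, 8, False)` models MSHFLO.B and `interleave(a, b, 16, True)` models MSHFHI.W. For example, with `a = 0x0706050403020100` and `b = 0x1716151413121110`, the byte-wise lower interleave is `0x1303120211011000`.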
22. The processing core of claim 21 wherein the byte shifting logic is further responsive to the decoding of an MPERM.W instruction,
wherein MPERM.W permutes an ordering of 16-bit data stored in a first operand associated with the instruction according to data stored in a second operand associated with the instruction.
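A permutation of this kind might be modelled as below. The claim says only that the second operand controls the ordering; the encoding used here (two control bits per destination element, taken from the low eight bits of the control operand) and the function name `mperm_w` are assumptions made for illustration.

```python
def mperm_w(src, control):
    """Permute the four 16-bit elements of a 64-bit `src` as directed by `control`."""
    out = 0
    for i in range(4):
        sel = (control >> (2 * i)) & 0b11      # assumed: 2 selector bits per element
        lane = (src >> (16 * sel)) & 0xFFFF    # source element chosen for slot i
        out |= lane << (16 * i)
    return out
```

With this encoding, a control value of `0xE4` leaves the operand unchanged and `0x1B` reverses the order of the four elements, so `mperm_w(0x3333222211110000, 0x1B)` is `0x0000111122223333`.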
23. The processing core of claim 22 wherein the byte shifting logic is further responsive to the decoding of one of the MEXTR1, MEXTR2, MEXTR3, MEXTR4, MEXTR5, MEXTR6, and MEXTR7 instructions,
wherein MEXTRn extracts eight bytes of data from two concatenated operands associated with the instruction, the eight bytes being offset from the right side of the concatenated operands by "n" bytes.
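The extraction described for MEXTRn can be sketched as a shift over a 128-bit concatenation. The function name `mextr`, the 64-bit operand width, and the choice to place the first operand in the right (least significant) half of the concatenation are assumptions for this example; the claim does not specify the concatenation order.

```python
def mextr(first, second, n):
    """Extract eight bytes starting n bytes from the right of second:first."""
    assert 1 <= n <= 7
    full = (1 << 64) - 1
    concat = ((second & full) << 64) | (first & full)  # assumed operand order
    return (concat >> (8 * n)) & full
```

For example, if the sixteen bytes of the concatenation are numbered 0 to 15 from the right, `mextr(0x0706050403020100, 0x0F0E0D0C0B0A0908, 3)` returns bytes 3 through 10, that is `0x0A09080706050403`.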
Specification