Method and apparatus for performing improved group instructions
Abstract
Systems and apparatuses are presented relating to a programmable processor comprising an execution unit that is operable to decode and execute instructions received from an instruction path and to partition data stored in registers in the register file into multiple data elements. The execution unit is capable of executing a plurality of different group floating-point and group integer arithmetic operations, each of which arithmetically operates on multiple data elements stored in registers in a register file to produce a catenated result that is returned to a register in the register file, wherein the catenated result comprises a plurality of individual results. The execution unit is also capable of executing group data handling operations that re-arrange data elements in different ways in response to data handling instructions.
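The group data handling operations mentioned in the abstract can be pictured with a small software model. The following Python sketch is illustrative only: the function name `group_shuffle`, the index-list form of the instruction, and the lowest-element-first ordering are assumptions made for the example and are not taken from the patent.

```python
def group_shuffle(reg: int, order: list[int], elem_bits: int, reg_bits: int = 128) -> int:
    """Return a register whose element i is a copy of source element order[i].

    Hypothetical model: elements are numbered from the least significant end.
    """
    count = reg_bits // elem_bits
    if reg_bits % elem_bits or len(order) != count:
        raise ValueError("order must name one source element per result element")
    mask = (1 << elem_bits) - 1
    result = 0
    for i, src in enumerate(order):
        if not 0 <= src < count:
            raise ValueError("source index out of range")
        element = (reg >> (src * elem_bits)) & mask
        result |= element << (i * elem_bits)  # catenate into position i
    return result


# Reverse the four bytes of a 32-bit register value.
reversed_bytes = group_shuffle(0x44332211, [3, 2, 1, 0], elem_bits=8, reg_bits=32)
```

The same routine handles any element width that divides the register width, which mirrors the idea that the partitioning is chosen per instruction rather than fixed.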
26 Claims
1. A programmable processor comprising:
an instruction path and a data path;
a register file comprising a plurality of registers coupled to the data path; and
an execution unit coupled to the instruction and data paths, that is operable to decode and execute group instructions received from the instruction path, and on an instruction-by-instruction basis, dynamically partition data from an operand register in the plurality of registers into multiple data elements having the same elemental width such that a total aggregate width of the multiple data elements equals a width of the operand register;
wherein the execution unit is capable of executing a first group operation that operates on data elements having a first elemental width and, immediately following execution of the first group operation, execute a second group operation that operates on data elements having a second elemental width twice as large as the first elemental width; and
wherein the execution unit is also capable of executing both group integer arithmetic operations and group floating-point arithmetic operations, wherein for the group integer arithmetic operations multiple pairs of integer data elements from a pair of operand registers are operated on in parallel to produce a catenated result comprising a plurality of individual integer results and for the group floating-point arithmetic operations multiple pairs of floating-point data elements from a pair of operand registers are arithmetically operated on in parallel to produce a catenated result comprising a plurality of individual floating-point results.
Dependent claims: 2 through 20.
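As a rough illustration of the per-instruction partitioning recited in claim 1, the Python sketch below models a 128-bit operand register as an integer and performs a group integer add at one elemental width, then immediately at twice that width. The helper names (`partition`, `catenate`, `group_add`), the 128-bit register width, and the wrap-around (modular) arithmetic are assumptions chosen for the example; the claim itself does not fix any of them.

```python
def partition(reg: int, elem_bits: int, reg_bits: int = 128) -> list[int]:
    """Split a register value into equal-width elements, lowest element first."""
    if reg_bits % elem_bits:
        raise ValueError("element width must divide the register width")
    mask = (1 << elem_bits) - 1
    return [(reg >> shift) & mask for shift in range(0, reg_bits, elem_bits)]


def catenate(elems: list[int], elem_bits: int) -> int:
    """Join elements back into one register value, truncating each to its width."""
    mask = (1 << elem_bits) - 1
    result = 0
    for i, elem in enumerate(elems):
        result |= (elem & mask) << (i * elem_bits)
    return result


def group_add(a: int, b: int, elem_bits: int, reg_bits: int = 128) -> int:
    """Add corresponding elements; carries never cross an element boundary."""
    pairs = zip(partition(a, elem_bits, reg_bits), partition(b, elem_bits, reg_bits))
    return catenate([x + y for x, y in pairs], elem_bits)


a, b = 0x00FF00FF, 0x00010001
narrow = group_add(a, b, elem_bits=8)   # first group operation, 8-bit elements
wide = group_add(a, b, elem_bits=16)    # next operation, elements twice as wide
```

With 8-bit elements each `0xFF + 0x01` wraps to zero inside its own element, while with 16-bit elements the same bits produce `0x0100` per element, so the same operands give different catenated results depending only on the width named by the instruction.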
21. A programmable processor comprising:
an instruction path and a data path;
a register file comprising a plurality of registers coupled to the data path; and
an execution unit coupled to the instruction and data paths, that is operable to decode and execute group instructions received from the instruction path and partition data from an operand register in the plurality of registers into multiple data elements having the same elemental width such that a total aggregate width of the multiple data elements equals a width of the operand register, wherein the execution unit can dynamically vary the elemental width of partitioned data based on information provided by an instruction being executed on an instruction-by-instruction basis, and wherein the execution unit is capable of, while operating in a single mode of operation:
(i) executing both group integer and group floating-point arithmetic operations in which multiple pairs of integer and floating-point data elements, respectively, from a pair of operand registers are operated on in parallel to produce a catenated result comprising a plurality of individual integer and floating-point results, respectively;
(ii) executing a first group operation that operates on data elements having a first elemental width and, immediately following execution of the first group operation, execute a second group operation that operates on data elements having a second elemental width twice as large as the first elemental width; and
(iii) executing a group floating-point operation that operates on multiple pairs of data elements having the first elemental width and, immediately following execution of the group floating-point operation, execute a scalar floating-point operation that operates on a single pair of data elements having the second elemental width.
Dependent claims: 22 through 24.
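Element (iii) of claim 21 pairs a group floating-point operation at one width with a scalar floating-point operation at twice that width. A minimal Python sketch of that sequence follows, assuming a 128-bit register, IEEE 754 single precision (32-bit) as the first elemental width and double precision (64-bit) as the second, little-endian element order, and a scalar operation that reads the low 64 bits of each operand. All function names are hypothetical.

```python
import struct

REG_BYTES = 16  # assumed 128-bit operand register


def pack_f32(values: list[float]) -> int:
    """Catenate four single-precision values into one register value."""
    return int.from_bytes(struct.pack("<4f", *values), "little")


def unpack_f32(reg: int) -> list[float]:
    """Partition a register value into four single-precision elements."""
    return list(struct.unpack("<4f", reg.to_bytes(REG_BYTES, "little")))


def group_fadd32(a: int, b: int) -> int:
    """Group operation: add four pairs of 32-bit floating-point elements."""
    return pack_f32([x + y for x, y in zip(unpack_f32(a), unpack_f32(b))])


def pack_f64(value: float) -> int:
    """Place one double-precision value in the low 64 bits of a register."""
    return int.from_bytes(struct.pack("<d", value), "little")


def low_f64(reg: int) -> float:
    """Read the low 64 bits of a register as one double-precision value."""
    return struct.unpack("<d", (reg & (2**64 - 1)).to_bytes(8, "little"))[0]


def scalar_fadd64(a: int, b: int) -> int:
    """Scalar operation: add a single pair of 64-bit floating-point elements."""
    return pack_f64(low_f64(a) + low_f64(b))


group_result = group_fadd32(pack_f32([1.0, 2.0, 3.0, 4.0]), pack_f32([0.5] * 4))
scalar_result = scalar_fadd64(pack_f64(1.25), pack_f64(2.5))
```

Both calls take plain register values, so nothing has to be switched between them; that is the software analogue of running the group and scalar operations back to back in a single mode of operation.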
25. A programmable processor comprising:
an instruction path and a data path;
a register file comprising a plurality of registers coupled to the data path; and
an execution unit coupled to the instruction and data paths, that is operable to decode and execute group instructions received from the instruction path and partition data from an operand register in the plurality of registers into multiple data elements having the same elemental width such that a total aggregate width of the multiple data elements equals a width of the operand register, wherein the execution unit can dynamically vary the elemental width and data type of partitioned data on an instruction-by-instruction basis and is capable of:
(i) executing both group integer and group floating-point arithmetic operations in which multiple pairs of integer and floating-point data elements, respectively, from a pair of operand registers are operated on in parallel to produce a catenated result comprising a plurality of individual integer and floating-point results, respectively;
(ii) executing a first group operation that operates on data elements having a first elemental width and, immediately following execution of the first group operation, execute a second group operation that operates on data elements having a second elemental width twice as large as the first elemental width;
(iii) executing a group floating-point operation that operates on multiple pairs of data elements having the first elemental width and, immediately following execution of the group floating-point operation, execute a scalar floating-point operation that operates on a single pair of data elements having the second elemental width; and
(iv) executing group integer and group floating-point instructions while making available precise exceptions.
Dependent claims: 26.
Specification