High-performance, superscalar-based computer system with out-of-order instruction execution
Abstract
A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
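The in-order retirement described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the names `Slot`, `RetireQueue`, `dispatch`, `complete`, and `retire` are hypothetical, and a Python list stands in for the temporary data registers. Results may arrive in any order, but a result reaches the committed register state only once every earlier instruction has also finished.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    dest: str                     # architectural register the instruction writes
    value: Optional[int] = None   # result held in a temporary register
    done: bool = False            # set when a functional unit delivers the result


class RetireQueue:
    """Hypothetical model: one slot per instruction, kept in program order."""

    def __init__(self):
        self.slots = []   # slots in sequential program order
        self.head = 0     # index of the oldest unretired instruction
        self.arch = {}    # committed (retired) register state

    def dispatch(self, dest):
        self.slots.append(Slot(dest))
        return len(self.slots) - 1   # tag used to report completion later

    def complete(self, tag, value):
        # Completion may happen in any order.
        self.slots[tag].value = value
        self.slots[tag].done = True

    def retire(self):
        # Commit only from the head, so retirement follows program order.
        retired = []
        while self.head < len(self.slots) and self.slots[self.head].done:
            slot = self.slots[self.head]
            self.arch[slot.dest] = slot.value
            retired.append(self.head)
            self.head += 1
        return retired


q = RetireQueue()
a, b, c = q.dispatch("r1"), q.dispatch("r2"), q.dispatch("r3")
q.complete(c, 30)        # youngest instruction finishes first
first = q.retire()       # nothing retires: older instructions are pending
q.complete(a, 10)
second = q.retire()      # only the oldest instruction retires
q.complete(b, 20)
third = q.retire()       # the remaining two retire together, in order
```

In this model the early result for `r3` stays in its temporary slot until `r1` and `r2` have completed, which is the behavior the abstract attributes to the temporary data registers.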
117 Citations
59 Claims
1. A superscalar microprocessor for processing instructions, the microprocessor comprising:
an instruction fetch unit configured to fetch instructions from an instruction store according to a sequential program order;
a branch prediction circuit configured to provide a branch bias signal indicating whether a conditional branch controlled by a conditional branch instruction is predicted to be taken or not taken;
an instruction buffer coupled to receive fetched instructions from the instruction fetch unit and configured to buffer a plurality of fetched instructions, including an instruction selected according to the branch bias signal;
a plurality of functional units configured to execute instructions, thereby generating result data;
a register file including a plurality of entries configured to store data including result data generated by the plurality of functional units, wherein each of the plurality of entries is accessible by reference to a respective location in the register file;
a resource identifying circuit configured to concurrently identify execution resources for a first one and a second one of a plurality of buffered instructions, wherein the second one of the buffered instructions has a data dependency on the first one of the buffered instructions, thereby making a plurality of instructions concurrently available for issue, wherein the identified execution resources for each of the available instructions includes a functional unit capable of executing the instruction;
a register rename circuit configured to provide references to locations in the register file for logical register references included with the plurality of buffered instructions;
an issue control circuit coupled to the resource identifying circuit and configured to concurrently issue more than one of the available instructions to the functional units for execution, based on availability of the identified execution resources for each instruction and availability of respective operands for each instruction in the referenced locations in the register file, without regard to the sequential program order;
a plurality of data routing paths coupled between the plurality of functional units and the register file and configured to concurrently transfer result data from more than one of the plurality of functional units to the register file; and
bypass control logic coupled to the plurality of data routing paths and configured to distribute result data from a first one of the plurality of functional units as operand data for another one or more of the plurality of functional units via an alternate data path that bypasses the register file, wherein distributing result data via the alternate data path occurs concurrently with transferring result data to the register file.
Dependent claims: 2, 3, 4, 5, 6, 7
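The bypass element of claim 1 can be pictured with a short software analogy. The sketch below is an assumption-laden illustration, not the claimed logic: `write_back_with_bypass`, the `waiting` list, and the physical register tags such as `p5` are all hypothetical names, and sequential Python statements stand in for what the claim describes as concurrent hardware transfers. A result is written to the register file and, in the same step, handed directly to any waiting instruction that needs it.

```python
def write_back_with_bypass(tag, value, register_file, waiting):
    """Hypothetical model of one result leaving a functional unit.

    tag           -- register file location the result is destined for
    register_file -- dict mapping locations to values
    waiting       -- instructions still missing operands; each is a dict with
                     'name', 'needs' (set of locations), 'operands' (dict)
    Returns the names of instructions whose operands are now all present.
    """
    # Normal data routing path: the result goes to the register file.
    register_file[tag] = value

    # Alternate path: the same result is forwarded straight to consumers,
    # so they do not have to read it back from the register file.
    ready = []
    for ins in waiting:
        if tag in ins["needs"]:
            ins["operands"][tag] = value
            ins["needs"].remove(tag)
            if not ins["needs"]:
                ready.append(ins["name"])
    return ready


register_file = {}
waiting = [
    {"name": "add", "needs": {"p5"}, "operands": {}},
    {"name": "mul", "needs": {"p5", "p6"}, "operands": {}},
]
ready_now = write_back_with_bypass("p5", 42, register_file, waiting)
```

Here `add` becomes ready as soon as `p5` is produced, while `mul` receives the forwarded value but keeps waiting for `p6`.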
8. A method for processing instructions in a superscalar microprocessor, the method comprising:
fetching instructions from an instruction store according to a sequential program order;
predicting whether a conditional branch controlled by a conditional branch instruction included in the fetched instructions is taken or not taken;
buffering a plurality of fetched instructions, including an instruction selected according to the prediction, in an instruction buffer;
concurrently identifying execution resources for more than one of a plurality of buffered instructions, the identified execution resources for each of the more than one of the plurality of buffered instructions including a functional unit capable of executing the instruction;
providing references to locations in a register file for logical register references included with the plurality of buffered instructions, wherein the register file includes a plurality of entries, each of the plurality of entries being accessible by reference to a respective location in the register file;
concurrently making available for execution a plurality of instructions for which execution resources are identified and register file location references are provided;
concurrently issuing more than one of the plurality of available instructions for execution by a plurality of functional units, based on availability of the identified execution resources for each available instruction and availability of respective operands for each instruction in the referenced locations in the register file, without regard to the sequential program order;
executing the issued instructions in the plurality of functional units, thereby generating result data;
transferring the result data from the functional units to the register file;
concurrently with said act of transferring, distributing the result data from a first one of the plurality of functional units as operand data for another one or more of the plurality of functional units via a bypass data path that bypasses the register file; and
retiring instructions according to the sequential program order.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
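The issuing step of claim 8 selects instructions by resource and operand availability rather than by position in the program. A minimal sketch of that selection rule follows; the function `select_issue`, the instruction dictionaries, the unit names, and the register tags are hypothetical and chosen only for illustration. The loop scans the buffer oldest first simply as a tie-break, which the claim does not require.

```python
def select_issue(buffer, ready_regs, free_units):
    """Return indices of buffered instructions that may issue this cycle.

    buffer     -- instructions in program order; each is a dict with
                  'unit' (kind of functional unit needed) and
                  'srcs' (register file locations of its operands)
    ready_regs -- set of locations whose values are available
    free_units -- dict mapping unit kind to the number currently free
    """
    free = dict(free_units)   # work on a copy; the caller's dict is untouched
    issued = []
    for index, ins in enumerate(buffer):
        operands_ready = all(src in ready_regs for src in ins["srcs"])
        unit_free = free.get(ins["unit"], 0) > 0
        if operands_ready and unit_free:
            free[ins["unit"]] -= 1
            issued.append(index)
    return issued


buffer = [
    {"op": "load", "unit": "mem", "srcs": ["p9"]},        # p9 not yet available
    {"op": "add",  "unit": "alu", "srcs": ["p1", "p2"]},
    {"op": "mul",  "unit": "mul", "srcs": ["p3"]},
    {"op": "sub",  "unit": "alu", "srcs": ["p1"]},        # loses the only ALU
]
issued = select_issue(buffer, {"p1", "p2", "p3"}, {"alu": 1, "mul": 1, "mem": 1})
```

The `add` and `mul` instructions issue together ahead of the older `load`, which is still waiting on its operand. The `sub` instruction has its operand but no free unit, so it stays in the buffer.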
17. A superscalar microprocessor for processing instructions, the microprocessor comprising:
an instruction fetch unit configured to fetch instructions from an instruction store according to a sequential program order;
a branch prediction circuit configured to provide a branch bias signal indicating whether a conditional branch controlled by a conditional branch instruction is predicted to be taken or not taken;
an instruction buffer coupled to receive fetched instructions from the instruction fetch unit and configured to buffer a plurality of fetched instructions, including an instruction selected according to the branch bias signal;
a plurality of functional units configured to execute instructions, thereby generating result data;
a register file including a plurality of entries configured to store data including result data generated by the plurality of functional units, wherein each of the plurality of entries is accessible by reference to a respective location in the register file;
a resource identifying circuit configured to concurrently identify execution resources for a plurality of buffered instructions, thereby making a plurality of instructions concurrently available for issue, wherein the identified execution resources for each of the available instructions includes a functional unit capable of executing the instruction;
a register rename circuit configured to provide references to locations in the register file for logical register references included with the plurality of buffered instructions;
an issue control circuit coupled to the resource identifying circuit and configured to concurrently issue more than one of the available instructions to the functional units for execution, based on availability of the identified execution resources for each instruction and availability of respective operands for each instruction in the referenced locations in the register file, without regard to the sequential program order;
a plurality of data routing paths coupled between the plurality of functional units and the register file and configured to concurrently transfer result data from more than one of the plurality of functional units to the register file;
bypass control logic coupled to the plurality of data routing paths and configured to distribute result data from a first one of the plurality of functional units as operand data for another one or more of the plurality of functional units via an alternate data path that bypasses the register file, wherein distributing result data via the alternate data path occurs concurrently with transferring result data to the register file; and
retirement control logic coupled to the register file and configured to concurrently retire a plurality of instructions according to the sequential program order.
Dependent claims: 18
19. A data processing apparatus comprising a super scalar type microprocessor having a plurality of functional units that can execute instructions simultaneously, the microprocessor comprising:
a pre-fetch unit that pre-fetches a plurality of instructions from a memory in preparation for execution by one or more functional units, the plurality of instructions having a predetermined program order;
a branch prediction circuit configured to provide a branch bias signal indicating whether a conditional branch controlled by a conditional branch instruction is predicted to be taken or not taken;
a buffer that holds a plurality of instruction groups, including one or more instruction groups pre-fetched by the pre-fetch unit according to the branch bias signal;
a decoder that simultaneously decodes a plurality of instructions from an instruction group held in the buffer;
a register file including a plurality of registers used in the one or more functional units executing the plurality of decoded instructions, the plurality of registers including a temporary buffer that stores results from execution of instructions outside the predetermined program order;
a dependency check unit that checks for a dependency relation between the plurality of instructions output from the decoder, on the basis of use conditions stored in a register;
an instruction unit that allocates an instruction to a functional unit so that the instruction executes outside the predetermined program order after the instruction is judged by the dependency check unit not to be subject to restriction due to a dependency, wherein when executing instructions outside the predetermined program order, the microprocessor uses the temporary buffer; and
a retirement unit that specifies a register in which to store a result of executing the instruction outside the predetermined program order, wherein the retirement unit retires the instruction in program order after the instruction is completed, wherein when completing the instructions executed outside the predetermined program order, contents of the temporary buffer are written in a corresponding register to retire the instructions.
Dependent claims: 20
21. A superscalar microprocessor for executing instructions, the microprocessor comprising:
an instruction fetch unit configured to fetch instructions from an instruction store according to a sequential program order; and
an instruction execution unit configured to concurrently receive a set of from 1 to a maximum number (N) of instructions from the instruction fetch unit, the instruction execution unit including;
an instruction buffer configured to store instruction information for each instruction received from the instruction fetch unit, wherein the instruction buffer has sufficient capacity to store the instruction information for at least twice the number N of instructions;
a register file comprising a plurality of temporary buffers and a plurality of retired registers, wherein the temporary buffers are arranged in a plurality of groups of temporary buffers, each group of temporary buffers including N of the temporary buffers;
renaming logic configured to concurrently establish an association between each instruction in a set of instructions concurrently received from the instruction fetch unit and a respective one of the temporary buffers in a selected one of the groups of temporary buffers, wherein a position of each instruction within the set of instructions determines which one of the temporary buffers in the selected group of temporary buffers is associated with that instruction;
a plurality of functional units configured to execute instructions, thereby generating result data;
an issue control circuit configured to concurrently issue more than one of the instructions for which instruction information is stored in the instruction buffer to the functional units for execution, the issue control circuit being further configured to issue at least some of the instructions out of the sequential program order;
a plurality of data routing paths coupled between the functional units and the register file and configured to transfer result data from more than one of the functional units to the temporary buffers concurrently; and
retirement control logic coupled to the register file and configured to retire instructions according to the sequential program order, wherein the retirement control logic is further configured to concurrently retire all of the instructions in a set of instructions after all of the instructions in that set of instructions have completed.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40
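Claim 21 ties each instruction in a fetched set to a temporary buffer by its position in the set, and retires a set only once every instruction in it has completed. The sketch below is one possible software reading of that arrangement, offered as an illustration only: the class `TempBufferFile` and its methods are hypothetical names, the group count and N are arbitrary, and ordering of retirement between different groups is not modeled.

```python
class TempBufferFile:
    """Hypothetical model: groups of N temporary buffers plus retired registers."""

    def __init__(self, num_groups, n):
        self.n = n
        self.value = [[None] * n for _ in range(num_groups)]   # temporary buffers
        self.dest = [[None] * n for _ in range(num_groups)]    # target registers
        self.pending = [set() for _ in range(num_groups)]      # unfinished positions
        self.retired = {}                                      # retired registers

    def rename(self, group, dests):
        """Associate a set of 1..N instructions with the buffers of one group.

        The instruction at position i in the set always gets buffer i of the
        selected group, so no search for a free buffer is needed.
        """
        assert 1 <= len(dests) <= self.n
        self.dest[group] = list(dests) + [None] * (self.n - len(dests))
        self.value[group] = [None] * self.n
        self.pending[group] = set(range(len(dests)))
        return [(group, pos) for pos in range(len(dests))]

    def write_result(self, ref, value):
        group, pos = ref
        self.value[group][pos] = value
        self.pending[group].discard(pos)

    def retire_group(self, group):
        """Retire the whole set at once, but only when all of it has completed."""
        if self.pending[group]:
            return False
        for pos, dest in enumerate(self.dest[group]):
            if dest is not None:
                self.retired[dest] = self.value[group][pos]
        self.dest[group] = [None] * self.n
        return True


regs = TempBufferFile(num_groups=2, n=4)
refs = regs.rename(0, ["r1", "r2", "r3"])   # a set of three instructions
regs.write_result(refs[2], 7)               # last instruction finishes first
early = regs.retire_group(0)                # set is incomplete, nothing retires
regs.write_result(refs[0], 5)
regs.write_result(refs[1], 6)
late = regs.retire_group(0)                 # whole set retires together
```

Because the buffer index is fixed by position, the renaming step in this model is a direct lookup, and the all-or-nothing check in `retire_group` mirrors the set-level retirement condition stated in the claim.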
41. A method for executing instructions in a superscalar microprocessor, the method comprising:
fetching instructions from an instruction store according to a sequential program order;
concurrently delivering a set of from 1 to a maximum number (N) of fetched instructions to an instruction execution unit, wherein the instruction execution unit includes a register file comprising a plurality of temporary buffers and a plurality of retired registers, wherein the temporary buffers are arranged in a plurality of groups of temporary buffers, each group of temporary buffers including N of the temporary buffers;
storing instruction information for each instruction in the set of delivered instructions in an instruction buffer of the instruction execution unit, wherein the instruction buffer has sufficient capacity to store the instruction information for at least twice the number N of instructions;
concurrently establishing an association between each instruction in the set of instructions delivered by the instruction fetch unit and a respective one of the temporary buffers in a selected one of the groups of temporary buffers, wherein a position of each instruction within the set of instructions determines which one of the temporary buffers in the selected group of temporary buffers is associated with that instruction;
concurrently issuing more than one of the instructions for which instruction information is stored in the instruction buffer to a plurality of functional units, wherein at least some of the instructions are issued out of the sequential program order;
executing the issued instructions in the plurality of functional units, thereby generating result data;
concurrently transferring the result data from more than one of the plurality of functional units to the temporary buffers; and
concurrently retiring all of the instructions in the set of instructions after all of the instructions in the set of instructions have completed.
Dependent claims: 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59
Specification