Implementation of an efficient instruction fetch pipeline utilizing a trace cache
First Claim
1. In a computer architecture for executing computer instructions, a method to reduce the occurrence of instruction execution pipeline stalls due to branch instructions and jump instructions, said method comprising:
providing an instruction transfer bandwidth into a trace cache of said computer architecture that is greater than an instruction execution bandwidth of said computer architecture;
utilizing said instruction transfer bandwidth to provide both the taken and not taken paths of a branch to the trace cache;
feeding back branch results of executed branch instructions to control the transmission of correct next instructions into the execution pipeline; and
using branch delay slots to postpone the execution of said next instruction during said transmission such that the occurrence of instruction execution stalls within said instruction execution pipeline is reduced.
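As a rough illustration of the last element of the claim, the toy model below counts the stall cycles left after a branch when some number of delay-slot instructions execute while the correct next instruction is still being transmitted. The function name `stall_cycles` and both parameters are assumptions made for this sketch; the claim gives no latencies or slot counts.

```python
def stall_cycles(select_latency: int, delay_slots: int) -> int:
    """Return the pipeline bubbles left after a branch in this toy model.

    select_latency: cycles between issuing the branch and the correct next
        instruction arriving from the trace cache (assumed parameter).
    delay_slots: instructions placed after the branch that execute
        regardless of its outcome, filling those cycles with useful work.
    """
    if select_latency < 0 or delay_slots < 0:
        raise ValueError("latency and slot count must be non-negative")
    # Each delay slot hides one cycle of latency; stalls cannot go below zero.
    return max(0, select_latency - delay_slots)


if __name__ == "__main__":
    for slots in range(4):
        print(f"delay slots={slots}: stalls={stall_cycles(2, slots)}")
```

In this model the stalls disappear once the number of delay slots matches the assumed selection latency, which is the effect the claim attributes to postponing execution of the next instruction.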
Abstract
A method and apparatus are disclosed for enhancing the pipeline instruction transfer and execution performance of a computer architecture by reducing instruction stalls due to branch and jump instructions. A trace cache within the computer architecture receives computer instructions at a first rate and stores them as traces of instructions. An instruction execution pipeline receives, decodes, and executes the computer instructions at a second rate that is less than the first rate. A mux between the trace cache and the instruction execution pipeline selects the next instruction to be loaded into the pipeline from the trace cache based, in part, on a branch result fed back to the mux from the pipeline.
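The selection step described in the abstract can be sketched as follows. This is a minimal software model, not the disclosed hardware: the `TraceEntry` type, the `select_next` function, and the instruction labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraceEntry:
    """Hypothetical trace-cache entry holding both successors of one branch."""
    taken: List[str]      # instructions at the branch target
    not_taken: List[str]  # fall-through instructions


def select_next(entry: TraceEntry, branch_taken: bool) -> List[str]:
    """Model the mux: the fed-back branch result picks one stored path."""
    return entry.taken if branch_taken else entry.not_taken


# Both paths were written into the trace cache ahead of time, which is why
# the fill rate into the cache must exceed the execution rate.
entry = TraceEntry(taken=["T0", "T1"], not_taken=["N0", "N1"])

if __name__ == "__main__":
    print(select_next(entry, branch_taken=True))
    print(select_next(entry, branch_taken=False))
```

Because both paths are already resident, resolving the branch only changes which stored path is forwarded; in this sketch nothing is refetched.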
20 Claims
1. In a computer architecture for executing computer instructions, a method to reduce the occurrence of instruction execution pipeline stalls due to branch instructions and jump instructions, said method comprising:
providing an instruction transfer bandwidth into a trace cache of said computer architecture that is greater than an instruction execution bandwidth of said computer architecture;
utilizing said instruction transfer bandwidth to provide both the taken and not taken paths of a branch to the trace cache;
feeding back branch results of executed branch instructions to control the transmission of correct next instructions into the execution pipeline; and
using branch delay slots to postpone the execution of said next instruction during said transmission such that the occurrence of instruction execution stalls within said instruction execution pipeline is reduced.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. In a computer architecture for executing computer instructions, an apparatus to reduce the occurrence of instruction execution pipeline stalls due to branch instructions and jump instructions, said apparatus comprising:
a trace cache that receives computer instructions at a first rate and stores said computer instructions as traces of instructions, including both taken and not taken paths of branches;
an instruction execution pipeline that receives, decodes, and executes computer instructions at a second rate that is less than said first rate;
a selector that selects a next instruction to be loaded into said instruction execution pipeline from said trace cache based, in part, on a branch result fed back from said instruction execution pipeline as a control to said selector to transmit correct next instructions into the execution pipeline; and
wherein said instruction execution pipeline uses branch delay slots to postpone the execution of said next instruction during said transmission such that the occurrence of instruction execution stalls within said instruction execution pipeline is reduced.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20.
Specification