EFFECTIVE ELIMINATION OF DELAY SLOT HANDLING FROM A FRONT SECTION OF A PROCESSOR PIPELINE
Abstract
Architectural techniques and implementations are described that defer enforcement of certain delayed control transfer instruction (DCTI) sequencing constraints or conventions to later stages of an execution pipeline. In this way, a processor pipeline front-end (including fetch sequencing) can be simplified, at least in part, by fetching instructions generally without regard to such constraints or conventions. Instead, enforcement of such sequencing constraints and/or conventions may be deferred to one or more pipeline stages associated with commitment or retirement of instructions. Higher fetch bandwidth may be achieved in some realizations when, for example, DCTI couples (a second DCTI in the delay slot of a first DCTI) are encountered in an execution sequence.
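The identification step described above can be illustrated with a minimal sketch. It is not taken from the patent: the `Instr` record, the `find_dcti_couples` helper, the fixed 4-byte instruction width, and the modeling of the delay slot as the next sequential instruction are all assumptions made for illustration. The sketch only shows how a downstream stage might scan an already-fetched sequence for a DCTI whose delay slot holds another DCTI; the enforcement action itself is architecture-specific and is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

INSTR_BYTES = 4  # assumption: fixed-width 4-byte instructions


@dataclass(frozen=True)
class Instr:
    """Hypothetical record for one speculatively executed instruction."""
    pc: int         # address of the instruction
    is_dcti: bool   # True if this is a delayed control transfer instruction


def find_dcti_couples(seq: List[Instr]) -> List[Tuple[int, int]]:
    """Return (first_pc, second_pc) for each DCTI whose delay slot holds a DCTI.

    The delay slot is modeled (as an assumption) as the instruction at the
    next sequential address after the first DCTI.
    """
    couples = []
    for first, slot in zip(seq, seq[1:]):
        in_delay_slot = slot.pc == first.pc + INSTR_BYTES
        if first.is_dcti and slot.is_dcti and in_delay_slot:
            couples.append((first.pc, slot.pc))
    return couples


# Usage: the front-end fetched this sequence without checking for couples;
# a later stage scans it before the instructions are committed.
example = [
    Instr(0x100, False),
    Instr(0x104, True),   # first DCTI
    Instr(0x108, True),   # second DCTI, sitting in the delay slot
    Instr(0x200, False),  # instruction fetched at the first DCTI's target
]
couples = find_dcti_couples(example)
```

In this sketch the scan is kept separate from fetch, which mirrors the idea of deferring the check to stages associated with commitment or retirement; a real design would act on each detected pair according to the instruction set's defined behavior.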
20 Claims
1. A computing apparatus configured to execute an instruction set that includes at least one delayed control transfer type instruction (DCTI), the computing apparatus comprising:
a pipeline front-end for fetching instructions from an instruction store generally without regard to an architecturally-defined special branching behavior for program sequences that include a second DCTI in the delay slot of a first DCTI; and
a downstream pipeline section configured to identify in a speculatively executed instruction sequence, a subsequence containing the second DCTI in the delay slot of the first DCTI and to enforce the architecturally-defined special branching behavior.
Dependent claims: 2-13
14. A multi-core processor in which plural of the implemented cores share a front-end section that fetches instructions to be executed in functional units of the cores and that enforces an instruction set defined special branching behavior for program sequences that include a second delayed control transfer instruction (DCTI) in the delay slot of a first DCTI in commit logic rather than in fetch sequencing logic of the front-end section.
18. A method of operating a processor configured to execute an instruction set that includes at least one delayed control transfer type instruction (DCTI), the method comprising:
fetching instructions from an instruction store generally without regard to an architecturally-defined special branching behavior for program sequences that include a second DCTI in the delay slot of a first DCTI;
speculatively executing instruction sequences including at least some subsequences that are inconsistent with the architecturally-defined special branching behavior; and
identifying in the speculatively executed instruction sequences, a subsequence containing the second DCTI in the delay slot of the first DCTI and enforcing the architecturally-defined special branching behavior.
Dependent claims: 19, 20
Specification