Effective elimination of delay slot handling from a front section of a processor pipeline
Abstract
Architectural techniques and implementations that defer enforcement of certain delayed control transfer instruction (DCTI) sequencing constraints or conventions to later stages of an execution pipeline are described. In this way, complexity of a processor pipeline front-end (including fetch sequencing) can be reduced, at least in part, by fetching instructions generally without regard to such constraints or conventions. Instead, enforcement of such sequencing constraints and/or conventions may be deferred to one or more pipeline stages associated with commitment or retirement of instructions. Higher fetch bandwidth may be achieved in some realizations when, for example, DCTI couples are encountered in an execution sequence.
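To make the sequencing constraint concrete, the following is a minimal sketch of the execution order that a DCTI couple (a DCTI sitting in the delay slot of another DCTI) produces. It assumes a SPARC-style PC/nPC model in which every DCTI is unconditional, always taken, and never annuls its delay slot; this section does not spell out the architectural rule, so the model, the function name `architectural_order`, and the addresses are illustrative assumptions only.

```python
from typing import Dict, List


def architectural_order(targets: Dict[int, int], start: int, steps: int) -> List[int]:
    """Return the addresses executed under an assumed PC/nPC delayed-branch model.

    `targets` maps the address of each DCTI to its branch target; any address
    not present is treated as an ordinary 4-byte, non-branching instruction.
    Assumption: all DCTIs are taken and do not annul their delay slot.
    """
    pc, npc = start, start + 4
    trace: List[int] = []
    for _ in range(steps):
        trace.append(pc)
        target = targets.get(pc)
        # A DCTI only changes nPC, so the instruction already at nPC
        # (the delay slot) still executes before control reaches the target.
        next_npc = target if target is not None else npc + 4
        pc, npc = npc, next_npc
    return trace


# Hypothetical DCTI couple: a branch at 0x00 to 0x40 whose delay slot (0x04)
# holds a second branch to 0x80.
couple = {0x00: 0x40, 0x04: 0x80}
print([hex(a) for a in architectural_order(couple, 0x00, 5)])
```

Under these assumptions the printed order is `['0x0', '0x4', '0x40', '0x80', '0x84']`: exactly one instruction at the first target executes before control moves to the second target. A front-end that ignores the second DCTI would instead keep fetching 0x44, 0x48, and so on, which is the kind of inconsistency the later pipeline stages are described as catching.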
19 Claims
1. A computing apparatus configured to execute an instruction set that includes at least one delayed control transfer type instruction (DCTI), the computing apparatus comprising:
- a pipeline front-end for fetching instructions from an instruction store without regard to an architecturally-defined special branching behavior for program sequences that include a second DCTI in a delay slot of a first DCTI, wherein the pipeline front-end lacks capability to determine proper execution order for nested DCTIs within fetched groups of instructions; and
- a downstream pipeline section configured (i) to identify in a speculatively executed instruction sequence including at least one subsequence that is inconsistent with the architecturally-defined special branching behavior, the at least one subsequence containing the second DCTI in the delay slot of the first DCTI, and (ii) to enforce the architecturally-defined special branching behavior wherein said downstream pipeline section is located in the pipeline after an execution section.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A multi-core processor comprising:
- plural implemented cores sharing a front-end section that fetches instructions, without regard to an architecturally-defined special branching behavior for program sequences that include a second DCTI in a delay slot of a first DCTI, to be executed in functional units of the cores, wherein the front-end section lacks capability to determine proper execution order for nested delayed control transfer type instructions (DCTIs) within fetched groups of instructions; and
- commit logic (i) to identify in a speculatively executed instruction sequence including at least one subsequence that is inconsistent with an instruction set defined special branching behavior, the at least one subsequence containing a second DCTI in a delay slot of a first DCTI, and (ii) to enforce the instruction set defined special branching behavior for the at least one subsequence rather than in fetch sequencing logic of the front-end section.
Dependent claims: 14, 15, 16
17. A method of operating a processor configured to execute an instruction set that includes at least one delayed control transfer type instruction (DCTI), the method comprising:
- fetching instructions from an instruction store without regard to an architecturally-defined special branching behavior for program sequences that include a second DCTI in a delay slot of a first DCTI, wherein a pipeline front-end section that fetches the instructions lacks capability to determine proper execution order for nested DCTIs within fetched groups of instructions;
- speculatively executing instruction sequences including at least one subsequence that is inconsistent with the architecturally-defined special branching behavior; and
- identifying in the speculatively executed instruction sequences, a subsequence containing the second DCTI in the delay slot of the first DCTI and enforcing the architecturally-defined special branching behavior wherein said identifying and enforcement is performed in a pipeline section located in the pipeline after an execution section.
Dependent claims: 18, 19
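The fetch, speculate, identify, and enforce steps of the method claim can be sketched as a small simulation. This is only an illustration under stated assumptions, not the patented implementation: the front-end (`naive_fetch`) redirects after a DCTI and its delay slot but never acts on a DCTI found in a delay slot, the commit stage tracks an assumed SPARC-style PC/nPC pair with always-taken, non-annulling DCTIs, and enforcement is modelled as a flush plus refetch. All names and addresses are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# address -> branch target for a DCTI, or None for an ordinary instruction
Program = Dict[int, Optional[int]]


def naive_fetch(program: Program, start: int) -> Iterator[int]:
    """Assumed front-end: fetch a DCTI, its delay slot, then the target.

    A DCTI sitting in a delay slot is fetched but not acted upon.
    """
    pc = start
    while pc in program:
        yield pc
        target = program[pc]
        if target is not None:
            slot = pc + 4
            if slot in program:
                yield slot  # delay slot, fetched without inspection
            pc = target
        else:
            pc += 4


def commit_with_enforcement(
    program: Program, start: int, max_commits: int
) -> Tuple[List[int], int]:
    """Commit instructions in architectural order, flushing on a mismatch."""
    committed: List[int] = []
    flushes = 0
    pc, npc = start, start + 4  # architectural state kept at commit
    stream = naive_fetch(program, start)
    while len(committed) < max_commits and pc in program:
        fetched = next(stream, None)
        if fetched != pc:
            # Speculative stream diverged from the assumed architectural
            # order: discard it and restart fetch at the expected address.
            flushes += 1
            stream = naive_fetch(program, pc)
            continue
        committed.append(pc)
        target = program[pc]
        pc, npc = npc, (target if target is not None else npc + 4)
    return committed, flushes


# Hypothetical DCTI couple at 0x00 / 0x04.
program: Program = {
    0x00: 0x40, 0x04: 0x80,
    0x40: None, 0x44: None,
    0x80: None, 0x84: None,
}
order, flush_count = commit_with_enforcement(program, 0x00, 5)
print([hex(a) for a in order], flush_count)
```

In this model the front-end speculatively supplies 0x44 after 0x40, the commit stage expects 0x80, and a single flush restores the order 0x0, 0x4, 0x40, 0x80, 0x84. Sequences without a DCTI couple commit with no flush, so the cost of enforcement is paid only in the case the front-end does not handle.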
Specification