Microprocessor with conditional cross path stall to minimize CPU cycle time length
Abstract
A digital system is provided that includes a central processing unit (CPU) that has an instruction execution pipeline with a plurality of functional units for executing instructions in a sequence of CPU cycles. The execution units are clustered into two or more groups. Cross-path circuitry is provided such that results from any execution unit in one execution unit cluster can be supplied to execution units in another cluster. A cross-path stall is conditionally inserted to stall all of the functional groups when one execution unit cluster requires an operand from another cluster on a given CPU cycle and the execution unit that is producing that operand completes the computation of that operand on an immediately preceding CPU cycle.
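The stall condition described in the abstract can be illustrated with a minimal sketch. It assumes a simple bookkeeping table that maps each register to the CPU cycle on which its producing unit finished computing it; the names CrossPathRead, needs_cross_path_stall, and last_write_cycle are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class CrossPathRead:
    """Hypothetical record of an operand read across clusters."""
    reg: int    # register index in the producing cluster
    cycle: int  # CPU cycle on which the consuming cluster needs the operand


def needs_cross_path_stall(read, last_write_cycle):
    """Return True when the operand was produced on the immediately preceding cycle.

    last_write_cycle is an assumed table: register index -> cycle of last update.
    """
    return last_write_cycle.get(read.reg) == read.cycle - 1


# Register 3 was produced on cycle 9, register 5 on cycle 4.
last_write_cycle = {3: 9, 5: 4}
print(needs_cross_path_stall(CrossPathRead(reg=3, cycle=10), last_write_cycle))  # True
print(needs_cross_path_stall(CrossPathRead(reg=5, cycle=10), last_write_cycle))  # False
```

In this sketch only the read of register 3 triggers a stall, because its producer completed on cycle 9 and the read is issued on cycle 10; the read of register 5 proceeds without a stall.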
11 Claims
-
1. A digital system comprising a central processing unit (CPU) having an instruction execution pipeline with a plurality of functional units for executing instructions in a sequence of CPU cycles, the CPU comprising:
-
a first functional unit interconnected with a first set of registers, the first functional unit operable to exchange operand data with the first set of registers;
a second functional unit interconnected with a second set of registers, the second functional unit operable to exchange operand data with the second set of registers, wherein a write to said first and second sets of registers is performed over multiple pipeline cycles;
first cross-path circuitry connected to an input of the second functional unit and to a port on the first set of registers, the cross-path circuitry being operable to access the first set of registers for providing operand data from the first set of registers to the second functional unit;
wherein the cross-path circuitry is operable to stall both the first functional unit and the second functional unit in response to accessing a selected register in the first register set during a given CPU cycle if the selected register is being updated by the first functional unit;
wherein the first cross-path circuitry comprises a first stall register connected to receive operand data from the first functional unit in parallel with the first set of registers; and
wherein to minimize CPU cycle time length, the first cross-path circuitry is operable to provide a first operand from the first stall register during a given CPU cycle if the first operand was being stored into the selected register of the first register set during the immediately prior CPU cycle, such that the instruction execution pipeline is stalled for one CPU cycle when the first operand is provided from the first stall register. - View Dependent Claims (2, 3, 4, 5, 6, 7)
a first plurality of functional units interconnected with the first set of registers; and
wherein the first cross-path circuitry further comprises a first plurality of stall registers connected respectively to the first plurality of functional units to receive operand data from each of the first plurality of functional units in parallel with the first set of registers.
-
-
4. The CPU of claim 1, further comprising:
-
a first plurality of functional units interconnected with the first set of registers; and
wherein the cross-path circuitry further comprises multiplexer circuitry having an output connected to the stall register, with a plurality of inputs connected respectively to the first plurality of functional units.
-
-
5. The digital system of claim 1, wherein the CPU is a very long instruction word (VLIW) CPU, further comprising an instruction memory and a data memory.
-
6. The digital system of claim 5, wherein the CPU is a digital signal processor and wherein the first functional unit and the second functional unit are multiply-accumulate units.
-
7. The digital system of claim 1 being a cellular telephone, further comprising:
-
an integrated keyboard connected to the CPU via a keyboard adapter;
a display, connected to the CPU via a display adapter;
radio frequency (RF) circuitry connected to the CPU; and
an aerial connected to the RF circuitry.
-
-
8. A method of operating a CPU having an instruction execution pipeline with a plurality of functional units for executing instructions in a sequence of CPU cycles, the method comprising the steps of:
-
exchanging operands between a first functional unit and a first set of registers associated with the first functional unit;
exchanging operands between a second functional unit and a second set of registers associated with the second functional unit, wherein a write to said first and second sets of registers is performed over multiple pipeline cycles;
storing an operand in a stall register in parallel with a selected register of the first set of registers if the selected register was updated during an immediately prior CPU cycle; and
accessing the operand from the selected register of the first set of registers for use by the second functional unit during a given CPU cycle, wherein the step of accessing comprises the steps of:
determining if the selected register was updated during a CPU cycle immediately prior to the given CPU cycle;
stalling both the first functional unit and the second functional unit in response to accessing the selected register in the first register set during the given CPU cycle if the selected register is updated by the first functional unit during the given CPU cycle; and
obtaining the operand from the stall register if the selected register was updated during the immediately prior CPU cycle, wherein the instruction execution pipeline is stalled for one CPU cycle when the operand is obtained from the stall register to minimize CPU cycle time length. - View Dependent Claims (9)
obtaining the operand directly from the selected register if the selected register was not updated during the immediately prior CPU cycle.
-
-
10. A method of operating a CPU having an instruction execution pipeline with a plurality of functional units for executing instructions in a sequence of CPU cycles, the method comprising the steps of:
-
exchanging operands between a first functional unit and a first set of registers associated with the first functional unit;
exchanging operands between a second functional unit and a second set of registers associated with the second functional unit, wherein a write to said first and second sets of registers is performed over multiple pipeline cycles;
storing a plurality of operands in a plurality of stall registers in parallel with selected registers of the first set of registers; and
accessing an operand from one of the selected registers of the first set of registers for use by the second functional unit during a given CPU cycle, wherein the step of accessing comprises the steps of:
determining if the one of the selected registers was updated during a CPU cycle immediately prior to the given CPU cycle;
stalling both the first functional unit and the second functional unit in response to accessing the one of the selected registers in the first register set during the given CPU cycle if the one of the selected registers is updated by the first functional unit during the given CPU cycle; and
obtaining one of the plurality of operands from a stall register selected from the plurality of stall registers if the one of the selected registers was updated during the immediately prior CPU cycle, wherein the instruction execution pipeline is stalled for one CPU cycle when the operand is obtained from the stall register to minimize CPU cycle time length. - View Dependent Claims (11)
obtaining the operand directly from the one of the selected registers if the one of the selected registers was not updated during the immediately prior CPU cycle.
-
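The method steps recited in claims 8 through 11 can be sketched as a toy model. This is an illustration under stated assumptions, not the patented circuit: the class Cluster, its methods, and the unit names "U0" and "U1" are hypothetical, and the model updates the register file immediately even though the claims state that a register write takes multiple pipeline cycles.

```python
class Cluster:
    """Toy model of one register set with one stall register per functional unit.

    All names here are illustrative assumptions, not taken from the patent.
    """

    def __init__(self, num_regs, units):
        self.regs = [0] * num_regs
        self.stall_regs = {u: 0 for u in units}  # one stall register per unit
        self.last_write = {}                     # reg -> (cycle, unit) of latest update

    def write(self, unit, reg, value, cycle):
        # The result is stored in the register set and, in parallel,
        # in the producing unit's stall register.
        self.regs[reg] = value
        self.stall_regs[unit] = value
        self.last_write[reg] = (cycle, unit)

    def cross_read(self, reg, cycle):
        """Return (operand, stall_cycles) for a cross-path read issued on `cycle`."""
        written = self.last_write.get(reg)
        if written is not None and written[0] == cycle - 1:
            # Updated during the immediately prior cycle: stall for one cycle
            # and take the operand from the selected stall register.
            return self.stall_regs[written[1]], 1
        # Not updated during the immediately prior cycle: read the register directly.
        return self.regs[reg], 0


a = Cluster(num_regs=16, units=["U0", "U1"])
a.write("U1", reg=3, value=42, cycle=9)
print(a.cross_read(reg=3, cycle=10))  # (42, 1)
print(a.cross_read(reg=3, cycle=12))  # (42, 0)
```

The dictionary lookup on the producing unit stands in for selecting one stall register out of the plurality of stall registers; a hardware design would make that selection with multiplexer circuitry rather than a lookup.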
Specification