Mechanism for Efficient Implementation of Software Pipelined Loops in VLIW Processors
Abstract
A system to implement a zero overhead software pipelined (SFP) loop includes a Very Long Instruction Word (VLIW) processor having an N number of execution slots. The VLIW processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size. A program memory receives a Program Memory address to fetch an instruction packet. The program memory is closely coupled with the instruction buffer size to implement the zero overhead software pipelined (SFP) loop. The size of the zero overhead software pipelined (SFP) loop can exceed the instruction buffer size. A CPU control register includes a block count and an iteration count. The block count is loaded into a block counter and counts the plurality of instructions executed in the SFP loop, and the iteration count is loaded into an iteration counter and counts a number of iterations of the SFP loop based on the block count.
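The interplay of the two counters described in the abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented hardware: the function name `run_sfp_loop`, the count-down direction of both counters, and the assumption that each instruction occupies one address are all hypothetical choices made for the example.

```python
def run_sfp_loop(start_address, block_count, iteration_count):
    """Return the addresses executed by a hypothetical two-counter hardware loop.

    Assumptions (not taken from the source): both counters count down,
    and consecutive instructions sit at consecutive addresses.
    """
    if block_count < 1 or iteration_count < 0:
        raise ValueError("block_count must be >= 1 and iteration_count >= 0")

    block_counter = block_count          # loaded from the block count register
    iteration_counter = iteration_count  # loaded from the iteration count register
    address = start_address
    executed = []

    while iteration_counter > 0:
        executed.append(address)
        block_counter -= 1
        if block_counter == 0:
            # End of the loop body: one iteration is complete, so the
            # iteration counter advances based on the block counter.
            iteration_counter -= 1
            block_counter = block_count
            address = start_address
        else:
            address += 1
    return executed


if __name__ == "__main__":
    print(run_sfp_loop(start_address=100, block_count=3, iteration_count=2))
```

Note that the trace contains only loop-body addresses: the wrap-around is driven by the counters rather than by compare or branch instructions inside the body, which is the sense in which such a loop has no per-iteration overhead.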
20 Claims
1. A system to implement a zero overhead software pipelined (SFP) loop, said system comprising:
- a Very Long Instruction Word (VLIW) processor having a N number of execution slots, said VLIW processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size;
- a program memory that receives a Program Memory address to fetch an instruction packet, wherein said program memory is closely coupled with said instruction buffer size to implement said zero overhead software pipelined (SFP) loop, wherein the size of said zero overhead software pipelined (SFP) loop to exceed said instruction buffer size;
- a CPU control registers comprising a block count and a iteration count, wherein said block count is loaded into a block counter and counts said plurality of instructions executed in the said zero overhead software pipelined (SFP) loop, and said iteration count is loaded into an iteration counter and counts a number of iterations of said zero overhead software pipelined (SFP) loop based on said block counter;
- a loop instruction fetch logic that tracks at least one of a instructions of said plurality of instructions; and
- a control logic that generates at least one of a control signals received by a instruction buffer, wherein said control signals are generated to execute said zero overhead software pipelined (SFP) loop.
Dependent claims: 2, 3, 4, 5
6. A method of implementing a short software pipelined (SFP) loop in a system, said system comprising:
- a processor having a N number of execution slots, said processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size;
- a program memory that receives a program memory address to fetch an instruction packet;
- a CPU control registers comprising a block count and a iteration count, wherein said block count is loaded into a block counter and counts said plurality of instructions executed in said short (SFP) loop, and said iteration count is loaded into a iteration counter and counts a number of iterations of said short SFP loop based on said block counter,
said method comprising;
- determining if an instruction of said short SFP loop is encountered at the execution packet boundaries;
- storing a start address on said instruction being encountered;
- storing an iteration count in said iteration counter and said block count in said block counter;
- computing a last instruction address; and
- determining if said block count is greater than a maximum short block size, wherein said maximum short block size is equal to minimum depth of instruction buffer minus size of one fetch packet, wherein said short SFP loop is executed when said block count being lesser than said maximum short block size.
Dependent claims: 7, 8, 9, 10, 11, 12, 13
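The size test in claim 6 comes down to one subtraction and one comparison, which the following sketch makes concrete. The function names, the example sizes, and the assumption that buffer depth, fetch packet size, and block count are all measured in the same unit (instructions) are hypothetical. The claims do not say what happens when the block count exactly equals the limit, so the sketch reports that case separately rather than guessing.

```python
def max_short_block_size(min_buffer_depth, fetch_packet_size):
    """Maximum short block size: minimum instruction buffer depth minus one fetch packet."""
    return min_buffer_depth - fetch_packet_size


def classify_sfp_loop(block_count, min_buffer_depth, fetch_packet_size):
    """Classify a loop as 'short' or 'long' by comparing its block count to the limit.

    Assumption: all three arguments are expressed in instructions.
    """
    limit = max_short_block_size(min_buffer_depth, fetch_packet_size)
    if block_count < limit:
        return "short"   # claim 6: block count lesser than the limit
    if block_count > limit:
        return "long"    # claim 14: block count greater than the limit
    return "unspecified"  # equality is not addressed by the claims


if __name__ == "__main__":
    # Hypothetical sizes: a 16-instruction buffer and 4-instruction fetch packets.
    for block_count in (8, 12, 20):
        print(block_count, classify_sfp_loop(block_count, 16, 4))
```

With the hypothetical sizes above the limit is 12 instructions, so an 8-instruction body is handled as a short loop and a 20-instruction body as a long loop.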
14. A method of implementing a long SFP loop in a system, said system comprising:
- a processor having a N number of execution slots, said processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size;
- a program memory that receives a program memory address to fetch an instruction packet, wherein said program memory is closely coupled with said instruction buffer size to implement said long SFP loop, wherein the size of said long SFP loop to exceed said instruction buffer size;
- a CPU control registers comprising a block count and a iteration count, wherein said block count is loaded into a block counter and counts said plurality of instructions executed in the a SFP loop, and said iteration count is loaded into the iteration counter and counts a number of iterations of said SFP loop based on said block counter,
said method comprising;
- determining if an instruction of said long SFP loop is encountered at the execution packet boundaries;
- storing a start address on said instruction being encountered;
- storing an iteration count and an block count;
- computing a last instruction address; and
- determining if said block count is greater than a maximum short block size, wherein said long SFP loop is executed when said block count being greater than said maximum short block size.
Dependent claims: 15, 16, 17, 18, 19, 20
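The setup steps recited in the method claims (store a start address, store the two counts, compute a last instruction address) can be illustrated as a small record. The claims give no formula for the last instruction address, so the one used here, start address plus block count minus one, is an assumption that holds only if each instruction occupies exactly one address. The names `LoopSetup` and `set_up_loop` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopSetup:
    """State captured when a loop instruction is encountered (illustrative only)."""
    start_address: int
    last_address: int
    block_count: int
    iteration_count: int


def set_up_loop(start_address, block_count, iteration_count):
    """Record the loop parameters and derive the last instruction address.

    Assumption: instructions are one address apart, so the body spans
    start_address .. start_address + block_count - 1.
    """
    if block_count < 1:
        raise ValueError("block_count must be at least 1")
    last_address = start_address + block_count - 1
    return LoopSetup(start_address, last_address, block_count, iteration_count)


if __name__ == "__main__":
    setup = set_up_loop(start_address=0x100, block_count=20, iteration_count=5)
    print(hex(setup.start_address), hex(setup.last_address))
```

Knowing the last address up front lets fetch logic recognize the end of the body by address comparison; a real design with variable-length or packed instructions would need a different calculation.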
Specification