Split embedded DRAM processor
First Claim
1. A split very long instruction word (VLIW) processing apparatus comprising:
- a VLIW central processor comprising:
a set of functional units which receive a plurality of instructions for execution in parallel;
a first VLIW program cache which holds a collection of very long instruction words, each very long instruction word comprising a set of instruction fields, each instruction field comprising an instruction to be executed by a functional unit;
a dispatch unit which scans bit fields within said instruction fields to decide how many instructions to dispatch in parallel and to which functional unit to direct each instruction;
one or more register files coupled to said functional units;
an external memory interface which carries instructions and data from an external source; and
an on-board data memory coupled to said functional units, said register files, and said external memory interface, wherein:
at least one of said functional units includes a branch processing unit which processes branch instructions;
said branch processing unit is coupled to a prefetch unit used to sequence said VLIW control words from said VLIW program cache or external memory; and
said branch processing unit is coupled to an external interface for transferring branch related information;
- a VLIW extension processor which cooperates with said VLIW central processor to jointly execute a single VLIW program, said VLIW extension processor comprising:
a set of at least one functional unit which receives one or more instructions for execution in a given clock cycle;
a second VLIW program cache which holds a collection of very long instruction words, whereby each very long instruction word comprises one or more instruction fields, wherein each instruction field comprises an instruction to be executed by a functional unit; and
a second dispatch unit which scans bit fields within said instruction fields to decide how many instructions to dispatch in parallel and to which functional unit to direct each instruction, wherein at least one of said functional units includes a second branch processing unit which processes branch instructions, said branch processing unit coupled to a prefetch unit which sequences VLIW control words from said second VLIW program cache, said branch processing unit coupled to a second external interface which transfers branch related information.
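The dispatch behavior recited in the claim (scanning bit fields to decide how many instructions issue in parallel and which functional unit receives each one) can be illustrated with a minimal sketch. The claim does not specify any encoding, so the field layout here is purely an assumption for illustration: a hypothetical "chain" bit in bit 0 marks that the next field issues in the same cycle, and hypothetical bits 1 to 3 select the functional unit. The names `make_field` and `dispatch` are likewise invented for this example.

```python
from typing import List, Tuple

# Hypothetical field layout (not taken from the patent):
#   bit 0     : chain bit, 1 means the next field issues in the same cycle
#   bits 1..3 : index of the target functional unit
#   bits 4+   : opaque instruction payload
CHAIN_BIT = 0x1
UNIT_SHIFT = 1
UNIT_MASK = 0x7


def make_field(unit: int, chain: bool, payload: int = 0) -> int:
    """Pack one instruction field using the assumed layout."""
    return (payload << 4) | ((unit & UNIT_MASK) << UNIT_SHIFT) | (CHAIN_BIT if chain else 0)


def dispatch(word: List[int]) -> List[List[Tuple[int, int]]]:
    """Split one very long instruction word into issue groups.

    Each group is a list of (functional_unit, field) pairs that would be
    dispatched in parallel in the same cycle.
    """
    groups: List[List[Tuple[int, int]]] = []
    current: List[Tuple[int, int]] = []
    for field in word:
        unit = (field >> UNIT_SHIFT) & UNIT_MASK  # which unit gets this field
        current.append((unit, field))
        if not field & CHAIN_BIT:                 # chain ends: close the group
            groups.append(current)
            current = []
    if current:                                   # word ended mid-chain
        groups.append(current)
    return groups


if __name__ == "__main__":
    word = [
        make_field(0, chain=True),
        make_field(2, chain=True),
        make_field(5, chain=False),
        make_field(1, chain=False),
    ]
    for cycle, group in enumerate(dispatch(word)):
        print(cycle, [unit for unit, _ in group])
```

In this sketch the first three fields form one parallel issue group and the fourth issues alone. A real dispatch unit would do this in hardware and would also have to resolve conflicts when two fields name the same functional unit, which the sketch does not model.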
2 Assignments
0 Petitions
Abstract
A processing architecture includes a first CPU core portion coupled to a second embedded dynamic random access memory (DRAM) portion. These architectural components jointly implement a single processor and instruction set. Advantageously, the embedded logic on the DRAM chip implements the memory-intensive processing tasks, thus reducing the amount of traffic that needs to be bussed back and forth between the CPU core and the embedded DRAM chips. The embedded DRAM logic monitors and manipulates the instruction stream into the CPU core. The architecture of the instruction set, data paths, addressing, control, caching, and interfaces is developed to allow the system to operate using a standard programming model. Specialized video and graphics processing systems are developed. Also, an extended very long instruction word (VLIW) architecture implemented as a primary VLIW processor coupled to an embedded DRAM VLIW extension processor efficiently deals with memory-intensive tasks. In different embodiments, standard software can be accelerated either with or without the express knowledge of the processor.
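The traffic reduction described in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the patented design: the classes `Bus` and `EmbeddedDram`, the 4-byte word size, and the 8-byte command size are all hypothetical values chosen for the example. It compares summing an array in the CPU core (every operand crosses the bus) with summing it in logic next to the DRAM array (only a command and one result cross the bus).

```python
from typing import List

WORD_BYTES = 4      # assumed word size
COMMAND_BYTES = 8   # assumed size of a request sent to the embedded logic


class Bus:
    """Counts bytes moved between the CPU core and the DRAM chip."""

    def __init__(self) -> None:
        self.bytes_moved = 0

    def transfer(self, nbytes: int) -> None:
        self.bytes_moved += nbytes


class EmbeddedDram:
    """DRAM array with a small amount of hypothetical embedded logic."""

    def __init__(self, data: List[int], bus: Bus) -> None:
        self.data = data
        self.bus = bus

    def read_word(self, index: int) -> int:
        # Conventional access: the operand itself crosses the bus.
        self.bus.transfer(WORD_BYTES)
        return self.data[index]

    def sum_local(self, start: int, count: int) -> int:
        # Embedded logic reduces in place; only command and result cross the bus.
        self.bus.transfer(COMMAND_BYTES)
        total = sum(self.data[start:start + count])
        self.bus.transfer(WORD_BYTES)
        return total


def cpu_side_sum(dram: EmbeddedDram, count: int) -> int:
    """The CPU core fetches every word and adds it itself."""
    return sum(dram.read_word(i) for i in range(count))


if __name__ == "__main__":
    values = list(range(1000))

    bus_a = Bus()
    total_a = cpu_side_sum(EmbeddedDram(values, bus_a), len(values))

    bus_b = Bus()
    total_b = EmbeddedDram(values, bus_b).sum_local(0, len(values))

    print(total_a == total_b, bus_a.bytes_moved, bus_b.bytes_moved)
```

Under these assumed sizes, both paths compute the same sum while the embedded path moves a small fixed number of bytes regardless of array length. The model ignores caching, burst transfers, and latency, so it shows only the direction of the effect, not its magnitude on real hardware.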
246 Citations
10 Claims
1. Independent claim, as recited under First Claim. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
Specification