Microprocessor optimized for algorithmic processing
Abstract
Provided is a microprocessor optimized for algorithmic processing that accelerates algorithm execution through a closely coupled set of parallel sub-processing elements. The device includes a primary processor, one or more subprocessors, and an interconnecting buss, which is preferably a crossbar buss. The primary processor is preferably a pipelined CPU with additional logic to support algorithm processing. The crossbar buss allows the data memory to function as the data memory in the CPU, and provides paths to configure and initialize the algorithm subprocessors and to retrieve results from the subprocessors. The subprocessors are processing elements that execute segments of code on blocks of data. Preferably, the subprocessors are reconfigurable to optimize performance for the algorithm being executed.
45 Citations
26 Claims
1. A processing unit comprising:
a primary processor having an arithmetic logic unit, a data memory cache, and one or more subprocessor control and status registers;
a crossbar buss associated with the primary processor that interconnects the arithmetic logic unit to the data memory cache, the crossbar buss having a plurality of ports and being capable of providing multiple connection paths between respective selected sets of ports at the same time; and
one or more subprocessors interconnected to the crossbar buss, each of the one or more subprocessors having a data memory store and an instruction memory store, the crossbar buss connected to the data memory store and to the instruction memory store.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
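The distinguishing property of the crossbar buss in claim 1 is that several connection paths can be open at the same time, provided they use different ports. The following is a minimal sketch of that behavior, not the patented circuit. It assumes each path joins exactly two ports (the claim speaks more generally of "sets of ports"), and the class name, method names, and port labels are hypothetical.

```python
class Crossbar:
    """Toy crossbar: any two free ports may be connected, and several
    disjoint pairs may be connected at the same time.
    Assumption: a path joins exactly two ports."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.paths = {}  # port -> peer port, stored in both directions

    def connect(self, a, b):
        """Open a path between ports a and b; return False if either is busy."""
        if a not in self.ports or b not in self.ports:
            raise KeyError(f"unknown port: {a!r} or {b!r}")
        if a == b:
            raise ValueError("a port cannot be connected to itself")
        if a in self.paths or b in self.paths:
            return False  # one endpoint already belongs to another path
        self.paths[a] = b
        self.paths[b] = a
        return True

    def disconnect(self, a):
        """Tear down the path that port a belongs to, if any."""
        b = self.paths.pop(a, None)
        if b is not None:
            self.paths.pop(b, None)

    def active_paths(self):
        """Return the currently open paths as a set of unordered pairs."""
        return {frozenset((a, b)) for a, b in self.paths.items()}


# Hypothetical port labels, chosen only for illustration.
xbar = Crossbar(["alu", "dcache", "sub0", "sub0_dmem", "sub0_imem"])
first = xbar.connect("alu", "dcache")        # primary processor path
second = xbar.connect("sub0", "sub0_dmem")   # independent path, open concurrently
third = xbar.connect("alu", "sub0_imem")     # refused while "alu" is in use
```

In this sketch the first two paths coexist because they share no port, while the third request is refused until the ALU port is released with `disconnect`.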
10. A processing unit comprising:
a primary processor having an arithmetic logic unit and data memory cache;
one or more subprocessors;
one or more memory data stores, each of the memory data stores associated with at least one of the one or more subprocessors;
a buss connecting the arithmetic logic unit of the primary processor to the data memory cache of the primary processor and to the one or more memory data stores.
Dependent claims: 11, 12, 13, 14, 15, 16
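Claim 10 puts the primary processor's data memory cache and the subprocessors' memory data stores on the same buss as the arithmetic logic unit. One way to picture this, offered only as an illustrative sketch, is a single address space in which each memory occupies its own range. The claim does not say the memories are address-mapped, so the decoding scheme, the base addresses, and every name below are assumptions.

```python
class SharedBus:
    """Toy bus that routes ALU loads and stores by address range.
    Assumption: each attached memory occupies a distinct address range."""

    def __init__(self):
        self.regions = []  # list of (base, limit, backing list)

    def attach(self, base, size):
        """Map a new zero-filled memory of `size` words starting at `base`."""
        limit = base + size
        for other_base, other_limit, _ in self.regions:
            if base < other_limit and other_base < limit:
                raise ValueError("address ranges overlap")
        memory = [0] * size
        self.regions.append((base, limit, memory))
        return memory

    def _decode(self, addr):
        """Find the memory that owns `addr` and the offset inside it."""
        for base, limit, memory in self.regions:
            if base <= addr < limit:
                return memory, addr - base
        raise IndexError(f"unmapped address {addr:#x}")

    def store(self, addr, value):
        memory, offset = self._decode(addr)
        memory[offset] = value

    def load(self, addr):
        memory, offset = self._decode(addr)
        return memory[offset]


bus = SharedBus()
dcache = bus.attach(0x0000, 256)     # stands in for the data memory cache
sub0_data = bus.attach(0x1000, 64)   # stands in for a subprocessor's data store
bus.store(0x0004, 7)                 # lands in the cache
bus.store(0x1002, 9)                 # lands in the subprocessor's store
total = bus.load(0x0004) + bus.load(0x1002)
```

Under these assumptions the primary processor reads and writes a subprocessor's store with the same load and store operations it uses for its own cache.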
17. A method of processing an algorithm on a multiple-processor system, the method comprising the steps:
connecting, with a crossbar buss, an arithmetic logic unit on a primary processor to a data cache on the primary processor;
connecting, with the crossbar buss, the arithmetic logic unit on the primary processor to a first data memory store associated with a first subprocessor;
loading data intended to be processed by the first subprocessor into the first data memory store;
connecting, with the crossbar buss, the arithmetic logic unit on the primary processor to a first instruction memory store associated with the first subprocessor;
loading instructions intended to be executed by the first subprocessor into the first instruction memory store;
connecting, with the crossbar buss, the arithmetic logic unit on the primary processor to a second data memory store associated with a second subprocessor;
loading data intended to be processed by the second subprocessor into the second data memory store;
connecting, with the crossbar buss, the arithmetic logic unit on the primary processor to a second instruction memory store associated with the second subprocessor;
loading instructions intended to be executed by the second subprocessor into the second instruction memory store.
Dependent claims: 18, 19, 20, 21, 22, 23
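The steps of claim 17 follow a fixed order: connect the arithmetic logic unit to the data cache, then for each subprocessor connect to its data memory store, load the data, connect to its instruction memory store, and load the instructions. The sketch below records that order in a log. It is a simplified model rather than the claimed hardware, and the class, function, port labels, and placeholder instruction names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Subprocessor:
    name: str
    data_store: list = field(default_factory=list)
    instr_store: list = field(default_factory=list)


def load_subprocessors(jobs):
    """Walk the claimed sequence and return a log of crossbar connections.

    `jobs` is a list of (subprocessor, data_block, instructions) tuples.
    """
    log = [("alu", "dcache")]  # the ALU is first connected to its own data cache
    for sub, data_block, instructions in jobs:
        log.append(("alu", f"{sub.name}.data"))    # connect to the data memory store
        sub.data_store = list(data_block)          # load the data to be processed
        log.append(("alu", f"{sub.name}.instr"))   # connect to the instruction memory store
        sub.instr_store = list(instructions)       # load the code to be executed
    return log


first = Subprocessor("sub0")
second = Subprocessor("sub1")
connections = load_subprocessors([
    (first, [1, 2, 3, 4], ["placeholder_op_a", "placeholder_op_b"]),
    (second, [5, 6, 7, 8], ["placeholder_op_c"]),
])
```

The returned log lists one cache connection followed by a data-store and an instruction-store connection per subprocessor, mirroring the order in which the claim recites its steps.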
24. A circuit module comprising:
a processor packaged in a chipscale package, the processor having an arithmetic logic unit, one or more subprocessors, a data memory cache, one or more data memory stores associated with the one or more subprocessors, and a crossbar buss associated with the processor and connecting the arithmetic logic unit to the data memory cache and the data memory stores;
flexible circuitry wrapped about the chipscale package to dispose a first portion of the flexible circuitry above the chipscale package and a second portion of the flexible circuitry below the chipscale package;
one or more semiconductor components mounted to the first portion of the flexible circuitry.
Dependent claims: 25, 26
Specification