Processor array and parallel data processing methods
First Claim
1. A processor array comprising a plurality of interconnected processor elements, a plurality of instruction buses connected to each of the processor elements, at least one data bus connected to each of the processor elements and an instruction selection switch associated with each of the processor elements, each processor element connected to execute instructions from a one of the plurality of instruction buses selected by its instruction selection switch, wherein a ratio of the number of processor elements in the processor array to the number of instruction buses in the processor array is greater than 100:1.
Abstract
An array of processor elements has multiple instruction streams and multiple data streams broadcast to all of the processor elements. The processor elements are each connected to multiple neighbouring processor elements within a cruciate neighbourhood. The architecture is suitable for use in fine-grained applications. The array may have a processor element for each pixel of an image. The array is preferably provided on a single integrated circuit having 10,000 or more processor elements.
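The two structural ideas in the abstract, a cruciate neighbourhood and per-element selection among broadcast instruction streams, can be pictured with a minimal software sketch. This is an illustration only, not the patented circuit: the function names, the default arm length of 2, and the choice to drop neighbours that fall outside the grid are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def cruciate_neighbourhood(row: int, col: int, rows: int, cols: int,
                           arm_length: int = 2) -> List[Tuple[int, int]]:
    """Return grid coordinates on the four arms radiating from (row, col).

    arm_length is a hypothetical parameter; positions beyond the array
    edge are simply omitted (an assumption made for this sketch).
    """
    neighbours = []
    for step in range(1, arm_length + 1):
        # Up, down, left, right at distance `step`: same row or same column only.
        for dr, dc in ((-step, 0), (step, 0), (0, -step), (0, step)):
            r, c = row + dr, col + dc
            if 0 <= r < rows and 0 <= c < cols:
                neighbours.append((r, c))
    return neighbours


def select_instructions(instruction_buses: Sequence[str],
                        switch_settings: Sequence[Sequence[int]]) -> List[List[str]]:
    """For each processor element, pick the instruction currently on the
    bus chosen by that element's (modelled) instruction selection switch."""
    return [[instruction_buses[sel] for sel in row] for row in switch_settings]


if __name__ == "__main__":
    print(cruciate_neighbourhood(2, 2, 5, 5))
    print(select_instructions(["ADD", "SUB"], [[0, 1], [1, 0]]))
```

Every instruction stream reaches every element in this model; only the per-element switch setting decides which stream an element executes in a given cycle.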
189 Citations
34 Claims
- 1. A processor array comprising a plurality of interconnected processor elements, a plurality of instruction buses connected to each of the processor elements, at least one data bus connected to each of the processor elements and an instruction selection switch associated with each of the processor elements, each processor element connected to execute instructions from a one of the plurality of instruction buses selected by its instruction selection switch, wherein a ratio of the number of processor elements in the processor array to the number of instruction buses in the processor array is greater than 100:1.
- 14. A processor array comprising a plurality of interconnected processor elements, a plurality of instruction buses connected to each of the processor elements, at least one data bus connected to each of the processor elements and an instruction selection switch associated with each of the processor elements, each processor element connected to execute instructions from a one of the plurality of instruction buses selected by its instruction selection switch, wherein each of the processor elements is connected to send data to other processor elements in a cruciate neighbourhood, each of the processor elements comprises a local register and the processor element is connected to broadcast data in the local register simultaneously to other processor elements in the cruciate neighbourhood.
- 16. A processor array comprising a plurality of interconnected processor elements, a plurality of instruction buses connected to each of the processor elements, at least one data bus connected to each of the processor elements and an instruction selection switch associated with each of the processor elements, each processor element connected to execute instructions from a one of the plurality of instruction buses selected by its instruction selection switch, wherein each of the processor elements is connected to send data to other processor elements in a cruciate neighbourhood, and each of the processor elements comprises a register and selection logic, the selection logic configured to receive data from a particular one of the other processor elements in the cruciate neighbourhood as determined by the value in the register.
- 17. A processor array comprising a plurality of interconnected processor elements, a plurality of instruction buses connected to each of the processor elements, at least one data bus connected to each of the processor elements and an instruction selection switch associated with each of the processor elements, each processor element connected to execute instructions from a one of the plurality of instruction buses selected by its instruction selection switch, wherein each of the processor elements is connected to send data to other processor elements in a cruciate neighbourhood and wherein the cruciate neighbourhoods each comprise four arms radiating from a processor element and each arm comprises at least two processor elements.
- 18. A processor array comprising a plurality of interconnected processor elements, a plurality of instruction buses connected to each of the processor elements, at least one data bus connected to each of the processor elements and an instruction selection switch associated with each of the processor elements, each processor element connected to execute instructions from a one of the plurality of instruction buses selected by its instruction selection switch, wherein the processor elements are arranged in a plurality of rows and a plurality of columns and each of the processor elements has direct data connections only to neighboring processor elements in the same row or column as the processor element and each processor element has direct data connections to a plurality of neighbouring processor elements on each side of the processor element in the same row as the processor element and a plurality of neighbouring processor elements on each side of the processor element in the same column as the processor element.
- 19. A processor array comprising a plurality of interconnected processor elements, each of the processor elements logically arranged at an intersection of a row and a column in a grid comprising a plurality of rows and a plurality of columns, each of the processor elements connected to transmit data to other processor elements in a neighborhood comprising a plurality of neighbouring processor elements, the plurality of neighbouring processor elements comprising a number N>1 of processor elements in the column on either side of the processor element and a number M>1 of processor elements in the row on either side of the processor element, wherein one or more instruction buses are connected to deliver a plurality of instruction streams from an instruction source to each of the processor elements, one or more data buses are connected to deliver at least one data stream from a data source to each of the processor elements and one or more clock buses are connected to deliver a clock signal from a clock to each of the processor elements, wherein, for each of the processor elements, propagation times to the processor element from the data source on the one or more data buses, from the instruction source on the one or more instruction buses and from the clock on the one or more clock buses are substantially the same.
- 20. A processor array comprising a plurality of interconnected processor elements, each of the processor elements logically arranged at an intersection of a row and a column in a grid comprising a plurality of rows and a plurality of columns, each of the processor elements connected to transmit data to other processor elements in a neighborhood comprising a plurality of neighbouring processor elements, the plurality of neighbouring processor elements comprising a number N>1 of processor elements in the column on either side of the processor element and a number M>1 of processor elements in the row on either side of the processor element, wherein each of the processor elements comprises an i/o register and the array comprises a set of read registers, the read registers comprising one read register for each of the columns, a first i/o data line connecting each i/o register to a corresponding read register; and row select logic connected to select all of the processor elements in one of the rows, wherein, when one of the rows is selected, data from i/o registers of processor elements in the selected row is written to the corresponding read registers by way of the first i/o data lines. - View Dependent Claims (21, 22, 23, 24)
- 25. A processor array comprising a plurality of interconnected processor elements, each of the processor elements logically arranged at an intersection of a row and a column in a grid comprising a plurality of rows and a plurality of columns, each of the processor elements connected to transmit data to other processor elements in a neighborhood comprising a plurality of neighbouring processor elements, the plurality of neighbouring processor elements comprising a number N>1 of processor elements in the column on either side of the processor element and a number M>1 of processor elements in the row on either side of the processor element, wherein each of the processor elements comprises means for simultaneously broadcasting the contents of a local register to all other processor elements in the neighbourhood. - View Dependent Claims (26, 27, 28, 29, 30)
- 31. A processor array comprising a plurality of interconnected processor elements, each of the processor elements logically arranged at an intersection of a row and a column in a grid comprising a plurality of rows and a plurality of columns, each of the processor elements connected to transmit data to other processor elements in a neighborhood comprising a plurality of neighbouring processor elements, the plurality of neighbouring processor elements comprising a number N>1 of processor elements in the column on either side of the processor element and a number M>1 of processor elements in the row on either side of the processor element comprising a plurality of read registers, one read register corresponding to each of the columns, means for selecting one of the rows and means for simultaneously transferring data from each one of the processor elements in a selected row into a corresponding read register.
- 32. A method for operating a processor array comprising a plurality of processor elements, each of the processor elements comprising a plurality of registers, each of the plurality of registers in each of the processor elements comprising registers which require dynamic refreshing at a refresh frequency, the method comprising:
a) providing one or more streams of instructions to each of the processor elements for execution by the processor elements; and
b) periodically inserting into the one or more instruction streams register refresh instructions, the register refresh instructions causing the processor elements to rewrite data values in the registers.
- 33. A method for operating a processor array having a plurality of interconnected processor elements, the method comprising:
a) providing an array of processor elements, each of the processor elements logically arranged at an intersection of a row and a column in a grid comprising a plurality of rows and a plurality of columns, each of the processor elements connected to transmit data to a plurality of neighbouring processor elements, the plurality of neighbouring processor elements comprising a number N, with N>1, of processor elements in the column on each side of the processor element and a number M, with M>1, of processor elements in the row on each side of the processor element;
b) determining when one or more of the processor elements is defective; and
c) for each defective one of the processor elements, ignoring either the row or column containing the defective one of the processor elements.
- 34. A method for implementing a table lookup operation in a processor array, the method comprising:
a) providing a processor array comprising a plurality of processor elements;
b) providing multiple data streams to each processor element;
c) providing a lookup table comprising several parts, each part corresponding to a range of values, each of the parts comprising one or more table values;
d) simultaneously transmitting the several parts of the lookup table on the multiple data streams;
e) at each processor element selecting a data stream to access as a function of a data value in the processor element; and
f) at each processor element retrieving from the selected data stream a table value corresponding to the data value of the processor element.
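The table lookup method of claim 34 can be illustrated with a small software model. It is a sketch under stated assumptions rather than the claimed implementation: the function name `broadcast_table_lookup` is hypothetical, the table is assumed to split into equal contiguous parts, one table value per stream is assumed to be transmitted per cycle, and every data value is assumed to be a valid table index.

```python
from typing import List, Optional, Sequence


def broadcast_table_lookup(table: Sequence[int], num_streams: int,
                           pe_values: Sequence[int]) -> List[Optional[int]]:
    """Model steps (c) to (f): split the table, broadcast the parts in
    parallel, and let each processor element latch its own entry."""
    part_len = len(table) // num_streams
    if part_len * num_streams != len(table):
        raise ValueError("table length must be a multiple of num_streams")

    # (c) Each part covers one contiguous range of data values (assumption).
    parts = [table[p * part_len:(p + 1) * part_len] for p in range(num_streams)]

    # (e) Each element selects a stream as a function of its data value.
    selected = [value // part_len for value in pe_values]

    results: List[Optional[int]] = [None] * len(pe_values)
    # (d) All parts are transmitted simultaneously, one entry per cycle.
    for cycle in range(part_len):
        on_streams = [part[cycle] for part in parts]
        for i, value in enumerate(pe_values):
            # (f) Latch the entry when the offset within the part matches.
            if value % part_len == cycle:
                results[i] = on_streams[selected[i]]
    return results


if __name__ == "__main__":
    squares = [x * x for x in range(8)]
    print(broadcast_table_lookup(squares, 4, [0, 3, 5, 7]))
```

In this model, splitting the table across several streams shortens the broadcast from one cycle per table entry to one cycle per entry of a single part, which is the apparent benefit of transmitting the parts simultaneously.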
Specification