Massively parallel array processor
Abstract
Image processing for multimedia workstations is a computationally intensive task that requires special-purpose hardware to meet its high-speed requirements. One type of specialized hardware that meets these requirements is the mesh-connected computer. Such a computer becomes a massively parallel machine when an array of computers interconnected by a network is replicated in a machine. The nearest-neighbor mesh computer consists of an N×N square array of processor elements (PEs) in which each PE is connected only to its north, south, east and west PEs. The diagonal folded mesh array processor, called Oracle, allows the matrix transformation operation to be accomplished in one cycle by a simple interchange of the data elements in the dual symmetric processor elements. The use of Oracle for a parallel 2-D convolution mechanism for image processing and multimedia applications, and for a finite difference method of solving differential equations, is presented, concentrating on the computational aspects of the algorithm.
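The interchange between dual symmetric processor elements can be illustrated with a small software model. This is only a sketch: it assumes that the "matrix transformation" in the abstract is a transposition, and it models the array as a Python list of lists in which element (i, j) stands for the data held by one PE. The function names are hypothetical and do not come from the patent.

```python
def symmetric_pairs(n):
    """Index pairs (i, j) with i < j; PE(i, j) and PE(j, i) are assumed to sit together."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def interchange(grid):
    """Swap the data held by each symmetric pair; diagonal PEs keep their data.

    In hardware every pair could swap at the same time; this model simply
    visits the pairs one after another.
    """
    n = len(grid)
    for i, j in symmetric_pairs(n):
        grid[i][j], grid[j][i] = grid[j][i], grid[i][j]
    return grid


data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
result = interchange([row[:] for row in data])  # work on a copy
```

Because each swap stays inside one pair, no data has to travel across the mesh, which is consistent with the one-cycle figure given in the abstract.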
109 Citations
4 Claims
1. A processor array comprising:
a plurality of processing elements (PEs) arranged in a multidimensional array, the PEs each comprising a plurality of input/output (I/O) ports, each of the I/O ports for transmitting and receiving data;
the PEs each coupled to four adjacent ones of the plurality of PEs each at one of said plurality of I/O ports, and wherein each PE along an edge of the array is wraparound coupled to another PE of the plurality of PEs along a nonadjacent edge of the array, said four adjacent ones of the PEs are designated as north, south, east and west PEs;
wherein the PEs are coupled in a folded mesh such that pairs of the PEs share said plurality of input/output ports, a PE designated PEij shares the input/output ports with a PE designated PEji, where column and row subscripts i and j, respectively, are nonequal positive integers, PEs with i<j are designated top PEs, PEs with i>j are designated bottom PEs, PEs with i=j are designated diagonal PEs, and wherein the diagonal PEs are each coupled to two adjacent ones of the PEs each at one of said plurality of I/O ports, and each diagonal PE at a corner of the array is wraparound coupled to a nondiagonal PE at another corner of the array.
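The topology recited in claim 1 can be explored with a short model. The sketch below is one reading of the claim, not the patent's own description: it assumes 1-based subscripts with i as the column and j as the row, assumes that north means row j - 1 and east means column i + 1, and uses hypothetical helper names throughout.

```python
def wrap(k, n):
    """Map an index onto 1..n with wraparound at the array edges."""
    return (k - 1) % n + 1


def neighbors(i, j, n):
    """North, south, east and west neighbours of PE_ij on an n x n wraparound mesh."""
    return {
        "north": (i, wrap(j - 1, n)),
        "south": (i, wrap(j + 1, n)),
        "east": (wrap(i + 1, n), j),
        "west": (wrap(i - 1, n), j),
    }


def designation(i, j):
    """Classify a PE by the rule in the claim: i < j top, i > j bottom, i = j diagonal."""
    if i < j:
        return "top"
    if i > j:
        return "bottom"
    return "diagonal"


def physical_node(i, j):
    """PE_ij and PE_ji share I/O ports, so both map to the same folded node."""
    return (min(i, j), max(i, j))


def adjacent_nodes(i, j, n):
    """Distinct folded nodes that PE_ij is coupled to."""
    return {physical_node(*pe) for pe in neighbors(i, j, n).values()}
```

Under these assumptions, a diagonal PE ends up with two distinct adjacent nodes, because its north and west neighbours fold onto each other, as do its south and east neighbours. A corner diagonal PE such as PE11 in a 4×4 array wraps around to the node holding PE14 and PE41, which is nondiagonal and lies at another corner.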
a) transmit east/receive west mode for transmitting data to the east PE over one of said I/O ports while receiving data from the west PE over another of the I/O ports;
b) transmit north/receive south mode for transmitting data to the north PE over one of said I/O ports while receiving data from the south PE over another of the I/O ports;
c) transmit south/receive north mode for transmitting data to the south PE over one of the I/O ports while receiving data from the north PE over another of the I/O ports; and
d) transmit west/receive east mode for transmitting data to the west PE over one of the I/O ports while receiving data from the east PE over another of the I/O ports.
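The four modes a) through d) each move every data element one step across the array. A minimal sketch of that behavior follows, assuming that all PEs operate in the same mode at once, that the grid is indexed as grid[row][column] with row 0 at the north edge, and that edges wrap around as in claim 1. The mode names and the function are illustrative only.

```python
# For each mode, the (row, column) offset of the PE that data is received from.
RECEIVE_FROM = {
    "transmit_east": (0, -1),   # receive from the west PE
    "transmit_north": (1, 0),   # receive from the south PE
    "transmit_south": (-1, 0),  # receive from the north PE
    "transmit_west": (0, 1),    # receive from the east PE
}


def step(grid, mode):
    """Return the grid after one transfer in the given mode, with wraparound."""
    n = len(grid)
    dr, dc = RECEIVE_FROM[mode]
    return [[grid[(r + dr) % n][(c + dc) % n] for c in range(n)]
            for r in range(n)]


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
east = step(grid, "transmit_east")
north = step(grid, "transmit_north")
```

Each mode pairs a transmit direction with the opposite receive direction, so every PE sends on one port and receives on another in the same step, and a step in one mode is undone by a step in the opposite mode.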
3. The processor array according to claim 2 further comprising means of processing square matrix algorithms, whereby the array is interconnected using half as many PE interconnections as a two-dimensional square array.
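The "half as many" figure in claim 3 can be checked by counting links in a small model. The sketch below assumes a wraparound square array with n of at least 3 (so that no two links coincide), and assumes that folding merges PE(r, c) with PE(c, r) into one node so that links between the same two nodes share one interconnection. All names are hypothetical.

```python
def torus_links(n):
    """Nearest-neighbour links of an n x n wraparound array, as unordered PE pairs."""
    links = set()
    for r in range(n):
        for c in range(n):
            links.add(frozenset({(r, c), (r, (c + 1) % n)}))  # link to the east PE
            links.add(frozenset({(r, c), ((r + 1) % n, c)}))  # link to the south PE
    return links


def node(pe):
    """Folded node shared by PE(r, c) and PE(c, r)."""
    r, c = pe
    return (min(r, c), max(r, c))


def folded_links(n):
    """Links that remain distinct once symmetric PEs are merged into one node."""
    return {frozenset(node(pe) for pe in link) for link in torus_links(n)}


square = len(torus_links(4))
folded = len(folded_links(4))
```

In this model each east-west link of PE(r, c) folds onto the north-south link of PE(c, r), so the two can share one set of wires.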
4. The processor array according to claim 2, wherein said I/O ports each comprise a bit serial communication scheme.
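A bit-serial scheme sends a word one bit at a time over a single line instead of sending all bits in parallel. The claim does not state a word width or bit order, so the sketch below assumes an 8-bit word sent least significant bit first; both choices and the function names are illustrative only.

```python
def serialize(word, width=8):
    """Split a word into a list of bits, least significant bit first."""
    return [(word >> k) & 1 for k in range(width)]


def deserialize(bits):
    """Rebuild the word from bits received in the same order."""
    word = 0
    for k, bit in enumerate(bits):
        word |= bit << k
    return word


sent = serialize(0b1011, width=4)
received = deserialize(sent)
```

The trade-off in such a scheme is fewer wires per port in exchange for one transfer cycle per bit.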
Specification