SIMD/MIMD inter-processor communication
Abstract
A parallel array processor for massively parallel applications is formed with low-power CMOS with DRAM processing while incorporating processing elements on a single chip. Eight processors on a single chip have their own associated processing element, significant memory, and I/O and are interconnected with a hypercube-based, but modified, topology. These nodes are then interconnected by a hypercube, modified hypercube, ring, or ring-within-ring network topology. Conventional microprocessor MMPs consume pins and time going to memory. The new architecture merges processor and memory with multiple PMEs (eight 16-bit processors with 32K and I/O) in DRAM, has no memory access delays, and uses all the pins for networking. The chip can be a single node of a fine-grained parallel processor. Each chip will have eight 16-bit processors, each processor providing 5 MIPS performance. I/O has three internal ports and one external port shared by the plural processors on the chip. The scalable chip PME has internal and external connections for broadcast and asynchronous SIMD, MIMD and SIMIMD (SIMD/MIMD) with dynamic switching of modes. The chip can be used in systems which employ 32, 64 or 128,000 processors. Local and global memory functions can all be provided by the chips themselves, and the system can connect to and support other global memories and DASD. The chip can be used as a microprocessor accelerator, in personal computer applications, as a vision or avionics computer system, or as a workstation or supercomputer.
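The hypercube addressing that the abstract refers to can be illustrated with a minimal sketch. It assumes a plain binary hypercube, not the modified topology of the patent, and it assumes that the eight on-chip processors can be addressed as a 3-dimensional cube. The function name `hypercube_neighbors` is hypothetical.

```python
def hypercube_neighbors(node, dims):
    """Return the addresses adjacent to `node` in a plain `dims`-dimensional hypercube.

    Assumption: an unmodified binary hypercube; the patent describes a
    modified variant whose details are not given in this excerpt.
    """
    if not 0 <= node < (1 << dims):
        raise ValueError("node address out of range")
    # Flipping bit k of the address crosses dimension k of the cube.
    return [node ^ (1 << k) for k in range(dims)]


# Eight processors addressed as a 3-cube (an assumption for illustration).
neighbors_of_5 = hypercube_neighbors(5, 3)

# Aggregate per-chip rate from the abstract's figures: eight processors at 5 MIPS each.
chip_mips = 8 * 5
```

Each node has exactly `dims` neighbors, which is why link count per node grows only logarithmically with system size in an unmodified hypercube.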
205 Citations
18 Claims
1. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
further comprising an interconnection network for interconnecting said plurality of processing elements, wherein interprocessor communication includes replication.
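The dispatch rule recited in the claim above can be read as a timing constraint: the control processor releases instruction i+1 once every processing element has accessed instruction i, so a fast element may start instruction i+1 while slower elements are still executing instruction i. The following toy model is one possible reading of that rule, not the patented implementation. The function name, the `durations` table, and the unit-free time values are all assumptions made for illustration.

```python
def simulate_dispatch(durations):
    """Toy timing model of the claimed dispatch rule.

    durations[p][i] is the (hypothetical) time processing element p spends on
    the instruction stream triggered by single instruction i.
    Returns (dispatch, access, finish) time tables.
    """
    n_pe = len(durations)
    n_instr = len(durations[0])
    dispatch = [0] * n_instr
    access = [[0] * n_instr for _ in range(n_pe)]
    finish = [[0] * n_instr for _ in range(n_pe)]
    for i in range(n_instr):
        if i > 0:
            # Instruction i is released once ALL elements have accessed i-1.
            dispatch[i] = max(access[p][i - 1] for p in range(n_pe))
        for p in range(n_pe):
            prev_done = finish[p][i - 1] if i > 0 else 0
            # An element reads instruction i when it is available and the
            # element has finished its own previous work.
            access[p][i] = max(dispatch[i], prev_done)
            finish[p][i] = access[p][i] + durations[p][i]
    return dispatch, access, finish


# Element 0 is fast on instruction 0, element 1 is slow.
dispatch, access, finish = simulate_dispatch([[1, 2, 1], [5, 1, 1]])
```

In this run element 0 starts instruction 1 at time 1, while element 1 is still executing instruction 0 until time 5. Instruction 2 is not released until time 5, when element 1 finally accesses instruction 1.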

3. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein is provided reduction used with interprocessor communication wherein reduction involves combining a first number of data values into a second number of resultant data values, wherein said second number is less than said first number.
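Reduction as defined in the claim above (a first number of values combined into a smaller second number of results) can be sketched as follows. The grouping of four elements per node, the function name `reduce_groups`, and the sample data are assumptions for illustration only.

```python
from functools import reduce
import operator


def reduce_groups(values, group_size, op=operator.add):
    """Combine len(values) inputs into len(values) // group_size results.

    Each consecutive group of `group_size` values is folded with `op`,
    so the number of results is strictly smaller than the number of inputs.
    """
    if group_size <= 1 or len(values) % group_size != 0:
        raise ValueError("group_size must be > 1 and divide len(values)")
    return [
        reduce(op, values[i:i + group_size])
        for i in range(0, len(values), group_size)
    ]


per_element = [3, 1, 4, 1, 5, 9, 2, 6]      # one value per processing element
per_node = reduce_groups(per_element, 4)     # one sum per group of four
overall_max = reduce_groups(per_element, 8, op=max)
```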

5. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein interprocessor communication includes permutation which rearranges the location of a number of data elements relative to said processing elements and preserves the number of data elements.
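Permutation in the sense of the claim above moves data elements among processing elements without changing how many there are. A minimal sketch, assuming one value per element and a hypothetical destination map `dest` in which `dest[src]` is the element that receives the value held by `src`:

```python
def permute(values, dest):
    """Move values[src] to position dest[src]; the element count is preserved."""
    if sorted(dest) != list(range(len(values))):
        raise ValueError("dest must be a permutation of the element indices")
    out = [None] * len(values)
    for src, d in enumerate(dest):
        out[d] = values[src]
    return out


data = ["a", "b", "c", "d"]
# Cyclic shift by one position: element i sends to element (i + 1) mod 4.
shifted = permute(data, [(i + 1) % 4 for i in range(4)])
```

The validity check on `dest` is what separates a permutation from replication (several destinations for one source) or reduction (several sources for one destination).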

6. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein is provided global communication functions which include broadcasting data or instruction from a host to a node, said node comprising at least one processing element, reducing data from a node to a host, reducing data to all the nodes, performing scans across a node, performing segmented parallel prefix operations, concatenation of elements into a buffer on all nodes or concatenation of elements from the nodes to a buffer on the host.
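Of the global functions listed above, the segmented parallel prefix is the least self-explanatory. The sketch below is a sequential reference for its result only: a running combination that restarts at each segment boundary. It does not model how the patent distributes the work across nodes. The flag-list representation of segments and the name `segmented_scan` are assumptions.

```python
import operator


def segmented_scan(values, segment_starts, op=operator.add):
    """Inclusive prefix combination that restarts where segment_starts[i] is True."""
    if len(values) != len(segment_starts):
        raise ValueError("values and segment_starts must have equal length")
    out = []
    for v, starts_segment in zip(values, segment_starts):
        if starts_segment or not out:
            out.append(v)                 # first element of a segment
        else:
            out.append(op(out[-1], v))    # continue the running result
    return out


flags = [True, False, False, True, False, False]
sums = segmented_scan([1, 2, 3, 4, 5, 6], flags)
```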

7. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
further comprising an interconnection network for interconnecting said plurality of processing elements, wherein interconnection network communication includes reduction and parallel prefix operations for performing summation, finding a maximum or minimum value, or performing AND, OR, or XOR.
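The parallel prefix operations named above can be illustrated with the well-known Hillis-Steele doubling scheme, in which every position combines with a partner at distance 1, 2, 4, and so on. This is a generic technique chosen for illustration; the excerpt does not say which algorithm the patent uses. The function name `parallel_prefix` is hypothetical, and `op` is assumed to be associative.

```python
import operator


def parallel_prefix(values, op):
    """Inclusive scan by distance doubling (Hillis-Steele).

    Each pass of the while loop corresponds to one step in which all
    positions could update simultaneously; there are about log2(n) passes.
    """
    cur = list(values)
    step = 1
    while step < len(cur):
        # Position i combines the value `step` places to its left with its own.
        cur = [
            cur[i] if i < step else op(cur[i - step], cur[i])
            for i in range(len(cur))
        ]
        step *= 2
    return cur


data = [3, 1, 4, 1, 5]
prefix_sum = parallel_prefix(data, operator.add)
prefix_max = parallel_prefix(data, max)
prefix_xor = parallel_prefix(data, operator.xor)
```

The last entry of each result is the corresponding reduction (total sum, overall maximum, overall XOR), which is why reduction and parallel prefix are often provided together.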

8. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
including reduce and parallel prefix operations for performing summation, finding a maximum or minimum value, or performing bitwise AND, OR or XOR operations, and wherein these operations are performed at a node level or at an individual processing element level, wherein a node includes a plurality of interconnected processing elements forming a fundamental topological unit of said array processing system.

9. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein global operations are selectively performed, including global reduction, integer summation, finding an integer max, logical OR, logical exclusive OR, matrix operations, and floating point operations, said global operations are executed by a node or by an individual processing element.

10. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
further comprising an interconnection network for interconnecting said plurality of processing elements, wherein said control processor and interconnection network cooperate to selectively provide functions including synchronizing processing elements or nodes, combining a value from every processing element to produce a single result, and computing parallel prefix operations.

11. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
further comprising an interconnection network for interconnecting said plurality of processing elements and said control processor, wherein said interconnection network includes a separate control network and data network.

12. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein replication is performed within a software defined array of processing elements or nodes.
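The excerpt does not define replication. One common reading is copying a value held by one element to several others within a logical array. The sketch below assumes that reading, and it assumes a software-defined 2D array laid over a linear list of processing elements in row-major order. The function name and parameters are hypothetical.

```python
def replicate_along_rows(pe_values, rows, cols, src_col=0):
    """Copy each row's value at column `src_col` to every element of that row.

    Assumption: logical element (r, c) lives at linear index r * cols + c.
    """
    if len(pe_values) != rows * cols:
        raise ValueError("pe_values must hold rows * cols entries")
    out = list(pe_values)
    for r in range(rows):
        src = pe_values[r * cols + src_col]
        for c in range(cols):
            out[r * cols + c] = src
    return out


# A 2 x 3 logical array; only column 0 holds data before replication.
replicated = replicate_along_rows([1, 0, 0, 2, 0, 0], rows=2, cols=3)
```

Because the array shape is only a parameter, the same physical elements can be treated as a different logical array in a later call.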

13. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein a broadcast to an array includes broadcasting to all processing elements defined for an array process, and includes spreading and the opposition of spread, and wherein the processing elements can be logically partitioned into clusters for broadcast operations.

14. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein each processing element has means for broadcasting out from a node or from another processing element within the node.

15. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein each processing element has means for broadcasting out from a node or from another processing element within the node with levels of broadcast operations which have identical functions provided, with supervisory functions reserved to the control processor.

16. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode; a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction; wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein each processing element has means for broadcasting out from a node or from another processing element within the node.
17. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode;
a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction;
wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein each processing element has means for broadcasting out from a node and means for receiving a broadcast during a process of executing MIMD internal instructions.
18. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and automatically executes an independent instruction stream on an independent multiple data stream, thereby providing for an MIMD mode;
a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple independent instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction;
wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction; and
wherein the control processor broadcasts blocks of instructions to a group of processing elements and replication, spreading, reduction and permutation functions are selectively executed.
Specification