SIMD/MIMD array processor with vector processing
Abstract
A parallel array processor for massively parallel applications is formed with low-power CMOS with DRAM processing while incorporating processing elements on a single chip. Eight processors on a single chip each have their own associated processing element, significant memory, and I/O, and are interconnected with a hypercube-based, but modified, topology. These nodes are then interconnected by a hypercube, modified hypercube, ring, or ring-within-ring network topology. Conventional microprocessor MMPs consume pins and time going to memory. The new architecture merges processor and memory with multiple PMEs (eight 16-bit processors with 32K and I/O) in DRAM, has no memory access delays, and uses all the pins for networking. The chip can be a single node of a fine-grained parallel processor. Each chip has eight 16-bit processors, each providing 5 MIPS of performance. I/O has three internal ports and one external port shared by the plural processors on the chip. Significant software flexibility is provided to enable quick implementation of existing programs written in common languages. The scalable chip PME has internal and external connections for broadcast and asynchronous SIMD, MIMD and SIMIMD (SIMD/MIMD) operation with dynamic switching of modes. The chip can be used in systems that employ 32, 64 or 128,000 processors. Local and global memory functions can all be provided by the chips themselves, and the system can connect to and support other global memories and DASD.
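As a rough illustration of the figures in the abstract, the sketch below computes the neighbors of a node in a plain binary hypercube and the aggregate throughput of one chip. It is an assumption-laden sketch: the abstract describes a "modified" hypercube whose modification is not specified here, so only the unmodified topology is shown, and the names `hypercube_neighbors` and `chip_mips` are hypothetical.

```python
PROCESSORS_PER_CHIP = 8   # from the abstract: eight 16-bit processors per chip
MIPS_PER_PROCESSOR = 5    # from the abstract: 5 MIPS per processor


def hypercube_neighbors(node: int, dims: int) -> list[int]:
    """Return the nodes adjacent to `node` in a plain binary hypercube.

    Assumption: unmodified hypercube; two nodes are linked when their
    binary ids differ in exactly one bit.
    """
    if not 0 <= node < (1 << dims):
        raise ValueError("node id out of range for this dimension")
    return [node ^ (1 << bit) for bit in range(dims)]


def chip_mips() -> int:
    """Aggregate throughput of one chip, ignoring any contention."""
    return PROCESSORS_PER_CHIP * MIPS_PER_PROCESSOR


if __name__ == "__main__":
    # Eight nodes form a 3-dimensional hypercube (2**3 == 8).
    for node in range(PROCESSORS_PER_CHIP):
        print(node, hypercube_neighbors(node, 3))
    print("MIPS per chip:", chip_mips())
```

With eight nodes, each node has three neighbors, and the per-chip total works out to 8 x 5 = 40 MIPS under these assumptions.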
10 Claims
1. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and autonomously executes an independent instruction stream on an independent multiple data stream, thereby providing for a MIMD mode; and
a control processor that dispatches a series of single instructions to the plurality of processing elements, each of the single instructions operative to command the respective processing elements to execute respective multiple instruction streams on multiple independent data streams located one per processing element, each successive instruction of said single instructions being dispatched by said control processor in response to all of said processing elements accessing an instruction immediately preceding said each successive instruction;
wherein a first one of said processing elements which has completed execution of a multiple instruction stream in response to an instruction of said single instructions accesses and begins executing an immediately subsequent instruction of said single instructions after all other processing elements have read said instruction and before all other processing elements complete execution of respective multiple instruction streams in response to said instruction, whereby the processing elements execute the series of single instructions independently of a fixed time relationship between or among the processing elements with respect to accessing a subsequent single instruction before all processing elements have completed executing multiple instructions in response to a single instruction immediately precedent to said subsequent single instruction.
Dependent claims: 2, 3, 4, 5, 6
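The dispatch rule recited in claim 1 can be illustrated with a small discrete-time simulation. This is only a sketch under stated assumptions, not the patented implementation: time advances in integer ticks, the control processor releases the next single instruction one tick after every processing element has accessed the current one, and the function name `simulate` and the `durations` table are hypothetical.

```python
def simulate(durations):
    """Simulate claim-1 style dispatch.

    durations[pe][k] is the number of ticks processing element `pe` spends
    on its local instruction stream in response to single instruction k.
    Returns (start, finish) tables indexed as [pe][k].
    """
    n_pes = len(durations)
    n_cmds = len(durations[0])
    start = [[None] * n_cmds for _ in range(n_pes)]
    finish = [[None] * n_cmds for _ in range(n_pes)]
    next_cmd = [0] * n_pes     # next single instruction each PE will access
    busy_until = [0] * n_pes   # tick at which each PE becomes idle
    dispatched = 0             # newest single instruction made available
    tick = 0
    while any(f is None for row in finish for f in row):
        # Control processor: release the next instruction only after ALL
        # PEs have accessed (started) the current one.
        if dispatched + 1 < n_cmds and all(
            start[pe][dispatched] is not None for pe in range(n_pes)
        ):
            dispatched += 1
        for pe in range(n_pes):
            k = next_cmd[pe]
            # An idle PE proceeds as soon as its next instruction is
            # available; it does not wait for other PEs to finish.
            if k < n_cmds and busy_until[pe] <= tick and k <= dispatched:
                start[pe][k] = tick
                finish[pe][k] = tick + durations[pe][k]
                busy_until[pe] = finish[pe][k]
                next_cmd[pe] = k + 1
        tick += 1
    return start, finish


if __name__ == "__main__":
    # PE 0 is fast; PE 1 is slow on the first single instruction.
    start, finish = simulate([[1, 1, 1], [5, 1, 1]])
    print("start :", start)
    print("finish:", finish)
```

In this run, the fast element starts the second instruction at tick 1, well before the slow element finishes the first at tick 5, yet it cannot start the third instruction until the slow element has accessed the second. That mirrors the claim's absence of a fixed time relationship among elements while keeping dispatch gated on access.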
7. An array processing system, comprising:
a plurality of processing elements interconnected as an array processor, each having a processor and a memory coupled to said processor, and wherein each of the processing elements selectively and autonomously executes an independent instruction stream on an independent multiple data stream, thereby providing for a MIMD mode;
a control processor that selectively dispatches a single instruction stream to the plurality of processing elements to command the processing elements to execute multiple independent instruction streams stored in respective processing elements on multiple independent data streams located one per processing element; and
wherein said processing elements selectively execute a vector command broadcast by said control processor to provide for a vector operation on one or more vectors each having elements distributed in at least one of the processing elements, said vector command selected from a plurality of vector commands which include at least one command which results in a vector operation requiring coordination or direct communication among at least two processing elements to execute said vector operation.
Dependent claims: 8, 9, 10
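The distinction claim 7 draws between vector commands can be sketched as follows. This is an illustrative assumption, not the claimed command set: vector elements are dealt round-robin to the processing elements, an element-wise add needs no communication, and a dot product needs the elements to combine partial sums (here by a pairwise tree reduction). All function names are hypothetical.

```python
def distribute(vector, n_pes):
    """Deal vector elements round-robin into per-PE local memories."""
    return [vector[pe::n_pes] for pe in range(n_pes)]


def vector_add(a_parts, b_parts):
    """Element-wise add: every PE works only on its own local slices."""
    return [[x + y for x, y in zip(a, b)] for a, b in zip(a_parts, b_parts)]


def dot_product(a_parts, b_parts):
    """Dot product: local partial sums, then inter-PE combination."""
    # Step 1: each PE computes a partial sum from its local elements.
    partial = [sum(x * y for x, y in zip(a, b))
               for a, b in zip(a_parts, b_parts)]
    # Step 2: pairwise tree reduction; each addition models one PE
    # receiving a partial sum from another PE.
    stride = 1
    while stride < len(partial):
        for pe in range(0, len(partial) - stride, 2 * stride):
            partial[pe] += partial[pe + stride]
        stride *= 2
    return partial[0]


if __name__ == "__main__":
    a = distribute(list(range(1, 9)), 4)   # elements 1..8 over 4 PEs
    b = distribute([1] * 8, 4)
    print(vector_add(a, b))
    print(dot_product(a, b))
```

The add completes with purely local work, whereas the dot product cannot produce its scalar result without at least two elements exchanging data, which is the kind of command the final limitation of the claim singles out.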
Specification