Cascaded two-stage computational SIMD engine having multi-port memory and multiple arithmetic units
Abstract
A two-stage cascaded processor engine for Digital Signal Processing (DSP) is described that uses parallel multi-port memories and a plurality of arithmetic units, including adders and multiplier-accumulators (MACs). The engine supports a Single Instruction Multiple Data (SIMD) architecture. Conventional cascaded processors implementing an add-multiply-accumulate-add process for Short Length Transforms have significant limitations, which the invention removes. The two-stage processor uses two multiport memories. Arithmetic units (AUs) in the top stage get their operands from a top multiport RAM, and arithmetic units in the bottom stage get their operands from a bottom multiport RAM. AU outputs can be stored back into the same stage's multiport RAM, passed to the next stage or to the output multiplexer, or both. The system includes an input bus and an output bus, allowing simultaneous input and output operations. The AUs can also get operands from auxiliary input buses to allow operations on special data such as constant coefficients within elementary subroutines. Multiple two-stage processors operate in a SIMD configuration, each processor receiving the same microcoded instruction from a microstore via a microinstruction bus. Various embodiments are described.
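The add-multiply-accumulate-add flow named in the abstract can be pictured with a small software model. The sketch below is illustrative only and is not the patented hardware: the function name `two_stage_pass`, the RAM keys, the use of Python dictionaries as memories, and the assumption that the top adder can also subtract are all hypothetical choices made for this example.

```python
def two_stage_pass(x0, x1, c0, c1):
    """Model one add / multiply-accumulate / add pass through two cascaded stages.

    x0, x1: operands arriving on the input bus.
    c0, c1: constant coefficients arriving on an auxiliary input.
    """
    top_ram = {}
    bottom_ram = {}

    # Input bus: operands are written into the top multiport RAM.
    top_ram["x0"] = x0
    top_ram["x1"] = x1

    # Top stage adder: reads from the top RAM and passes its results
    # straight into the bottom RAM (no intermediate stage).
    bottom_ram["sum"] = top_ram["x0"] + top_ram["x1"]
    bottom_ram["diff"] = top_ram["x0"] - top_ram["x1"]  # assumes an adder/subtractor

    # Bottom stage MAC: multiplies bottom-RAM operands by the auxiliary
    # coefficients and accumulates the products.
    acc = 0
    acc += bottom_ram["sum"] * c0
    acc += bottom_ram["diff"] * c1
    bottom_ram["acc"] = acc  # result stored back into the same stage's RAM

    # Bottom stage adder: final add, result goes to the output bus.
    return bottom_ram["acc"] + bottom_ram["sum"]


print(two_stage_pass(3, 1, 2, 5))  # 22
```

With inputs 3 and 1 and coefficients 2 and 5, the top stage produces 4 and 2, the MAC accumulates 4*2 + 2*5 = 18, and the final add yields 18 + 4 = 22.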
30 Claims
1. A Single Instruction Multiple Data (SIMD) two-stage computational machine comprising:
a top computational stage directly cascaded to a bottom computational stage without any intermediate intervening computational stage;
said top computational stage including;
top multi-port memory means for storing data having a plurality of inputs and a plurality of outputs; and
at least one top arithmetic unit for processing said data coupled to at least one of the top memory means outputs, and each one of said top arithmetic units having an output coupled to at least one of the top memory means inputs;
said bottom computational stage including;
bottom multi-port memory means for storing data having a plurality of inputs and a plurality of outputs, the bottom memory means inputs being directly coupled to the top arithmetic unit outputs without any intervening processor stages; and
at least one bottom arithmetic unit for processing said data coupled to at least one of the bottom memory means outputs, and each one of said bottom arithmetic units having an output coupled to at least one of the bottom memory means inputs; and
an instruction bus coupled to said top memory means, to said top arithmetic units, to said bottom memory means, and to said bottom arithmetic units, for simultaneously specifying the same single instruction to each stage of said two-stage computational machine, said single instruction simultaneously specifying a plurality of operations including all the operations of each of said top and bottom arithmetic units, and all memory means address and control operations; and
a plurality of busses for providing simultaneous plural input and output operations, said plurality of busses including;
an input bus for coupling operands from a first external operand source to said top memory means;
an output bus for coupling arithmetic unit results stored in said bottom memory means to an external destination for said results; and
at least one auxiliary data input means for optionally coupling special operands from a second external operand source to one of said top or bottom arithmetic units.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
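The instruction-bus element of claim 1, in which one instruction specifies every arithmetic and memory operation for both stages at once, can be sketched as a single wide instruction word. This is a hypothetical illustration, not the patent's microcode format: the `Instruction` fields, the `step` function, the operation names and the read-then-write ordering are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical operation set for the arithmetic units.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


@dataclass(frozen=True)
class Instruction:
    """One instruction word covering both stages (field layout is assumed)."""
    top_op: str = "nop"
    top_src: Tuple[int, int] = (0, 0)       # read addresses in the top RAM
    top_to_top: Optional[int] = None        # optional write-back into the top RAM
    top_to_bottom: Optional[int] = None     # optional write into the bottom RAM
    bottom_op: str = "nop"
    bottom_src: Tuple[int, int] = (0, 0)    # read addresses in the bottom RAM
    bottom_to_bottom: Optional[int] = None  # optional write-back into the bottom RAM


def step(top_ram, bottom_ram, instr):
    """Execute one instruction: both stages read first, then all writes happen."""
    top_result = None
    if instr.top_op != "nop":
        a, b = (top_ram[i] for i in instr.top_src)
        top_result = OPS[instr.top_op](a, b)

    bottom_result = None
    if instr.bottom_op != "nop":
        a, b = (bottom_ram[i] for i in instr.bottom_src)
        bottom_result = OPS[instr.bottom_op](a, b)

    # Write phase: the top AU output may go to either RAM or to both.
    if top_result is not None:
        if instr.top_to_top is not None:
            top_ram[instr.top_to_top] = top_result
        if instr.top_to_bottom is not None:
            bottom_ram[instr.top_to_bottom] = top_result
    if bottom_result is not None and instr.bottom_to_bottom is not None:
        bottom_ram[instr.bottom_to_bottom] = bottom_result
    return bottom_result


top_ram = [3, 1, 0, 0]
bottom_ram = [0, 0, 0, 0]
# Instruction 1: only the top stage works; its sum lands in the bottom RAM.
step(top_ram, bottom_ram, Instruction(top_op="add", top_src=(0, 1), top_to_bottom=0))
# Instruction 2: both stages work in the same instruction.
step(top_ram, bottom_ram, Instruction(top_op="sub", top_src=(0, 1), top_to_bottom=1,
                                      bottom_op="mul", bottom_src=(0, 0),
                                      bottom_to_bottom=2))
print(bottom_ram)  # [4, 2, 16, 0]
```

Separating the read phase from the write phase is one simple way to model stages that act simultaneously: in the second instruction the bottom stage multiplies the value written by the first instruction while the top stage is already producing the next operand.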
24. A Single Instruction Multiple Data (SIMD) computational machine comprising:
a plurality of two stage processing machines for simultaneously processing multiple data sets;
a plurality of data memory means each one coupled to one of said two stage processing machines for storing said multiple data sets;
an instruction control means for controlling the operation of said two stage processing machine by simultaneously providing a single instruction to all of said two stage processing machines, said single instruction specifying a plurality of operations;
an input/output control means for generating address signals to control transfer of said multiple data sets between at least two of said two stage processing machine and said data memory means; and
wherein each of said plurality of two-stage processing machines for simultaneously processing multiple data comprises;
a top computational stage directly cascaded to a bottom computational stage without any intermediate intervening computational stage;
said top stage including;
a top multi-port memory means for storing data having a plurality of inputs and a plurality of outputs; and
at least one top arithmetic unit for processing said data coupled to at least one of the top memory means outputs and each one of said top arithmetic units having an output coupled to at least one of the top memory means inputs;
said bottom stage including;
a bottom multi-port memory means for storing data having a plurality of inputs and a plurality of outputs, the bottom memory means inputs being directly coupled to the top arithmetic unit outputs without any intervening processor stages; and
at least one bottom arithmetic unit for processing said data coupled to at least one of the bottom memory means outputs and each one of said bottom arithmetic units having an output coupled to at least one of the bottom memory means inputs;
an instruction bus coupled to said top memory means, said top arithmetic units, said bottom memory means, and said bottom arithmetic units for simultaneously specifying the same single instruction to each stage of said two-stage computational machine, said single instruction simultaneously specifying a plurality of operations including all the operations of each of said arithmetic units, and all memory means address and control operations; and
a plurality of busses for providing simultaneous plural input and output operations, said plurality of busses including;
an input bus for coupling operands from a first external operand source to said top memory means;
an output bus for coupling arithmetic unit results stored in said bottom memory means to an external destination for said results;
at least one auxiliary data input means for optionally coupling special operands from a second external operand source to one of said top or bottom arithmetic units; and
an arithmetic unit output signal multiplexor coupled to said plurality of bottom arithmetic unit outputs for selectively routing an output result of a selected one of said one or more bottom arithmetic units to said output bus, and wherein said single instruction further simultaneously providing additional control signals to control the output of said multiplexor; and
wherein said top memory means comprises a four port RAM, said at least one top arithmetic unit comprises an adder;
said bottom memory means comprises a six port RAM, and said at least one bottom arithmetic units comprise a multiply-accumulator and an adder.
Dependent claims: 25, 26, 27, 28
29. A Single Instruction Multiple Data multi-stage computational machine comprising:
a plurality of two stage processing machines for simultaneously processing multiple data sets, each said two stage processing machine comprising a top stage having a first multi-port memory and a first arithmetic unit, and a bottom stage having a second multi-port memory and a second arithmetic unit;
a plurality of data memories each one coupled to one of said two stage processing machines for storing said multiple data;
an instruction controller for controlling the operation of said two stage processing machines by simultaneously providing a single instruction to all of said two stage processing machines, said single instruction specifying a plurality of operations;
an input/output controller for generating address signals to control transfer of said multiple data between said two stage processing machines and said data memories.
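The SIMD arrangement in claim 29 (several two-stage machines, one data memory per machine, one instruction stream shared by all) can be modelled as a broadcast loop. The sketch below is a software analogy under stated assumptions: `TwoStageMachine`, `run_simd`, the tuple instruction format and the two operation names are hypothetical and do not come from the patent.

```python
class TwoStageMachine:
    """Toy two-stage machine: a top RAM with an adder, a bottom RAM with a multiplier."""

    def __init__(self):
        self.top_ram = {}
        self.bottom_ram = {}

    def execute(self, instr):
        op, dst, a, b = instr
        if op == "top_add":
            # Top arithmetic unit: reads the top RAM, writes the bottom RAM.
            self.bottom_ram[dst] = self.top_ram[a] + self.top_ram[b]
        elif op == "bottom_mul":
            # Bottom arithmetic unit: reads and writes the bottom RAM.
            self.bottom_ram[dst] = self.bottom_ram[a] * self.bottom_ram[b]
        else:
            raise ValueError(f"unknown operation: {op}")


def run_simd(data_memories, program, load_addresses, result_key):
    """Run the same program on one machine per data memory."""
    machines = [TwoStageMachine() for _ in data_memories]

    # Input/output controller: the same generated addresses drive every
    # data-memory-to-machine transfer.
    for machine, memory in zip(machines, data_memories):
        for addr in load_addresses:
            machine.top_ram[addr] = memory[addr]

    # Instruction controller: each instruction is delivered to all machines
    # before the next one is issued.
    for instr in program:
        for machine in machines:
            machine.execute(instr)

    return [machine.bottom_ram[result_key] for machine in machines]


program = [
    ("top_add", "s", 0, 1),         # s = mem[0] + mem[1]
    ("bottom_mul", "p", "s", "s"),  # p = s * s
]
print(run_simd([[1, 2], [3, 4], [5, 6]], program, [0, 1], "p"))  # [9, 49, 121]
```

Each machine sees identical instructions and identical addresses; only the contents of its own data memory differ, which is what makes the three results diverge.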
30. A Single Instruction Multiple Data (SIMD) two-stage computational machine comprising:
a top computational stage directly cascaded to a bottom computational stage without any intermediate intervening computational stage;
said top computational stage including;
a top multi-port memory for storing data having a plurality of inputs and a plurality of outputs; and
a plurality of top arithmetic units for processing said data coupled to at least one of the top memory outputs, and each one of said top arithmetic units having an output coupled to at least one of the top memory inputs;
said bottom computational stage including;
a bottom multi-port memory for storing data having a plurality of inputs and a plurality of outputs, the bottom memory inputs being directly coupled to the top arithmetic unit outputs without any intervening processor stages; and
a plurality of bottom arithmetic unit for processing said data coupled to at least one of the bottom memory outputs, and each one of said bottom arithmetic units having an output coupled to at least one of the bottom memory inputs; and
an instruction bus coupled to said top memory, to said top arithmetic units, to said bottom memory, and to said bottom arithmetic units, for simultaneously specifying the same single instruction to each stage of said two-stage computational machine, said single instruction simultaneously specifying a plurality of operations including all the operations of each of said top and bottom arithmetic units, and all memory address and control operations;
each of said plurality of arithmetic units in any particular stage being coupled to said respective memory within said stage for simultaneous transmission of data stored in said memory means into each of said plurality of arithmetic units within said particular stage;
a plurality of busses for providing simultaneous plural input and output operations, said plurality of busses including;
an input bus for coupling operands from a first external operand source to said top memory;
an output bus for coupling arithmetic unit results stored in said bottom memory to an external destination for said results; and
at least one auxiliary data input port for optionally coupling special operands from a second external operand source to one of said top or bottom arithmetic units;
an arithmetic unit output signal multiplexor coupled to said plurality of bottom arithmetic unit outputs for selectively routing an output result of a selected one of said bottom arithmetic units to said output bus, and wherein said single instruction further simultaneously providing additional control signals to control the output of said multiplexor;
an auxiliary data input bus coupled to one of said top arithmetic units or one of said bottom arithmetic units for providing special operands to said top or bottom arithmetic units; and
auxiliary data storage means for storing said special operands coupled to said auxiliary data input bus.
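Two elements that claim 30 adds, the output multiplexor on the bottom arithmetic unit outputs and the auxiliary data storage feeding special operands, can be illustrated together. The sketch is a simplified model under assumptions: the function `bottom_stage_cycle`, its parameter names, and the choice of one MAC plus one adder as the bottom arithmetic units are hypothetical.

```python
def bottom_stage_cycle(bottom_ram, aux_store, a_addr, b_addr, coeff_addr, acc, mux_select):
    """Model one bottom-stage cycle with two arithmetic units and an output mux.

    bottom_ram: list standing in for the bottom multi-port memory.
    aux_store:  list standing in for auxiliary storage of special operands
                (for example constant coefficients).
    acc:        current accumulator value of the multiply-accumulator.
    mux_select: 0 routes the MAC output to the output bus, 1 routes the adder output.
    Returns (value on the output bus, updated accumulator).
    """
    # Both arithmetic units receive bottom-RAM data in the same cycle.
    mac_out = acc + bottom_ram[a_addr] * aux_store[coeff_addr]  # coefficient via aux bus
    add_out = bottom_ram[a_addr] + bottom_ram[b_addr]

    # Output multiplexor: the instruction's control bits pick one AU output.
    au_outputs = (mac_out, add_out)
    return au_outputs[mux_select], mac_out


bottom_ram = [4, 2]
aux_store = [3]
print(bottom_stage_cycle(bottom_ram, aux_store, 0, 1, 0, acc=10, mux_select=0))  # (22, 22)
print(bottom_stage_cycle(bottom_ram, aux_store, 0, 1, 0, acc=10, mux_select=1))  # (6, 22)
```

In both calls the MAC computes 10 + 4*3 = 22 and the adder computes 4 + 2 = 6; only the selector decides which of the two reaches the output bus.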
Specification