Specialized processing block for programmable logic device
Abstract
A specialized processing block for a programmable logic device incorporates a fundamental processing unit that performs a sum of two multiplications, adding the partial products of both multiplications without computing the individual multiplications. Such fundamental processing units consume less area than conventional separate multipliers and adders. The specialized processing block further has input and output stages, as well as a loopback function, to allow the block to be configured for various digital signal processing operations, including finite impulse response (FIR) filters and infinite impulse response (IIR) filters. By using the programmable connections, and in some cases the programmable resources of the programmable logic device, and by running portions of the specialized processing block at higher clock speeds than the remainder of the programmable logic device, more complex FIR and IIR filters can be implemented.
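The idea of summing two multiplications without ever forming either product can be sketched in software. The following is only an illustrative model, not the patented circuit: the function names, the unsigned shift-and-add partial products, the 18-bit default width, and the 3:2 carry-save compressor are all assumptions chosen for the example.

```python
def partial_products(a, b, width=18):
    """Shift-and-add partial-product vectors of a * b (unsigned, assumed width)."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def compress_3_to_2(x, y, z):
    """Carry-save compression: three vectors in, a sum and a carry vector out."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c  # s + c == x + y + z


def compress(vectors):
    """Reduce a list of vectors to at most two vectors with the same total."""
    vectors = list(vectors)
    while len(vectors) > 2:
        s, c = compress_3_to_2(vectors.pop(), vectors.pop(), vectors.pop())
        vectors += [s, c]
    return vectors


def sum_of_two_products(a, b, c, d, width=18):
    """Compute a*b + c*d with a single final carry-propagate addition."""
    ab = compress(partial_products(a, b, width))  # a*b is never formed as one number
    cd = compress(partial_products(c, d, width))  # neither is c*d
    final = compress(ab + cd)                     # merge both compressed sets
    return sum(final)                             # the one real addition


print(sum_of_two_products(3, 5, 7, 11))
```

In this model the individual products exist only as pairs of compressed vectors, which mirrors the abstract's point that the separate multiplications are not computed on their own.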
27 Claims
-
1. A specialized processing block for a programmable logic device, said specialized processing block being adaptable to form a finite impulse response (FIR) filter, said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products;
a first plurality of input registers for inputting coefficients of said FIR filter as inputs to said plurality of partial product generators;
a second plurality of input registers for inputting data to said FIR filter, said registers being chained for inputting data seriatim to each said plurality of partial product generators; and
an output stage, said output stage including:
a plurality of adders, said plurality of adders being adaptable to provide as an output a sum of (1) a multiplication operation involving two of said fundamental processing units and (2) a corresponding output cascaded from another said plurality of adders in a first other output stage in a first other one of said specialized processing blocks, and
an output cascade register for registering said output for cascading to a second other output stage in a second other one of said specialized processing blocks;
wherein:
said second plurality of input registers comprises a delay register to compensate for said output cascade register when said second plurality of input registers are chained to a corresponding second plurality of input registers in said second other one of said specialized processing blocks.
Dependent claims: 2, 3.
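The role of the delay register recited in claim 1 can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the claimed hardware: the function `cascaded_fir`, the parameter `taps_per_block`, and the choice of exactly one extra data delay per block are hypothetical. Each block adds its local taps to the registered cascade output of the previous block; because that cascade value arrives one cycle late, the data chain into the next block is delayed by one extra cycle so that the terms line up.

```python
def cascaded_fir(x, coeffs, taps_per_block, compensate=True):
    """Behavioral model of FIR blocks chained through output cascade registers."""
    num_blocks = len(coeffs) // taps_per_block
    extra = 1 if compensate else 0  # assumed: one delay register per block hop

    def sample(n):
        return x[n] if 0 <= n < len(x) else 0

    cascade = [0] * num_blocks  # output cascade register of each block
    out = []
    for n in range(len(x) + num_blocks - 1):
        nxt = []
        for j in range(num_blocks):
            delay = j * (taps_per_block + extra)  # data delay into block j
            local = sum(coeffs[j * taps_per_block + i] * sample(n - delay - i)
                        for i in range(taps_per_block))
            cascade_in = cascade[j - 1] if j > 0 else 0  # previous cycle's value
            nxt.append(local + cascade_in)
        cascade = nxt  # clock edge: all cascade registers update together
        out.append(cascade[-1])
    return out


def direct_fir(x, coeffs):
    """Reference: y[m] = sum_k coeffs[k] * x[m - k]."""
    return [sum(c * x[m - k] for k, c in enumerate(coeffs) if m - k >= 0)
            for m in range(len(x))]


impulse = [1, 0, 0, 0, 0]
taps = [1, 2, 3, 4]
print(cascaded_fir(impulse, taps, 2))         # latency of num_blocks - 1 cycles
print(cascaded_fir(impulse, taps, 2, False))  # misaligned without compensation
```

With compensation the model reproduces the direct-form result after a latency of one cycle per block hop; with `compensate=False` the contributions of the second block arrive one cycle early relative to the cascade value and the output is wrong.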
-
-
4. A programmable logic device adaptable to form a finite impulse response (FIR) filter, said programmable logic device comprising:
at least one specialized processing block, each said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products; and
an output stage, said output stage including a plurality of adders, said plurality of adders being adaptable to provide as an output a sum of (1) a multiplication operation involving two of said fundamental processing units and (2) a corresponding output cascaded from another said plurality of adders in a first other output stage in a first other one of said specialized processing blocks;
each said specialized processing block further comprising:
an output cascade register for registering said output for cascading to a second other output stage in a second other one of said specialized processing blocks;
said programmable logic device further comprising:
a first plurality of input registers for inputting data to said FIR filter, said registers being chained for inputting data seriatim to each said plurality of partial product generators; and
a delay register chained with said first plurality of input registers to compensate for said output cascade register when said first plurality of input registers are chained to a corresponding first plurality of input registers in said second other one of said specialized processing blocks.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
-
-
18. A programmable logic device adaptable to form an interpolation filter, said programmable logic device comprising:
at least one specialized processing block, each said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products; and
an output stage, said output stage including a plurality of adders, said plurality of adders being adaptable to provide as an output a sum of (1) a multiplication operation involving at least two of said fundamental processing units and (2) a corresponding output cascaded from another said plurality of adders in a first other output stage in a first other one of said specialized processing blocks;
each said specialized processing block further comprising:
an output cascade register for registering said output for cascading to a second other output stage in a second other one of said specialized processing blocks;
said programmable logic device further comprising:
a first plurality of input registers for inputting data to said interpolation filter, said registers being chained for inputting data seriatim to each said plurality of partial product generators;
a delay register chained with said first plurality of input registers to compensate for said output cascade register when said first plurality of input registers are chained to a corresponding first plurality of input registers in said second other one of said specialized processing blocks; and
a plurality of respective second inputs to said plurality of partial product generators;
wherein:
said programmable logic device has a device clock speed;
said partial product generators, said circuitry for adding and said output stage operate at a second clock speed faster than said device clock speed; and
during one cycle of said device clock speed, said partial product generators and said circuitry for adding process one set of data against multiple sets of coefficients on said second inputs to produce multiple sets of results that are output during said one cycle of said device clock speed.
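Processing one set of data against multiple sets of coefficients within one device clock cycle corresponds to what is commonly called a polyphase interpolator. The sketch below is a software analogy only: the inner loop stands in for the faster clock, and the function names and the way the coefficients are split into phases (`h[p::factor]`) are assumptions made for the example, not details taken from the claim.

```python
def interpolate_polyphase(x, h, factor):
    """For each input sample, apply every coefficient set to the same data."""
    phases = [h[p::factor] for p in range(factor)]  # one coefficient set per phase
    out = []
    for n in range(len(x)):           # one iteration per device clock cycle
        for coeff_set in phases:      # one iteration per fast clock cycle
            out.append(sum(c * x[n - q]
                           for q, c in enumerate(coeff_set) if n - q >= 0))
    return out


def interpolate_reference(x, h, factor):
    """Reference: insert zeros between samples, then filter at the high rate."""
    up = []
    for s in x:
        up.append(s)
        up.extend([0] * (factor - 1))
    return [sum(h[k] * up[m - k] for k in range(len(h)) if m - k >= 0)
            for m in range(len(up))]


x = [1, 2, 3]
h = [1, 2, 3, 4, 5, 6]
print(interpolate_polyphase(x, h, 2))
print(interpolate_reference(x, h, 2))
```

Both functions yield the same sequence; the polyphase form simply avoids multiplying by the inserted zeros, which is why the same data can be reused against each coefficient set.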
-
-
19. A programmable logic device adaptable to form a decimation filter, said programmable logic device comprising:
at least one specialized processing block, each said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products; and
an output stage, said output stage including a plurality of adders, said plurality of adders being adaptable to provide as an output a sum of (1) a multiplication operation involving at least two of said fundamental processing units and (2) a corresponding output cascaded from another said plurality of adders in a first other output stage in a first other one of said specialized processing blocks;
each said specialized processing block further comprising:
an output cascade register for registering said output for cascading to a second other output stage in a second other one of said specialized processing blocks;
said programmable logic device further comprising:
a first plurality of input registers for inputting data to said decimation filter, said registers being chained for inputting data seriatim to each said plurality of partial product generators; and
a delay register chained with said first plurality of input registers to compensate for said output cascade register when said first plurality of input registers are chained to a corresponding first plurality of input registers in said second other one of said specialized processing blocks;
wherein:
said programmable logic device has a device clock speed;
said partial product generators and said circuitry for adding operate at a second clock speed at least four times said device clock speed;
said output stage operates at a third clock speed at least twice said device clock speed; and
during one cycle of said second clock speed, said partial product generators and said circuitry for adding process one set of data to produce results that are accumulated such that during one cycle of said third clock speed said partial product generators and said circuitry for adding process a plurality of sets of data;
said programmable logic device further comprising:
a multiplexer upstream of said first plurality of input registers and a demultiplexer downstream of said output stage;
wherein:
during one cycle of said device clock speed said partial product generators, said circuitry for adding and said output stage process a plurality of said plurality of sets of data, all of said sets of data being accumulated across cycles of said third clock speed.
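The accumulation of partial results across fast clock cycles can be illustrated with a simplified decimator. This sketch deliberately omits the multiplexer, the demultiplexer, and the distinct second and third clock rates recited in claim 19; it only shows how one output sample may be built up from several smaller multiply-add passes. The names `decimate_accumulate` and `taps_per_cycle` are hypothetical.

```python
def decimate_accumulate(x, h, factor, taps_per_cycle):
    """Build each decimated output by accumulating groups of taps."""
    out = []
    for base in range(0, len(x), factor):      # one output per `factor` inputs
        acc = 0
        for start in range(0, len(h), taps_per_cycle):  # one pass per fast cycle
            stop = min(start + taps_per_cycle, len(h))
            acc += sum(h[k] * x[base - k]
                       for k in range(start, stop) if base - k >= 0)
        out.append(acc)
    return out


def decimate_reference(x, h, factor):
    """Reference: filter at the full rate, then keep every `factor`-th sample."""
    full = [sum(h[k] * x[m - k] for k in range(len(h)) if m - k >= 0)
            for m in range(len(x))]
    return full[::factor]


x = [1, 2, 3, 4, 5, 6, 7, 8]
h = [1, 2, 3]
print(decimate_accumulate(x, h, 2, 2))
print(decimate_reference(x, h, 2))
```

Only the outputs that survive decimation are ever computed, so the same multiply-add hardware can be reused over several fast cycles per retained sample.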
-
-
20. A programmable logic device adaptable to form a finite impulse response (FIR) lattice filter, said programmable logic device comprising:
a plurality of specialized processing blocks, each said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products;
wherein:
each of said specialized processing blocks comprises at least one register and computes one stage of said FIR lattice filter, where:
each stage is represented by terms f_k(n) and g_k(n), k represents a stage, n represents a sample, and each of f_k(n) and g_k(n) is expressed in terms of f_{k-1}(n) and g_{k-1}(n−1);
for any kth stage, f_k(n) is computed by one said fundamental processing unit forming a sum of (a) a product of (1) f_{k-1}(n) and (2) 1, and (b) a product of (1) g_{k-1}(n−1) and (2) a coefficient Γ_k; and
for any kth stage, g_k(n) is computed by one said fundamental processing unit forming a sum of (a) a product of (1) f_{k-1}(n) and (2) a coefficient Γ_k, and (b) a product of (1) g_{k-1}(n−1) and (2) 1, and g_k(n) is delayed by registration in one said at least one register to provide g_k(n−1);
wherein:
f_k(n) and g_k(n−1) from the kth stage are available as f_{k-1}(n) and g_{k-1}(n−1) for a (k+1)th stage.
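The two sums recited in claim 20 amount to the recurrences f_k(n) = f_{k-1}(n) + Γ_k g_{k-1}(n−1) and g_k(n) = Γ_k f_{k-1}(n) + g_{k-1}(n−1). A minimal software sketch follows. It assumes, as is conventional for lattice filters but not stated in the claim, that the input drives both f_0(n) and g_0(n), and it models each delay register as a list entry; the function name is hypothetical.

```python
def fir_lattice(x, gammas):
    """Run an FIR lattice, one (f, g) update per stage per sample."""
    g_delayed = [0.0] * len(gammas)  # g_delayed[k] holds g_k(n-1) for stage k+1
    f_out, g_out = [], []
    for sample in x:
        f = g = sample               # assumed: f_0(n) = g_0(n) = x(n)
        for k, gamma in enumerate(gammas):
            g_prev = g_delayed[k]    # g_{k-1}(n-1) in the claim's notation
            g_delayed[k] = g         # register the current g for the next sample
            # Each line is a sum of two products, one of them a multiply by 1.
            f, g = f * 1 + g_prev * gamma, f * gamma + g_prev * 1
        f_out.append(f)
        g_out.append(g)
    return f_out, g_out


print(fir_lattice([1.0, 0.0, 0.0], [0.5]))
```

Writing each update as a sum of two products, with one factor equal to 1, reflects why a unit that natively computes a sum of two multiplications fits a lattice stage.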
-
-
21. A programmable logic device adaptable to form an infinite impulse response (IIR) lattice filter, said programmable logic device comprising:
a plurality of specialized processing blocks, each said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products; and
loopback circuitry for feeding back an output of said specialized processing block to an input of said specialized processing block;
wherein:
each of said specialized processing blocks computes one stage of said IIR lattice filter, where:
each stage is represented by terms f_k(n) and g_k(n), k represents a stage, n represents a sample, and each of f_k(n) and g_k(n) is expressed in terms of f_{k-1}(n) and g_{k-1}(n−1);
for any (k−1)th stage, f_{k-1}(n) is computed by a first said fundamental processing unit forming a sum of (a) a product of (1) f_k(n), which is derived from a kth stage, and (2) 1, and (b) a product of (1) g_{k-1}(n−1) and (2) a coefficient −Γ_k; and
for any kth stage, g_k(n) is computed by a second said fundamental processing unit forming a sum of (a) a product of (1) f_{k-1}(n), which is looped back from said first fundamental processing unit via said loopback circuitry, and (2) a coefficient Γ_k, and (b) a product of (1) g_{k-1}(n−1) and (2) 1, and g_k(n) is delayed by registration to provide g_k(n−1);
wherein:
f_k(n) and g_k(n−1) from the kth stage are available as f_{k-1}(n) and g_{k-1}(n−1) for a (k+1)th stage.
Dependent claims: 22.
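The IIR lattice stage of claim 21 differs from the FIR stage in that f_{k-1}(n) is derived from f_k(n) and then fed back to form g_k(n): f_{k-1}(n) = f_k(n) − Γ_k g_{k-1}(n−1) and g_k(n) = Γ_k f_{k-1}(n) + g_{k-1}(n−1). The sketch below is a behavioral model under assumptions not found in the claim: the input is taken as f_K(n) of the last stage, the output as f_0(n), and g_0(n) is set equal to f_0(n), as in a conventional all-pole lattice. The function name is hypothetical.

```python
def iir_lattice(x, gammas):
    """Run an all-pole (IIR) lattice with reflection coefficients `gammas`."""
    num_stages = len(gammas)
    g_delayed = [0.0] * num_stages   # g_delayed[k-1] holds g_{k-1}(n-1)
    out = []
    for sample in x:
        f = sample                   # assumed: input is f_K(n)
        f_stage = [0.0] * num_stages
        # First sum of two products: f_{k-1}(n) = f_k(n)*1 + g_{k-1}(n-1)*(-gamma_k)
        for k in range(num_stages, 0, -1):
            f = f * 1 + g_delayed[k - 1] * (-gammas[k - 1])
            f_stage[k - 1] = f       # value looped back for the g computation
        # Second sum: g_k(n) = f_{k-1}(n)*gamma_k + g_{k-1}(n-1)*1
        g = f                        # assumed: g_0(n) = f_0(n)
        for k in range(1, num_stages + 1):
            g_next = f_stage[k - 1] * gammas[k - 1] + g_delayed[k - 1] * 1
            g_delayed[k - 1] = g     # register g_{k-1}(n) for the next sample
            g = g_next
        out.append(f)                # output is f_0(n)
    return out


print(iir_lattice([1.0, 0.0, 0.0, 0.0], [0.5]))
```

For a single stage this reduces to y(n) = x(n) − Γ_1 y(n−1), so an impulse input produces a geometrically decaying, sign-alternating response when Γ_1 is 0.5.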
-
-
23. A specialized processing block for a programmable logic device, said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products; and
an output stage, said output stage including:
for each pair of fundamental processing units of said specialized processing block, a corresponding pair of adders, said pair of adders being adaptable to provide as an output one of (a) an output of a multiplication operation involving both of said fundamental processing units, and (b) two separate sums, one for each respective one of said fundamental processing units, each respective sum being a sum of (1) a multiplication operation involving said fundamental processing unit and (2) a corresponding output cascaded from another said respective adder in another output stage in another one of said specialized processing blocks.
Dependent claims: 24, 25.
-
-
26. A specialized processing block for a programmable logic device, said specialized processing block comprising:
a plurality of fundamental processing units, each of said fundamental processing units including:
a plurality of partial product generators, each respective one of said partial product generators providing a respective plurality of vectors representing a respective partial product;
compressor circuitry that compresses each respective plurality of vectors into a smaller number of vectors representing said respective partial product; and
circuitry for adding, in one operation, partial products represented by said smaller number of vectors produced by all of said plurality of partial product generators, each said respective partial product being unroutable to any output of said specialized processing block, thereby being unavailable for output, except after being added, by said circuitry for adding, to other of said respective partial products; and
an output stage, said output stage including:
for each pair of fundamental processing units of said specialized processing block, a corresponding pair of adders, said pair of adders being adaptable to provide as an output one of (a) an output of a multiplication operation involving both of said fundamental processing units, and (b) a cascade of said output of said multiplication operation with a corresponding output of a multiplication operation from a corresponding output stage associated with two other said fundamental processing units.
Dependent claims: 27.
-
Specification