Hardware accelerator for normal least-mean-square algorithm-based coefficient adaptation
Abstract
A system and method for accelerating least-mean-square algorithm-based coefficient adaptation execute one tap of the least-mean-square algorithm in one machine clock cycle, including data fetch, coefficient fetch, coefficient adaptation, convolution, and write-back of a new coefficient vector. A data memory stores an input signal. A coefficient memory stores a coefficient vector. A multiplication and accumulation unit reads the input signal from the data memory and the coefficient vector from the coefficient memory to perform convolution. A coefficient adaptation unit, separate from the multiplication and accumulation unit, reads the input signal from the data memory and the coefficient vector from the coefficient memory to perform coefficient adaptation at the same time that the multiplication and accumulation unit performs its reads. The adapted coefficient vector is written back into the coefficient memory for use by the multiplication and accumulation unit during the next iteration of convolution to produce an output signal. Each tap is executed in one machine clock cycle.
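As a rough software model of the single-cycle tap described in the abstract, the following Python sketch steps through data fetch, coefficient fetch, convolution, adaptation, and write-back for each tap. It is an illustration only: the function name `lms_iteration`, the scalar `step` (standing in for the signed convergence factor), and the pairing of tap i with `data[i]` are assumptions, and a sequential loop cannot show the hardware behavior in which the two units read the memories at the same time.

```python
def lms_iteration(data, coeffs, step):
    """Model one convolution iteration; each loop pass stands for one tap.

    data   -- input samples (hypothetical stand-in for the data memory)
    coeffs -- coefficient list, updated in place (stand-in for the coefficient memory)
    step   -- assumed scalar: the convergence factor with its sign already applied
    """
    acc = 0.0
    for i in range(len(coeffs)):
        x = data[i]                # data fetch
        c = coeffs[i]              # coefficient fetch
        acc += c * x               # convolution (multiplication and accumulation unit)
        coeffs[i] = c + step * x   # adaptation and write-back (coefficient adaptation unit)
    return acc
```

With `data = [1.0, 2.0]`, `coeffs = [0.5, -1.0]`, and `step = 0.25`, the call returns -1.5 and leaves `coeffs` as `[0.75, -0.5]` for the next iteration.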
10 Claims
1. A system for accelerating least-mean-square algorithm-based coefficient adaptation, comprising:
a data memory for storing an input signal;
a coefficient memory for storing a coefficient vector;
a multiplication and accumulation unit for reading the input signal from the data memory and the coefficient vector from the coefficient memory to perform convolution; and
a coefficient adaptation unit separate from the multiplication and accumulation unit for reading the input signal from the data memory and for reading the coefficient vector from the coefficient memory to perform coefficient adaptation at the same time that the multiplication and accumulation unit performs the reading to produce an adapted coefficient vector which is written back into the coefficient memory for use by the multiplication and accumulation unit during a next iteration of convolution to produce an output signal, wherein each tap is executed in one machine clock cycle; and
the coefficient memory includes an even coefficient memory and an odd coefficient memory, each storing half of the coefficient vector.
tnew(i+1) = told(i+1) ± (convergence factor) * x(i);
wherein told(i+1) is an old coefficient vector to be adapted, tnew(i+1) is a new coefficient vector after adaptation, and x(i) is the input signal.
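The update equation above can be tried out with a small Python sketch. The function name `adapt_coefficient` and the `add` flag are hypothetical; the claims do not state here how the sign of the ± term is chosen, so the caller supplies it in this sketch.

```python
def adapt_coefficient(t_old, x, convergence_factor, add=True):
    """Return tnew = told +/- (convergence factor) * x for a single tap.

    The `add` flag selects the sign of the correction; how the hardware
    decides the sign is not specified in this text and is an assumption here.
    """
    delta = convergence_factor * x
    return t_old + delta if add else t_old - delta
```

For example, `adapt_coefficient(1.0, 2.0, 0.25)` gives 1.5, and the same call with `add=False` gives 0.5.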
5. The system as set forth in claim 1, wherein, for each iteration of convolution, an updated coefficient vector is used during each tap.
6. A method for accelerating least-mean-square algorithm-based coefficient adaptation, comprising the steps of:
(a) dividing a coefficient memory into an even coefficient memory and an odd coefficient memory, each storing half of a coefficient vector;
(b) storing an input signal in a data memory;
(c) storing the coefficient vector in the coefficient memory;
(d) reading the input signal from the data memory and the coefficient vector from the coefficient memory to perform convolution; and
(e) reading the input signal from the data memory and reading the coefficient vector from the coefficient memory to perform coefficient adaptation at the same time as the reading of step (d) to produce an adapted coefficient vector which is written back into the coefficient memory for use in a next iteration of convolution to produce an output signal, wherein each tap is executed in one machine clock cycle.
tnew(i+1) = told(i+1) ± (convergence factor) * x(i);
wherein told(i+1) is an old coefficient vector to be adapted, tnew(i+1) is a new coefficient vector after adaptation, and x(i) is the input signal.
10. The method as set forth in claim 6, wherein, for each iteration of convolution, an updated coefficient vector is used during each tap.
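Claims 1 and 6 recite a coefficient memory divided into an even coefficient memory and an odd coefficient memory, each storing half of the coefficient vector. A minimal Python sketch of one possible layout follows. It assumes that "even" and "odd" refer to the parity of the coefficient index and that the vector has an even length; neither detail is stated in the claims, and the helper names are hypothetical.

```python
def split_banks(coeffs):
    """Split a coefficient vector into (even, odd) banks by index parity (assumed layout)."""
    return coeffs[0::2], coeffs[1::2]


def read_coeff(even, odd, i):
    """Fetch coefficient i from whichever bank holds it."""
    bank = even if i % 2 == 0 else odd
    return bank[i // 2]


def write_coeff(even, odd, i, value):
    """Write an adapted coefficient back into its bank."""
    bank = even if i % 2 == 0 else odd
    bank[i // 2] = value
```

Under this assumed layout, consecutive taps alternate between the two banks, so a read of one bank and a write-back to the other never target the same memory.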
Specification