Multipulse excited pole-zero filtering approach for noise reduction

US 5,007,094 A
Filed: 04/07/1989
Issued: 04/09/1991
Est. Priority Date: 04/07/1989
Status: Expired due to Term

Abstract

A pulse train of primary pulses is estimated from an inverse LPC analysis of a frame of voiced speech. From this estimated pulse train a pole-zero filter is estimated. The estimated pulse train is used to excite the estimated pole-zero filter to produce a synthesized speech signal. The synthesized speech signal is compared to the original frame of speech to determine the error in the original speech signal. Both the pulse amplitude and filter are adjusted to compensate for the error and another synthesized speech signal is produced. The process may be repeated until the synthesized speech signal and original speech signal converge.
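
The synthesize-and-compare step described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes a causal direct-form pole-zero filter with `a[0] == 1`, and the helper names `pulse_train`, `pole_zero_filter`, and `squared_error` are hypothetical.

```python
def pulse_train(length, positions, amplitudes):
    """Build a sparse excitation: zeros except for pulses at `positions`."""
    x = [0.0] * length
    for pos, amp in zip(positions, amplitudes):
        x[pos] = amp
    return x


def pole_zero_filter(excitation, b, a):
    """Causal pole-zero filter (assumed form, with a[0] == 1):
    y[n] = sum_k b[k] * x[n-k] - sum_{k>=1} a[k] * y[n-k]
    """
    y = []
    for n in range(len(excitation)):
        acc = 0.0
        for k, bk in enumerate(b):          # zeros (feed-forward part)
            if n - k >= 0:
                acc += bk * excitation[n - k]
        for k in range(1, len(a)):          # poles (feedback part)
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def squared_error(original, synthesized):
    """Sum of squared differences between the two signals."""
    return sum((o - s) * (o - s) for o, s in zip(original, synthesized))


# Hypothetical example: two pulses drive a one-zero, one-pole filter.
excitation = pulse_train(8, [0, 4], [1.0, 0.5])
synthesized = pole_zero_filter(excitation, b=[1.0, 0.5], a=[1.0, -0.5])
```

In an encoder of the kind the abstract describes, `squared_error(original_frame, synthesized)` would be the quantity that drives the adjustment of pulse amplitudes and filter coefficients.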

18 Citations

31 Claims

1. A method of encoding speech comprising:
- estimating an excitation pulse train from an original speech signal;
  
  estimating a pole-zero filter;
  
  applying the excitation pulse train to the estimated pole-zero filter to synthesize a speech signal; and
  
  modifying coefficients of the pole-zero filter based on an error between the original speech signal and the synthesized speech signal.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
- - 2. A method as claimed in claim 1 wherein the step of estimating an excitation pulse train results in a train of only primary pulses which are of nonconstant pitch.
  - 3. A method as claimed in claim 1 wherein the step of estimating an excitation pulse train comprises performing a linear predictive coding (LPC) analysis and detecting peaks above a threshold in a residual signal obtained from the LPC analysis.
  - 4. A method as claimed in claim 3 wherein the step of estimating the excitation pulse train further comprises a procedure to locate pitch pulses by examining a small sample of pulses near a largest pulse of an estimated pitch period.
  - 5. A method as claimed in claim 3 further comprising the step of modifying amplitudes of the pulse train based on the error between the original speech signal and the synthesized speech signal.
  - 6. A method as claimed in claim 5 further comprising the step of extracting secondary pulses using the pole-zero filter obtained in the step of modifying the estimate of the pole-zero filter.
  - 7. A method as claimed in claim 1 further comprising the step of modifying amplitudes of the pulse train based on the error between the original speech signal and the synthesized speech signal.
  - 8. A method as claimed in claim 1 further comprising the step of extracting secondary pulses using the pole-zero filter obtained in the step of modifying the estimate of the pole-zero filter.
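
Claim 3 relies on a residual signal obtained from an LPC analysis. The sketch below shows one common way to obtain such a residual: the autocorrelation method with the Levinson-Durbin recursion, followed by inverse filtering. The claims as shown here do not specify this technique, so it is an assumption, as are the function names `autocorrelation`, `levinson_durbin`, and `lpc_residual`.

```python
def autocorrelation(signal, max_lag):
    """r[lag] = sum_i signal[i] * signal[i + lag] for lag = 0..max_lag."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for the prediction-error filter A(z) = 1 + a1 z^-1 + ... + ap z^-p.

    Returns the coefficient list [1, a1, ..., ap].
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:                      # silent or degenerate frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a


def lpc_residual(signal, order):
    """Inverse-filter the signal with A(z) to get the prediction residual."""
    a = levinson_durbin(autocorrelation(signal, order), order)
    residual = [
        sum(a[k] * signal[n - k] for k in range(order + 1) if n - k >= 0)
        for n in range(len(signal))
    ]
    return a, residual
```

For voiced speech, the residual produced this way tends to be small except near the excitation instants, which is what makes threshold-based peak detection on it plausible.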

9. A method of encoding speech comprising:
- estimating an excitation pulse train from an original speech signal such that the pulse train is of nonconstant pitch, said estimating step comprising performing a linear predictive coding (LPC) analysis and detecting peaks above a threshold in a residual signal obtained from the LPC analysis; and
  
  estimating a pole-zero filter to which the excitation pulse train may be applied to synthesize a speech signal simulating the original speech signal.

10. A method of encoding speech comprising:
- (a) providing an estimated excitation pulse train from an original speech signal using LPC analysis such that the LPC analysis produces estimated pitch periods for the excitation pulse train;
  
  (b) locating largest pulses within the estimated pitch periods of the excitation pulse train;
  
  (c) for each estimated pitch period, comparing amplitudes of pulses located near the largest pulse of the pitch period to locate a pitch pulse that is encoded as the pitch pulse for the pitch period.
- View Dependent Claims (11)
- - 11. A method as claimed in claim 10 wherein the step of estimating the excitation pulse train comprises a procedure to detect significant change in prediction error when multiple peaks surround a pitch pulse.

12. A method of noise reduction for speech processing comprising the steps of:
- a. performing Linear Predictive Coding (LPC) analysis on an original speech signal to produce a residual signal;
  
  b. extracting a pulse train from the residual signal;
  
  c. finding best pole-zero filter using a prediction error identification technique that selects a best set of coefficients for the filter;
  
  d. extracting secondary pulses from the residual signal; and
  
  e. convolving the pulse train and the secondary pulses via the best pole-zero filter to produce a clean speech signal.
- View Dependent Claims (13, 14, 15)
- - 13. A method as recited in claim 12 wherein the step of extracting the pulse train locations comprises:
    - a. squaring the residual signal;
      
      b. identifying a largest peak of the squared residual signal;
      
      c. detecting peaks of the squared residual signal that are larger than a threshold relative to a largest peak; and
      
      d. locating pulses by a procedure that extracts pitch pulses.
  - 14. A method as recited in claim 12 wherein the step of finding a best pole-zero filter comprises:
    - a. estimating amplitudes of pulses in the pulse train;
      
      b. estimating the best pole-zero filter for the pulses and exciting the best pole-zero filter estimate with the estimated pulses to produce a synthesized signal;
      
      c. determining an amount of error between the synthesized speech signal and the original speech signal;
      
      d. determining if there is a convergence between the original speech signal and the synthesized speech signal based on the amount of error;
      
      e. if there is no convergence, updating the best pole-zero filter estimate to minimize the amount of error by altering the coefficients of the filter;
      
      repeating steps b through e; and
      
      f. if there is a convergence, denoting the best pole-zero filter estimate as the best pole-zero filter.
  - 15. A method as recited in claim 12 wherein the step of extracting secondary pulses comprises employing a multipulse technique using the best pole-zero filter to extract secondary pulses.
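
Steps a through c of claim 13 (square the residual, find the largest peak, keep peaks above a threshold relative to it) can be sketched as below. The function name `detect_primary_pulses`, the local-maximum test, and the default `relative_threshold` of 0.25 are assumptions made for illustration; the claim does not give a threshold value. Step d, the pitch-pulse location procedure, is not covered by this sketch.

```python
def detect_primary_pulses(residual, relative_threshold=0.25):
    """Return indices of candidate pulses in an LPC residual.

    A sample is kept when its squared value is a local maximum and is at
    least `relative_threshold` times the largest squared value in the frame.
    The threshold default is an assumed value, not one from the claims.
    """
    if not residual:
        return []
    squared = [r * r for r in residual]            # step a
    largest = max(squared)                          # step b
    if largest == 0.0:
        return []
    floor = relative_threshold * largest            # step c
    pulses = []
    for n, value in enumerate(squared):
        left = squared[n - 1] if n > 0 else 0.0
        right = squared[n + 1] if n + 1 < len(squared) else 0.0
        if value >= floor and value > left and value >= right:
            pulses.append(n)
    return pulses
```

Lowering `relative_threshold` admits more, smaller peaks; raising it keeps only pulses close in energy to the largest one.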

16. A method of noise reduction for speech processing comprising the steps of:
- a. filtering an original speech signal through an all-poles Linear Predictive Coding (LPC) filter to produce a residual signal;
  
  b. extracting a pulse train from the residual signal by:
  
  squaring the residual signal;
  
  identifying a largest peak of the squared residual signal;
  
  detecting peaks of the squared residual signal that are larger than a threshold relative to the largest peak;
  
  c. finding a best pole-zero mixed phase filter by:
  
  estimating amplitudes of pulses in the pulse train;
  
  estimating the best pole-zero filter by selecting a set of coefficients and exciting the best pole-zero filter estimate with the estimated pulse amplitudes to produce a synthesized speech signal;
  
  applying a prediction error identification technique to determine an amount of error between the synthesized speech signal and the original speech signal;
  
  determining if there is a convergence between the original speech signal and the synthesized speech signal based on the amount of error;
  
  if there is no convergence, repeating steps b through e;
  
  if there is a convergence, denoting the best pole-zero filter estimate as the best pole-zero filter;
  
  d. extracting secondary pulses from the residual signal by employing a multipulse technique that uses the best pole-zero filter to extract the secondary pulses; and
  
  e. convolving the pulse train and the secondary pulses via the best pole-zero filter to produce a clean speech signal.

17. A method of determining a best pole-zero filter to accurately model an original speech signal from a pulse train extracted out of a Linear Predictive Coding (LPC) residual signal, comprising the steps of:
- a. estimating amplitudes of pulses in the pulse train;
  
  b. estimating the best pole-zero filter by selecting a set of coefficients for the filter and exciting the best pole-zero filter estimate with the estimated pulse amplitudes to produce a synthesized signal;
  
  c. determining an amount of error between the synthesized speech signal and the original speech signal;
  
  d. determining if there is a convergence between the original speech signal and the synthesized speech signal based on the amount of error;
  
  e. if there is no convergence, updating the best pole-zero filter estimate to minimize the amount of error;
  
  repeating steps b through e; and
  
  f. if there is a convergence, denoting the best pole-zero filter estimate as the best pole-zero filter.
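
The synthesize, measure, test, and update loop of steps b through f in claim 17 can be illustrated with the sketch below. The update rule here is a simple coordinate search with a shrinking step, used purely as a stand-in: it is not the prediction error identification technique named in the claims. Pulse amplitudes (step a) are taken as given, and the names `synthesize`, `error`, and `fit_pole_zero`, along with the `step`, `tol`, and `max_iters` defaults, are hypothetical.

```python
def synthesize(excitation, b, a):
    """Excite a causal pole-zero filter (a[0] == 1) with the pulse train."""
    y = []
    for n in range(len(excitation)):
        acc = sum(b[k] * excitation[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y


def error(original, excitation, b, a):
    """Squared error between the original and the synthesized signal."""
    synth = synthesize(excitation, b, a)
    return sum((o - s) * (o - s) for o, s in zip(original, synth))


def fit_pole_zero(original, excitation, n_zeros, n_poles,
                  step=0.5, tol=1e-10, max_iters=200):
    """Refine filter coefficients until the error converges.

    Stand-in optimizer: try +/- step on each coefficient, keep any change
    that lowers the error, and halve the step when nothing improves.
    """
    b = [1.0] + [0.0] * n_zeros
    a = [1.0] + [0.0] * n_poles
    best = error(original, excitation, b, a)
    for _ in range(max_iters):
        if best < tol or step < 1e-9:       # convergence test (step d)
            break
        improved = False
        for coeffs, start in ((b, 0), (a, 1)):   # a[0] stays fixed at 1
            for i in range(start, len(coeffs)):
                old = coeffs[i]
                for delta in (step, -step):
                    coeffs[i] = old + delta
                    trial = error(original, excitation, b, a)
                    if trial < best:        # keep the update (step e)
                        best = trial
                        improved = True
                        break
                    coeffs[i] = old
        if not improved:
            step *= 0.5
    return b, a, best                       # final estimate (step f)
```

Because a trial update is kept only when it lowers the error, the returned error is never larger than the error of the starting estimate.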

18. A procedure for locating pitch pulses in a multipulse set of pulse samples comprising the steps of:
- a. placing a small window that views pulse samples immediately preceding a largest detected peak in the set of pulse samples;
  
  b. computing an average relative magnitude of the pulses in the window relative to the largest peak;
  
  c. comparing the magnitude of each pulse sample in the window to the average relative magnitude;
  
  d. designating the pulse sample whose relative magnitude is much greater than the average relative magnitude as the pitch pulse;
  
  e. moving the small window to a next pulse sample; and
  
  f. repeating steps a-e until all samples in the set of samples have been examined.
- View Dependent Claims (19, 20, 22)
- - 19. A method as recited in claim 18 wherein the step of moving to a next pulse sample comprises:
    - obtaining a pitch period estimate from an LPC analysis of the set of pulse samples;
      
      moving to a location a pitch period away from the previously found pitch pulse location;
      
      examining a guard-band centered at the location a pitch period away to find the largest pulse in the guard-band; and
      
      placing the small window immediately preceding the largest pulse in the guard-band.
  - 20. A method as recited in claim 19 wherein the guard-band covers those pulse samples within a large percentage of the pitch period.
  - 22. The system of claim 18 wherein the system is employed in telephone lines.
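
The windowed pitch-pulse search of claims 18 through 20 leaves several details open, so the sketch below fills them with labeled assumptions: "much greater than the average" is taken to mean more than `factor` times the window average, the search starts at the largest sample in the first pitch period and walks forward only, the largest peak itself is used when no window sample qualifies, and the guard-band width is `guard_fraction` of the pitch period. All names and default values are hypothetical.

```python
def pick_pitch_pulse(samples, peak, window=4, factor=2.0):
    """Inspect the `window` samples just before `peak`.

    Returns the index of a preceding sample whose magnitude relative to the
    peak exceeds `factor` times the window average, else `peak` itself
    (the fallback is an assumption).
    """
    idx = range(max(0, peak - window), peak)
    if len(idx) == 0 or samples[peak] == 0.0:
        return peak
    rel = {i: abs(samples[i]) / abs(samples[peak]) for i in idx}
    avg = sum(rel.values()) / len(rel)
    candidates = [i for i in idx if rel[i] > factor * avg]
    return max(candidates, key=rel.get) if candidates else peak


def locate_pitch_pulses(samples, pitch_period, window=4, factor=2.0,
                        guard_fraction=0.5):
    """Walk forward one pitch period at a time, as in claim 19."""
    if not samples or pitch_period <= 0:
        return []
    half = max(1, int(guard_fraction * pitch_period / 2))
    first = max(range(min(pitch_period, len(samples))),
                key=lambda i: abs(samples[i]))
    pulses = [pick_pitch_pulse(samples, first, window, factor)]
    while True:
        center = pulses[-1] + pitch_period
        if center >= len(samples):
            break
        # Guard-band centered one pitch period after the last pitch pulse.
        lo = max(0, center - half)
        hi = min(len(samples), center + half + 1)
        peak = max(range(lo, hi), key=lambda i: abs(samples[i]))
        nxt = pick_pitch_pulse(samples, peak, window, factor)
        if nxt <= pulses[-1]:               # no forward progress: stop
            break
        pulses.append(nxt)
    return pulses
```

The window check lets a slightly smaller pulse that stands out just before the largest peak be designated as the pitch pulse, which is the behavior claim 18 step d describes.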

21. A speech enhancement system comprising a processor means;
- wherein the processor means comprises:
  
  a. an inverse all-poles Linear Predictive Coding (LPC) analysis unit for producing residual signals from incoming multipulse frames of speech;
  
  b. a best pole-zero mixed-phase filter for producing clean speech signals from the residual signals;
  
  wherein the incoming multipulse frames of speech enter the inverse all-poles LPC filter to produce residual signals that are processed by the processor means which updates the best pole-zero mixed-phase filter so that the filter may filter the residual signals to produce clean speech signals.

23. A method of encoding speech comprising:
- estimating an excitation pulse train from an original speech signal;
  
  estimating a pole-zero filter by selecting a set of coefficients for the filter;
  
  modifying the estimate of the excitation pulse train and the estimate of the pole-zero filter to minimize the expected error between the original speech signal and a speech signal to be synthesized when the excitation pulse train is applied to the estimated pole-zero filter.
- View Dependent Claims (24, 25, 26)
- - 24. A method as recited in claim 23 wherein the step of estimating an excitation pulse train results in a train of only primary pulses which are of nonconstant pitch.
  - 25. A method as recited in claim 23 wherein the step of estimating the excitation pulse train further comprises a procedure to locate pitch pulses by examining a small sample of pulses near a largest pulse of an estimated pitch period.
  - 26. A method as recited in claim 23 wherein the step of modifying the estimate of the excitation pulse train comprises modifying the amplitudes of the excitation pulse train.

27. A method of encoding speech comprising the steps of:
- estimating an excitation pulse train having primary pulses of non-constant pitch from an original speech signal;
  
  estimating a pole-zero filter by selecting a set of coefficients for the filter;
  
  modifying the estimate of the excitation pulse train by modifying the amplitudes of the excitation pulse train and modifying the estimate of the pole-zero filter to minimize the expected error between the original speech signal and a speech signal to be synthesized when the excitation pulse train is applied to the estimated pole-zero filter.
- View Dependent Claims (28)
- - 28. A method as recited in claim 27 further comprising the step of applying the excitation pulse train to the estimated pole-zero filter to synthesize a speech signal.
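
Claims 26 and 27 modify the pulse amplitudes to reduce the error. For a fixed filter and fixed pulse positions, the synthesized signal is linear in the amplitudes, so one way to do this is a least-squares refit. The sketch below takes that approach as an assumption; the claims do not state how the amplitudes are updated. The names `impulse_response`, `solve`, and `refit_amplitudes` are hypothetical.

```python
def impulse_response(b, a, length):
    """First `length` samples of a causal pole-zero filter with a[0] == 1."""
    h = []
    for n in range(length):
        acc = b[n] if n < len(b) else 0.0
        acc -= sum(a[k] * h[n - k] for k in range(1, len(a)) if n - k >= 0)
        h.append(acc)
    return h


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError when the system is singular, for example if
    two pulses share the same position.
    """
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def refit_amplitudes(original, positions, b, a):
    """Least-squares amplitudes for pulses at fixed `positions`."""
    n = len(original)
    h = impulse_response(b, a, n)
    # Column k is the filter's impulse response delayed to positions[k].
    cols = [[h[i - p] if i >= p else 0.0 for i in range(n)] for p in positions]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, original)) for ci in cols]
    return solve(gram, rhs)
```

Alternating a refit like this with a filter-coefficient update would be one way to realize the joint modification of excitation and filter that claim 27 recites.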

29. A method of encoding speech comprising the steps of:
- estimating an excitation pulse train from the original speech signal;
  
  estimating a pole-zero filter by selecting a set of coefficients for the filter;
  
  applying the excitation pulse train to the estimated pole-zero filter to synthesize a speech signal; and
  
  modifying an estimate of the excitation pulse train and the estimate of the pole-zero filter based on an error between the original speech signal and the synthesized speech signal.
- View Dependent Claims (30, 31)
- - 30. A method as recited in claim 29 wherein the step of estimating an excitation pulse train results in a train of only primary pulses which are of non-constant pitch.
  - 31. A method as recited in claim 30 further comprising the step of modifying amplitudes of the pulse train based on the error between the original speech signal and the synthesized speech signal.

Current Assignee: Verizon Laboratories Incorporated (Verizon Communications Inc.)
Original Assignee: GTE Products Corporation
Inventors: Chuang, Chiu-Kuang; Hsueh, A-Chuan
Primary Examiner(s): KEMENY, EMANUEL
Application Number: US07/335,142
Time in Patent Office: 732 Days
Field of Search: 381/47
US Class Current: 704/226
CPC Class Codes:
G10L 19/10 the excitation function bei...
G10L 21/0208 Noise filtering
