Method for suppressing noise in a digital speech signal

US 6,477,489 B1
Filed: 06/05/2000
Issued: 11/05/2002
Est. Priority Date: 09/18/1997
Status: Expired due to Fees

Abstract

A spectral subtraction is effected including: a first subtraction step in which overestimates of the spectral component of the noise are taken into account, to obtain spectral components of a first noise-suppressed signal; the computation of a masking curve by applying an auditory perception model on the basis of the spectral components of the first noise-suppressed signal; and a second subtraction step in which a respective quantity depending on parameters including a difference between the overestimate of the corresponding spectral component of the noise and the computed masking curve is subtracted from each spectral component of the speech signal in the frame. The result of the spectral subtraction is transformed into the time domain to construct a noise-suppressed speech signal.
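
The two-step subtraction summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the magnitude-domain arithmetic, the flooring of the first subtraction at zero, and especially `toy_masking_curve` (a crude smoothing stand-in for a real auditory perception model) are all hypothetical choices made for the example. The second quantity follows the "lower of the first quantity and a fraction of the excess over the masking curve" rule stated in claim 2.

```python
import numpy as np


def toy_masking_curve(power, spread=2, offset_db=-10.0):
    """Hypothetical stand-in for an auditory perception model.

    Smooths the power spectrum over neighbouring bins and lowers it by a
    fixed offset. A real system would use a psychoacoustic model instead.
    """
    kernel = np.ones(2 * spread + 1) / (2 * spread + 1)
    smoothed = np.convolve(power, kernel, mode="same")
    return smoothed * 10.0 ** (offset_db / 10.0)


def two_step_subtraction(frame, noise_over, fraction=1.0):
    """Denoise one frame.

    noise_over holds one noise magnitude overestimate per rfft bin
    (how it is obtained is outside the scope of this sketch).
    Returns the time-domain frame and the second subtracted quantities.
    """
    spectrum = np.fft.rfft(frame)
    mag = np.abs(spectrum)

    # First subtraction (assumed form): remove the overestimate, floor at zero.
    first_qty = np.minimum(noise_over, mag)
    first_mag = mag - first_qty

    # Masking curve computed from the first noise-suppressed signal.
    mask = np.sqrt(toy_masking_curve(first_mag ** 2))

    # Second subtraction: only the part of the overestimate that exceeds
    # the masking curve is removed, capped by the first quantity.
    excess = np.maximum(noise_over - mask, 0.0)
    second_qty = np.minimum(first_qty, fraction * excess)
    out_mag = mag - second_qty

    # Reuse the noisy phase and transform back to the time domain.
    phase = np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(out_mag * phase, n=len(frame)), second_qty
```

Note that the second quantity is subtracted from the original spectral components, not from the output of the first step; the first step only serves to produce the signal from which the masking curve is derived.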

64 Citations

21 Claims

1. Method of suppressing noise in a digital speech signal processed by successive frames, comprising the steps of:
- computing spectral components of the speech signal of each frame;
- computing, for each frame, overestimates of spectral components of noise included in the speech signal; and
- performing a spectral subtraction including a first subtraction step in which a respective first quantity dependent on parameters including the overestimate of a corresponding spectral component of the noise for said frame is subtracted from each spectral component of the speech signal of the frame, to obtain spectral components of a first noise-suppressed signal;
- computing a masking curve by applying an auditory perception model on the basis of the spectral components of the first noise-suppressed signal;
- comparing the overestimates of the spectral components of the noise for the frame to the computed masking curve; and
- a second subtraction step in which a respective second quantity depending on parameters including a difference between the overestimate of the corresponding spectral component of the noise and the computed masking curve is subtracted from each spectral component of the speech signal of the frame.
2. Method according to claim 1, wherein said second quantity relating to a spectral component of the speech signal of the frame is substantially equal to whichever is the lower of the corresponding first quantity and a fraction of the overestimate of the corresponding spectral component of the noise which exceeds the masking curve.
3. Method according to claim 1, comprising the step of performing a harmonic analysis of the speech signal to estimate a pitch frequency of the speech signal in each frame in which the speech signal features vocal activity.
4. Method according to claim 3, wherein the parameters on which the first subtracted quantities depend include the estimated pitch frequency.
5. Method according to claim 4, wherein the first quantity subtracted from a spectral component of the speech signal is lower if said spectral component corresponds to a frequency closest to an integer multiple of the estimated pitch frequency than if said spectral component does not correspond to a frequency closest to an integer multiple of the estimated pitch frequency.
6. Method according to claim 4, wherein the respective quantities subtracted from the spectral components of the speech signal corresponding to frequencies closest to integer multiples of the estimated pitch frequency are substantially zero.
7. Method according to claim 3, wherein, after estimating the pitch frequency of the speech signal in a frame, the speech signal of the frame is conditioned by oversampling the speech signal at an oversampling frequency which is a multiple of the estimated pitch frequency and the spectral components of the speech signal are computed for the frame on the basis of the conditioned signal to subtract said quantities therefrom.
8. Method according to claim 7, wherein spectral components of the speech signal are computed by distributing the conditioned signal into blocks of N samples transformed into the frequency domain and wherein the ratio between the oversampling frequency and the estimated pitch frequency is a factor of the number N.
9. Method according to claim 7, wherein a degree of voicing of the speech signal is estimated for the frame on the basis of an entropy of an autocorrelation of the spectral components computed on the basis of the conditioned signal.
10. Method according to claim 9, wherein said spectral components whose autocorrelation is computed are those computed on the basis of the conditioned signal after subtraction of said first quantities.
11. Method according to claim 9, wherein the degree of voicing is measured on the basis of a normalized entropy of the form:
  $H = \frac{\sum_{k = 0}^{N / 2 - 1} A (k) \cdot \log [A (k)]}{\log (N / 2)}$
12. Method according to claim 11, wherein the computation of the masking curve uses the degree of voicing measured by the normalized entropy H.
13. Method according to claim 3, wherein, after processing each frame, a number of the samples of the noise-suppressed speech signal supplied by such processing is retained which is equal to an integer multiple of a ratio between the sampling frequency and the estimated pitch frequency.
14. Method according to claim 3, wherein the estimation of the pitch frequency of the speech signal over a frame includes the steps of:
- estimating time intervals between two consecutive breaks of the signal which can be attributed to glottal closures of the speaker occurring during the frame, the estimated pitch frequency being inversely proportional to said time intervals; and
- interpolating the speech signal in said time intervals so that the conditioned signal resulting from such interpolation has a constant time interval between two consecutive breaks.
15. Method according to claim 14, wherein, after processing each frame, a number of the noise-suppressed speech signal samples supplied by such processing is retained which corresponds to an integer number of estimated time intervals.
16. Method according to claim 1, wherein values of a signal-to-noise ratio of the speech signal are estimated in the spectral domain for each frame and the parameters on which the first subtracted quantities depend include the estimated values of the signal-to-noise ratio, the first quantity subtracted from each spectral component of the speech signal in the frame being a decreasing function of the corresponding estimated value of the signal-to-noise ratio.
17. Method according to claim 16, wherein said function decreases toward zero for the highest values of the signal-to-noise ratio.
18. Method according to claim 1, further comprising the step of subjecting a result of the spectral subtraction to a transformation to the time domain to construct a noise-suppressed speech signal.
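
As an illustration of how the dependent claims shape the first subtracted quantity, the sketch below combines the behaviour described in claims 16 and 17 (a quantity that decreases with the estimated signal-to-noise ratio and tends to zero at high SNR) with that of claim 6 (substantially zero subtraction at the bins closest to integer multiples of the estimated pitch frequency). The function name, the linear ramp, and the parameters `alpha_max` and `snr_knee_db` are assumptions chosen for the example; the claims do not prescribe a particular function.

```python
import numpy as np


def first_quantities(noise_over, snr_db, pitch_hz, sample_rate, n_fft,
                     alpha_max=2.0, snr_knee_db=20.0):
    """Per-bin first subtracted quantity for one voiced frame (illustrative).

    noise_over and snr_db are arrays with one value per rfft bin.
    alpha_max and snr_knee_db are hypothetical tuning parameters.
    """
    # Claims 16-17: the weight decreases with SNR and reaches zero
    # once the SNR is at or above snr_knee_db (assumed linear ramp).
    alpha = alpha_max * np.clip(1.0 - snr_db / snr_knee_db, 0.0, 1.0)
    qty = alpha * noise_over

    # Claim 6: bins closest to integer multiples of the pitch frequency
    # are left untouched (subtracted quantity set to zero).
    bin_hz = sample_rate / n_fft
    harmonics = np.arange(pitch_hz, sample_rate / 2, pitch_hz)
    harmonic_bins = np.rint(harmonics / bin_hz).astype(int)
    harmonic_bins = harmonic_bins[harmonic_bins < len(noise_over)]
    qty[harmonic_bins] = 0.0
    return qty
```

For example, with an 8 kHz sampling rate, a 256-point transform and a 200 Hz pitch estimate, the protected bins are those nearest 200 Hz, 400 Hz, and so on up to just below 4 kHz.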

19. Device for suppressing noise in a digital speech signal processed by successive frames, comprising:
- means for computing spectral components of the speech signal for each frame;
- means for computing, for each frame, overestimates of spectral components of noise included in the speech signal; and
- spectral subtraction means including:
  - first subtraction means to subtract, from each spectral component of the speech signal of the frame, a respective first quantity dependent on parameters including the overestimate of a corresponding spectral component of the noise for said frame, to obtain spectral components of a first noise-suppressed signal;
  - means for computing a masking curve by applying an auditory perception model on the basis of the spectral components of the first noise-suppressed signal;
  - means for comparing the overestimates of the spectral components of the noise for the frame to the computed masking curve; and
  - second subtraction means to subtract, from each spectral component of the speech signal of the frame, a respective second quantity depending on parameters including a difference between the overestimate of the corresponding spectral component of the noise and the computed masking curve.
20. Device according to claim 19, wherein said second quantity relating to a spectral component of the speech signal of the frame is substantially equal to whichever is the lower of the corresponding first quantity and a fraction of the overestimate of the corresponding spectral component of the noise which exceeds the masking curve.
21. Device according to claim 19, further comprising harmonic analysis means for estimating a pitch frequency of the speech signal in each frame in which said speech signal features vocal activity, and wherein the parameters on which the first subtracted quantities depend include the estimated pitch frequency.

Current Assignee: Nortel Networks Corporation
Original Assignee: Nortel Networks Corporation
Inventors: Lubiarz, Stéphane; Lockwood, Philip
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): Lerner, Martin
Application Number: US09/509,145
Time in Patent Office: 883 Days
Field of Search: 704/200.1, 704/205, 704/207, 704/210, 704/226, 704/227, 704/228, 704/208, 381/94.1, 381/94.2, 381/94.3, 381/94.7
US Class Current: 704/200.1
CPC Class Codes:
- G10L 21/0208   Noise filtering
- G10L 21/0232   Processing in the frequency...
- G10L 21/0264   characterised by the type o...
