Signal noise reduction using magnitude-domain spectral subtraction

US 6,804,640 B1
Filed: 02/29/2000
Issued: 10/12/2004
Est. Priority Date: 02/29/2000
Status: Expired due to Term

Abstract

A method and apparatus for generating a noise-reduced feature vector representing human speech are provided. Speech data representing an input speech waveform are first input and filtered. Spectral energies of the filtered speech data are determined, and a noise reduction process is then performed. In the noise reduction process, a spectral magnitude is computed for a frequency index of multiple frequency indexes. A noise magnitude estimate is then determined for the frequency index by updating a histogram of spectral magnitude, and then determining the noise magnitude estimate as a predetermined percentile of the histogram. A signal-to-noise ratio is then determined for the frequency index. A scale factor is computed for the frequency index, as a function of the signal-to-noise ratio and the noise magnitude estimate. The noise magnitude estimate is then scaled by the scale factor. The scaled noise magnitude estimate is subtracted from the spectral magnitudes of the filtered speech data, to produce cleaned speech data, based on which a feature vector is generated.
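
The per-frequency-index steps in the abstract can be written compactly as follows. The symbols are assumptions introduced here for illustration and do not appear in the source: $|X(k)|$ is the spectral magnitude at frequency index $k$, $N(k)$ the noise magnitude estimate, $\mathrm{SNR}(k)$ the signal-to-noise ratio, $K(k)$ the scale factor, and $f$ an unspecified function standing in for "a function of the signal-to-noise ratio and the noise magnitude estimate".

```latex
\begin{aligned}
K(k)         &= f\bigl(\mathrm{SNR}(k),\, N(k)\bigr) \\
\tilde{N}(k) &= K(k)\, N(k) \\
\hat{S}(k)   &= |X(k)| - \tilde{N}(k)
\end{aligned}
```

Here $\tilde{N}(k)$ is the scaled noise magnitude estimate and $\hat{S}(k)$ is the cleaned spectral magnitude from which the feature vector is generated.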

8 Claims

1. A method of reducing noise in data representing a speech signal, the method comprising:
- inputting the data representing a speech signal;
  
  for each of a plurality of frequency components of the speech signal, computing a first scale factor as a function of an absolute noise level of noise associated with the speech signal such that the first scale factor is not a function of the speech signal, and computing a second scale factor as a function of a signal-to-noise ratio associated with the speech signal, wherein the first scale factor and the second scale factor are each based on a sigmoid function;
  
  for each of the plurality of frequency components of the speech signal, computing a noise scale factor from both the first scale factor and the second scale factor;
  
  for each of the plurality of frequency components of the speech signal, scaling a measure of noise in the speech signal by the corresponding noise scale factor; and
  
  for each of the plurality of frequency components, subtracting noise from the data based on the corresponding scaled measure of noise.

2. A method as recited in claim 1, wherein the noise scale factor is defined such that:

3. A method as recited in claim 1, wherein the noise scale factor K is defined based substantially on the formula:

4. A method as recited in claim 1, wherein the noise scale factor is defined such that during an initial stage of said removing, the noise scale factor is influenced primarily by the absolute noise level and not the signal-to-noise ratio.
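
As an illustration of claim 1, the two sigmoid-based scale factors might be sketched as below. This is a minimal sketch, not the patented formula: the formulas of claims 2 and 3 are not reproduced in this record, so the way the two factors are combined (a product), the direction of each sigmoid, and every midpoint and slope value are assumptions, as are all function and parameter names.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def noise_level_factor(noise_mag: float, midpoint: float = 1.0,
                       slope: float = 1.0) -> float:
    """First scale factor: depends only on the absolute noise level.

    ASSUMPTION: the factor grows with the noise level; midpoint and
    slope are hypothetical tuning values.
    """
    return sigmoid(slope * (noise_mag - midpoint))


def snr_factor(snr_db: float, midpoint: float = 10.0,
               slope: float = 0.5) -> float:
    """Second scale factor: depends on the signal-to-noise ratio.

    ASSUMPTION: the factor shrinks as SNR rises, so less noise is
    subtracted from frames dominated by speech.
    """
    return sigmoid(-slope * (snr_db - midpoint))


def noise_scale_factor(noise_mag: float, snr_db: float) -> float:
    """Overall noise scale factor for one frequency component.

    ASSUMPTION: the two factors are combined by multiplication; the
    claim only requires that both contribute.
    """
    return noise_level_factor(noise_mag) * snr_factor(snr_db)
```

With these assumed defaults, each factor lies between 0 and 1, so the combined factor never scales the noise measure above its estimated value.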

5. A method of generating a noise-reduced feature vector representing human speech, the method comprising:
- receiving speech data representing an input speech waveform;
  
  determining a plurality of spectral energies of the speech data;
  
  removing noise in the input speech waveform by, for each of a plurality of frequency components, computing a spectral magnitude, computing a noise magnitude estimate, determining a signal-to-noise ratio, computing an overall noise scale factor as a function of a first noise scale factor and a second noise scale factor, wherein the first noise scale factor is a function of an absolute noise level of noise associated with the speech signal but not a function of the speech signal, and wherein the second noise scale factor is a function of a signal-to-noise ratio associated with the speech signal, and wherein the first scale factor and the second scale factor are each based on a sigmoid function, scaling a noise magnitude estimate according to a product of the noise magnitude estimate and the overall noise scale factor, and modifying the spectral magnitude using the scaled noise magnitude estimate, to produce cleaned speech data; and
  
  generating a feature vector based on the cleaned speech data.
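
The scaling and modification step of claim 5 could look roughly like the following. This is a sketch under stated assumptions: the function name, the list-based representation of the frequency components, and the `floor` parameter (which keeps cleaned magnitudes from going negative) are hypothetical and not taken from the source.

```python
from typing import List, Sequence


def subtract_scaled_noise(magnitudes: Sequence[float],
                          noise_estimates: Sequence[float],
                          scale_factors: Sequence[float],
                          floor: float = 0.0) -> List[float]:
    """Subtract a scaled noise magnitude estimate per frequency component.

    Each argument holds one value per frequency component. The floor is
    an ASSUMPTION added here so that a magnitude never drops below it.
    """
    if not (len(magnitudes) == len(noise_estimates) == len(scale_factors)):
        raise ValueError("all inputs need one entry per frequency component")

    cleaned = []
    for mag, noise, k in zip(magnitudes, noise_estimates, scale_factors):
        scaled_noise = k * noise  # product of estimate and scale factor
        cleaned.append(max(mag - scaled_noise, floor))
    return cleaned
```

For example, `subtract_scaled_noise([4.0, 1.0], [2.0, 2.0], [0.5, 1.0])` subtracts 1.0 from the first component and clamps the second at the floor.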

6. A method of generating a noise-reduced feature vector representing human speech, the method comprising:
- receiving speech data representing an input speech waveform;
  
  filtering the speech data;
  
  determining a plurality of spectral energies of the filtered speech data;
  
  removing noise in the input speech waveform based on the spectral energies by, for each of a plurality of frequency indexes computing a spectral magnitude for the frequency index based on the spectral energy for the frequency index;
  
  computing a noise magnitude estimate for the frequency index by updating a histogram of spectral magnitude, and determining the noise magnitude estimate as a predetermined percentile of the histogram;
  
  determining a signal-to-noise ratio for the frequency index;
  
  computing a noise scale factor for the frequency index, including computing a first scale factor based on the signal-to-noise ratio, computing a second scale factor based on the noise magnitude estimate, and computing the noise scale factor as a function of the first scale factor and the second scale factor;
  
  multiplying the noise magnitude estimate by the noise scale factor to produce a scaled noise magnitude estimate, and subtracting the scaled noise magnitude estimate from the spectral magnitude, to produce cleaned speech data; and
  
  generating a feature vector based on the cleaned speech data.
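
The histogram-based noise estimate in claim 6 can be illustrated with a small sketch. It assumes one fixed-width histogram per frequency index; the class name, bin count, magnitude range, and the 20th-percentile default are hypothetical choices, since the source only says "a predetermined percentile".

```python
class HistogramNoiseEstimator:
    """Running histogram of spectral magnitudes for one frequency index."""

    def __init__(self, num_bins: int = 50, max_magnitude: float = 100.0,
                 percentile: float = 0.2) -> None:
        # ASSUMPTION: fixed-width bins covering [0, max_magnitude].
        self.num_bins = num_bins
        self.max_magnitude = max_magnitude
        self.percentile = percentile
        self.bin_width = max_magnitude / num_bins
        self.counts = [0] * num_bins
        self.total = 0

    def update(self, magnitude: float) -> None:
        """Add one spectral magnitude observation to the histogram."""
        idx = int(magnitude / self.bin_width)
        idx = max(0, min(idx, self.num_bins - 1))  # clamp out-of-range values
        self.counts[idx] += 1
        self.total += 1

    def estimate(self) -> float:
        """Return the bin centre at the configured percentile."""
        if self.total == 0:
            return 0.0
        target = self.percentile * self.total
        cumulative = 0
        for idx, count in enumerate(self.counts):
            cumulative += count
            if cumulative >= target:
                return (idx + 0.5) * self.bin_width
        return self.max_magnitude
```

A low percentile tends to track the quieter frames, which is one plausible reason to read the noise level from the lower part of the magnitude distribution; the source does not state which percentile is used.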

7. An apparatus for reducing noise in data representing an audio signal, the apparatus comprising:
- means for inputting the data representing an audio signal;
  
  means for computing, for each of a plurality of frequency components of the speech signal, a first scale factor as a function of an absolute noise level of noise associated with the speech signal and a second scale factor as a function of a signal-to-noise ratio associated with the speech signal, wherein the first scale factor and the second scale factor are each based on a sigmoid function;
  
  means for computing, for each of the plurality of frequency components of the speech signal, a noise scale factor from both the first scale factor and the second scale factor;
  
  means for scaling, for each of a plurality of frequency components of the audio signal, a measure of noise in the audio signal by the noise scale factor; and
  
  means for subtracting, for each of the plurality of frequency components, noise from the data based on the corresponding scaled measure of noise.

8. A speech recognition system comprising:
- a dictionary model;
  
  an acoustic model;
  
  a grammar model;
  
  front end circuitry configured to generate feature vectors based on input speech data in an input speech signal, the front end further configured to remove noise from the input speech data for purposes of generating the feature vectors by, computing a spectral magnitude of the speech data for each of a plurality of frequency components of the speech data, computing an estimate of noise in the data for each of the plurality of frequency components, computing a first noise scale factor as function of an absolute noise level of noise associated with the speech signal but not as a function of the speech signal, wherein the first scale factor is based on a sigmoid function, computing a second noise scale factor as a function of a signal-to-noise ratio associated with the speech signal, wherein the second scale factor is based on a sigmoid function, computing an overall noise scale factor as a function of the first noise scale factor and the second noise scale factor, scaling the estimate of noise by the overall scale factor to produce a scaled noise estimate, for each of the plurality of frequency components, subtracting the scaled noise estimate from the spectral magnitude to produce cleaned speech data for each of the plurality of frequency components, and generating the feature vectors based on the cleaned speech data;
  
  a decoder configured to receive said one or more feature vectors and to output data representing recognized speech based on the feature vectors and contents of the dictionary model, the acoustic model, and the grammar model.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Beaufays, Francoise; Weintraub, Mitchel
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Lerner, Martin
Application Number: US09/515,252
Time in Patent Office: 1,687 Days
Field of Search: 704/224, 704/225, 704/226, 704/227, 704/228, 704/205, 704/233, 381/94.1, 381/94.2, 381/94.3, 381/94.7
US Class Current: 704/226
CPC Class Codes:
  G10L 15/20   Speech recognition techniqu...
  G10L 19/0204   using subband decomposition
  G10L 21/0208   Noise filtering
  G10L 21/0216   characterised by the method...
  H04L 1/20   using signal quality detector
