Integrated frequency-domain voice coding using an adaptive spectral enhancement filter

US 6,070,137 A
Filed: 01/07/1998
Issued: 05/30/2000
Est. Priority Date: 01/07/1998
Status: Expired due to Term

Abstract

A system for encoding voice while suppressing acoustic background noise and a method for suppressing acoustic background noise in a voice encoder are described herein. The voice encoder includes a sampler that captures frames of time-domain samples of an audio signal. A voice activity detector operatively coupled to the sampler determines presence or absence of speech in the current frame. A transformer is operatively coupled to the sampler for transforming the frame of time-domain audio samples into an estimate of the power spectrum of that frame. A noise model adapter operatively associated with the transformer updates a frequency-domain noise model based on the power spectrum estimate of the current frame if the voice activity detector indicates an absence of speech in this frame. A filter computation block operatively coupled to the noise model adapter and the transformer computes a spectral enhancement (noise suppression) filter based on the current power spectrum estimate and the adapted noise model. A spectral enhancement block operatively coupled to the transformer and the filter computation block applies the spectral enhancement filter to the current power spectrum estimate. A quantizer and encoder block transforms the voice encoder model parameters, including the enhanced spectral magnitudes, into a frame of encoded bits.
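
The per-frame flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the recursive-average update with weight `alpha`, the spectral-subtraction-style gain rule, and the gain `floor` are all assumptions made for illustration; the abstract only states that the filter is computed from the current power spectrum estimate and the adapted noise model. The voice activity decision is taken as an input flag.

```python
import math

def update_noise_model(noise_psd, frame_psd, speech_present, alpha=0.9):
    """Adapt the noise model only when the VAD reports no speech.

    The recursive average and alpha are illustrative assumptions.
    """
    if speech_present:
        return list(noise_psd)
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_psd, frame_psd)]

def suppression_filter(frame_psd, noise_psd, floor=0.1):
    """Per-bin gain in [floor, 1] from the frame PSD and the noise PSD.

    Assumed rule: gain = 1 - noise/frame, clamped below by floor.
    """
    gains = []
    for p, n in zip(frame_psd, noise_psd):
        g = 1.0 - n / p if p > 0.0 else floor
        gains.append(max(floor, g))
    return gains

def enhance_frame(frame_psd, noise_psd, speech_present):
    """Return (spectral magnitudes, updated noise model) for one frame."""
    noise_psd = update_noise_model(noise_psd, frame_psd, speech_present)
    gains = suppression_filter(frame_psd, noise_psd)
    enhanced_psd = [g * p for g, p in zip(gains, frame_psd)]
    magnitudes = [math.sqrt(v) for v in enhanced_psd]
    return magnitudes, noise_psd
```

Calling `enhance_frame` once per frame and feeding the returned noise model into the next call mirrors the order of blocks in the abstract: adapt the model, compute the filter, apply it, then hand the magnitudes to the quantizer.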

43 Claims

1. A system for encoding voice with integrated noise suppression, comprising:
- a sampler which converts an analog audio signal into frames of time-domain audio samples;
  
  a voice activity detector operatively coupled to the sampler for determining presence or absence of speech in a current frame;
  
  a transformer operatively coupled to the sampler for transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  a noise model adapter operatively associated with the voice activity detector and the transformer for updating a noise model using a current frame if the voice activity detector determines there is an absence of speech;
  
  a transformer and filter creator operatively coupled to the transformer and the noise model adaptor to create a noise suppression filter; and
  
  a spectral estimator operatively coupled to the transformer and the transformer and filter creator to remove noise characteristics from the frequency-domain representation of the current frame using the noise suppression filter and to develop a set of spectral magnitudes.
  - 2. The system of claim 1 wherein said transformer comprises a Discrete Fourier Transform (DFT) that computes a complex spectrum at uniformly spaced discrete frequency points from the frame of audio samples.
  - 3. The system of claim 2 wherein said DFT is computed with a Fast Fourier Transform.
  - 4. The system of claim 1 wherein an output of the transformer comprises a sampled PSD estimate and wherein the transformer and filter creator comprises:
    - a transform pair for converting between a domain of the noise model adaptor and the domain of the sampled PSD estimate;
      
      a variance reducer for smoothing the sampled PSD estimate of the current audio frame; and
      
      a filter creator for computing a noise suppression filter.
  - 5. The system of claim 4 wherein the filter creator computes said noise suppression filter using the PSD estimate of the noise and the PSD estimate of the current frame.
  - 6. The system of claim 4 wherein the variance reducer smooths the PSD estimate of the current frame in the frequency domain before being used to compute the noise suppression filter.
  - 7. The system of claim 6 wherein the variance reducer smooths the PSD estimate of the current frame using a moving average filter operating on the PSD estimate.
  - 8. The system of claim 1 wherein the noise model adaptor stores a vector of noise model parameters.
  - 9. The system of claim 8 wherein the noise model parameters are stored in the same format as a sampled PSD estimate of the current frame output from the transformer.
  - 10. The system of claim 9 wherein the noise model is stored using the same number of points as the PSD estimate, but wherein the value stored represents square roots of the values actually used in the PSD estimate.
  - 11. The system of claim 9 wherein the noise model is stored using the same number of points as the PSD estimate, but wherein the values stored represent the logarithms of the values used in the PSD estimate.
  - 12. The system of claim 9 wherein the noise model is comprised of a set of spectral magnitudes, said magnitudes being equally spaced in the frequency domain and the set comprising a smaller number of magnitudes than the PSD estimate.
  - 13. The system of claim 9 wherein the noise model is comprised of a set of spectral magnitudes, the magnitudes being logarithmically spaced in the frequency domain and the set comprising a smaller number of magnitudes than the PSD estimate.
  - 14. The system of claim 8 wherein the vector of noise model parameters is comprised of a time domain model such as an autocorrelation function (ACF) or a set of linear prediction coefficients (LPCs).
  - 15. The system of claim 1 wherein the noise model adaptor is operative to provide long-term smoothing of noise model parameters.
  - 16. The system of claim 15 wherein said smoothing is implemented by means of an auto-regressive, moving average, or a combination auto-regressive moving average filter.
  - 17. The system of claim 1 wherein the spectral estimator includes a spectral enhancer which applies a noise suppression filter to a PSD estimate of the current audio frame, creating an enhanced PSD estimate.
  - 18. The system of claim 17 wherein the spectral estimator includes a spectral magnitude estimator which accepts as input the enhanced PSD estimate and computes a set of spectral magnitudes.
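
Claims 2 through 7 describe a DFT-based PSD estimate that is smoothed by a moving average before the filter is computed. The following Python sketch shows one way this could look. The periodogram scaling of 1/N, the circular wrap at the band edges, and the names `psd_estimate`, `moving_average`, and `half_width` are assumptions, not details taken from the claims. A naive DFT is used so the block has no dependencies; an FFT (claim 3) computes the same values faster.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) DFT at N uniformly spaced frequency points."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]

def psd_estimate(samples):
    """Periodogram: squared magnitude of each DFT bin, scaled by 1/N (assumed)."""
    n = len(samples)
    return [abs(c) ** 2 / n for c in dft(samples)]

def moving_average(psd, half_width=1):
    """Smooth across frequency with a centred window of 2*half_width + 1 taps.

    Indices wrap circularly, which is valid because a DFT spectrum is periodic.
    """
    n = len(psd)
    taps = 2 * half_width + 1
    return [
        sum(psd[(k + j) % n] for j in range(-half_width, half_width + 1)) / taps
        for k in range(n)
    ]
```

Smoothing trades frequency resolution for lower variance in the estimate: a pure tone that occupies one bin is spread over its neighbours, while the total power across all bins is unchanged.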

19. A system for encoding voice with integrated noise suppression, comprising:
- a sampler which converts an analog audio signal into frames of time-domain audio samples;
  
  a voice activity detector operatively coupled to the sampler for determining presence or absence of speech in a current frame;
  
  a transformer operatively coupled to the sampler for transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  a noise model adapter operatively associated with the voice activity detector and the transformer for updating a noise model using a current frame if the voice activity detector determines there is an absence of speech;
  
  a transformer and filter creator operatively coupled to the transformer and the noise model adaptor to create a noise suppression filter;
  
  a spectral estimator operatively coupled to the transformer and the noise model adaptor to remove noise characteristics from the frequency-domain representation of the current frame and to develop a set of spectral magnitudes; and
  
  a quantizer and encoder for transforming the developed spectral magnitudes into a frame of encoded bits.
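
Claim 19 does not say how the quantizer and encoder work. As a purely hypothetical illustration, the sketch below applies uniform scalar quantization to each magnitude on a decibel scale and concatenates the indices into a bit string. The bit width, the decibel range, and the function names are all assumptions; a practical voice coder would likely use a more elaborate scheme.

```python
import math

def quantize_magnitudes(magnitudes, bits=5, lo_db=-20.0, hi_db=80.0):
    """Map each magnitude to an integer index on a uniform dB grid (assumed)."""
    levels = (1 << bits) - 1
    indices = []
    for m in magnitudes:
        db = 20.0 * math.log10(max(m, 1e-10))   # guard against log of zero
        db = min(max(db, lo_db), hi_db)         # clamp to the coded range
        indices.append(round((db - lo_db) / (hi_db - lo_db) * levels))
    return indices

def pack_bits(indices, bits=5):
    """Concatenate fixed-width binary codes into one frame of bits."""
    return "".join(format(i, "0{}b".format(bits)) for i in indices)

def dequantize(indices, bits=5, lo_db=-20.0, hi_db=80.0):
    """Decoder side: recover approximate magnitudes from the indices."""
    levels = (1 << bits) - 1
    return [10.0 ** ((lo_db + i / levels * (hi_db - lo_db)) / 20.0) for i in indices]
```

For example, `pack_bits(quantize_magnitudes([1.0, 10.0, 1e5]))` yields a 15-bit string, with the third magnitude clamped to the top of the assumed range.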

20. A system for encoding voice with integrated noise suppression, comprising:
- a sampler which converts an analog audio signal into frames of time-domain audio samples;
  
  a voice activity detector operatively coupled to the sampler for determining presence or absence of speech in a current frame;
  
  a transformer operatively coupled to the sampler for transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  a noise model adapter operatively associated with the voice activity detector and the transformer for updating a noise model using a current frame if the voice activity detector determines there is an absence of speech;
  
  a transformer and filter creator operatively coupled to the transformer and the noise model adaptor to create a noise suppression filter; and
  
  a spectral estimator operatively coupled to the transformer and the noise model adaptor to remove noise characteristics from the frequency-domain representation of the current frame and to develop a set of spectral magnitudes, wherein the system comprises a multi-band excitation voice encoder.

21. A system for encoding voice with integrated noise suppression, comprising:
- a sampler which converts an analog audio signal into frames of time-domain audio samples;
  
  a voice activity detector operatively coupled to the sampler for determining presence or absence of speech in a current frame;
  
  a transformer operatively coupled to the sampler for transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  a noise model adapter operatively associated with the voice activity detector and the transformer for updating a noise model using a current frame if the voice activity detector determines there is an absence of speech;
  
  a transformer and filter creator operatively coupled to the transformer and the noise model adaptor to create a noise suppression filter; and
  
  a spectral estimator operatively coupled to the transformer and the noise model adaptor to remove noise characteristics from the frequency-domain representation of the current frame using the noise suppression filter and to develop a set of spectral magnitudes, wherein the system comprises a sinusoidal transform voice encoder.

22. A system for encoding voice with integrated noise suppression, comprising:
- a sampler which converts an analog audio signal into frames of time-domain audio samples;
  
  a voice activity detector operatively coupled to the sampler for determining presence or absence of speech in a current frame;
  
  a transformer operatively coupled to the sampler for transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  a noise model adapter operatively associated with the voice activity detector and the transformer for updating a noise model using a current frame if the voice activity detector determines there is an absence of speech, the noise model adapter storing a vector of noise model parameters;
  
  a transformer and filter creator operatively coupled to the transformer and the noise model adaptor to create a noise suppression filter; and
  
  a spectral estimator operatively coupled to the transformer and the noise model adaptor to remove noise characteristics from the frequency-domain representation of the current frame and to develop a set of spectral magnitudes, wherein the voice encoder comprises a multi-band excitation (MBE) voice encoder and wherein the noise model is stored in the same format as the spectral magnitudes of the MBE model.

23. A system for encoding voice with integrated noise suppression, comprising:
- a sampler which converts an analog audio signal into frames of time-domain audio samples;
  
  a detector operatively coupled to the sampler for determining presence or absence of speech in a current frame;
  
  a transformer operatively coupled to the sampler for transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  a noise model adapter operatively associated with the voice activity detector and the transformer for updating a noise model using a current frame if the voice activity detector determines there is an absence of speech;
  
  a transformer and filter creator operatively coupled to the transformer and the noise model adapter to convert between a domain of the noise model adapter and the frequency-domain representation and to create a noise suppression filter;
  
  a spectral estimator operatively coupled to the transformer and the noise model adaptor to remove noise characteristics from the frequency-domain representation of the current frame using the noise suppression filter; and
  
  an encoder transformer coupled to the spectral estimator for transforming the frequency-domain representation of the current frame, having noise characteristics removed, into a frame of encoded bits.
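
Claim 23 requires converting between the domain of the noise model and the frequency-domain representation. If the noise model were held as an autocorrelation function (one of the time-domain options named in claims 14 and 37), a DFT pair would serve as that conversion, by the Wiener-Khinchin relation. The sketch below assumes a circular, symmetric ACF of the same length as the PSD; the function names are hypothetical.

```python
import cmath
import math

def acf_to_psd(acf):
    """Forward transform: DFT of a circular ACF gives the sampled PSD.

    The ACF is assumed symmetric (acf[m] == acf[N - m]), so the result is
    real; .real only discards rounding residue.
    """
    n = len(acf)
    return [
        sum(r * cmath.exp(-2j * math.pi * k * m / n) for m, r in enumerate(acf)).real
        for k in range(n)
    ]

def psd_to_acf(psd):
    """Inverse transform: inverse DFT of the sampled PSD gives the ACF."""
    n = len(psd)
    return [
        sum(p * cmath.exp(2j * math.pi * k * m / n) for k, p in enumerate(psd)).real / n
        for m in range(n)
    ]
```

With this pair, the adapter can update its model in the ACF domain while the filter creator works on PSD values, and a round trip through both functions returns the original vector up to rounding.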

24. A method of suppressing noise in a voice encoder, comprising the steps of:
- converting a received analog audio signal into frames of time-domain audio samples;
  
  determining presence or absence of speech in a current frame of the time-domain audio samples;
  
  transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  updating a noise model using the transformed current frame if there is an absence of speech;
  
  creating a noise suppression filter from the frequency-domain representation; and
  
  removing noise characteristics from the frequency-domain representation of the current frame using the noise suppression filter and developing a set of spectral magnitudes.
  - 25. The method of claim 24 wherein said transforming step uses a Discrete Fourier Transform (DFT) that computes a complex spectrum at uniformly spaced discrete frequency points from the frame of audio samples.
  - 26. The method of claim 25 wherein said DFT is computed with a Fast Fourier Transform.
  - 27. The method of claim 24 wherein the transforming step develops a sampled PSD estimate and wherein the creating step uses:
    - a transform pair for converting between the domain of the noise model and the domain of the sampled PSD estimate;
      
      a variance reducer for smoothing the sampled PSD estimate of the current frame; and
      
      a filter creator for computing a noise suppression filter.
  - 28. The method of claim 27 wherein the filter creator computes said noise suppression filter using the PSD estimate of noise and the PSD estimate of the current frame.
  - 29. The method of claim 27 wherein the variance reducer smooths the PSD estimate of the current frame in the frequency domain before being used to compute the noise suppression filter.
  - 30. The method of claim 29 wherein the variance reducer smooths the PSD estimate of the current frame using a moving average filter operating on the PSD estimate.
  - 31. The method of claim 24 wherein the updating step stores a vector of noise model parameters.
  - 32. The method of claim 31 wherein the noise model parameters are stored in the same format as a sampled PSD estimate of the current audio frame developed in the transforming step.
  - 33. The method of claim 32 wherein the noise model is stored using the same number of points as the PSD estimate, but wherein the value stored represents square roots of the values actually used in the PSD estimate.
  - 34. The method of claim 32 wherein the noise model is stored using the same number of points as the PSD estimate, but wherein the values stored represent the logarithms of the values used in the PSD estimate.
  - 35. The method of claim 32 wherein the noise model is a set of spectral magnitudes, said magnitudes being equally spaced in the frequency domain and the set comprising a smaller number of magnitudes than the PSD estimate.
  - 36. The method of claim 32 wherein the noise model is a set of spectral magnitudes, the magnitudes being logarithmically spaced in the frequency domain and the set comprising a smaller number of magnitudes than the PSD estimate.
  - 37. The method of claim 31 wherein the vector of noise model parameters is comprised of a time domain model such as an auto-correlation function (ACF) or a set of linear prediction coefficients (LPCs).
  - 38. The method of claim 24 wherein the updating step provides long-term smoothing of noise model parameters.
  - 39. The method of claim 38 wherein said smoothing is implemented by means of an auto-regressive, moving average, or a combination auto-regressive moving average filter.
  - 40. The method of claim 24 wherein the removing step uses a spectral enhancer which applies a noise suppression filter to a PSD estimate of the current audio frame, creating an enhanced PSD estimate.
  - 41. The method of claim 40 wherein the spectral estimator includes a spectral magnitude estimator which accepts as input the enhanced PSD estimate and computes a set of spectral magnitudes.
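
Claims 33 through 35 list alternative storage formats for the noise model: square roots of the PSD values, logarithms of the PSD values, or a smaller set of equally spaced magnitudes. The Python sketch below shows one possible encode and decode pair for each. The function names, the natural logarithm, the `eps` guard, and the use of band-averaged power for the reduced set are assumptions; the claims do not fix these details.

```python
import math

def to_sqrt_domain(psd):
    """Store square roots of the PSD values (same number of points)."""
    return [math.sqrt(p) for p in psd]

def from_sqrt_domain(stored):
    return [s * s for s in stored]

def to_log_domain(psd, eps=1e-12):
    """Store logarithms of the PSD values; eps avoids log(0) (assumed guard)."""
    return [math.log(max(p, eps)) for p in psd]

def from_log_domain(stored):
    return [math.exp(s) for s in stored]

def to_equal_bands(psd, num_bands):
    """Reduce to num_bands equally spaced magnitudes.

    Each magnitude is the square root of the mean power in its band;
    len(psd) is assumed to be divisible by num_bands.
    """
    size = len(psd) // num_bands
    return [math.sqrt(sum(psd[b * size:(b + 1) * size]) / size) for b in range(num_bands)]

def from_equal_bands(bands, num_points):
    """Expand band magnitudes back to a PSD by repeating each band's power."""
    size = num_points // len(bands)
    return [v * v for v in bands for _ in range(size)]
```

The square-root and logarithmic formats compress the dynamic range of the stored values, while the reduced band set lowers memory use at the cost of frequency resolution in the noise model.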

42. A method of suppressing noise in a voice encoder, comprising the steps of:
- converting a received analog audio signal into frames of time-domain audio samples;
  
  determining presence or absence of speech in a current frame of the time-domain audio samples;
  
  transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  updating a noise model using the transformed current frame if there is an absence of speech;
  
  creating a noise suppression filter from the frequency-domain representation;
  
  removing noise characteristics from the frequency-domain representation of the current frame and developing a set of spectral magnitudes; and
  
  transforming the developed spectral magnitudes into a frame of encoded bits.

43. A method of suppressing noise in a voice encoder, comprising the steps of:
- converting a received analog audio signal into frames of time-domain audio samples;
  
  determining presence or absence of speech in a current frame of the time-domain audio samples;
  
  transforming the frame of time-domain audio samples to a frequency-domain representation;
  
  updating a noise model using the transformed current frame if there is an absence of speech, wherein the updating step stores a vector of noise model parameters;
  
  creating a noise suppression filter from the frequency-domain representation; and
  
  removing noise characteristics from the frequency-domain representation of the current frame and developing a set of spectral magnitudes, wherein the voice encoder comprises a multi-band excitation (MBE) voice encoder and wherein the noise model is stored in the same format as the spectral magnitudes of the MBE model.
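
Claims 22 and 43 store the noise model in the same format as the MBE spectral magnitudes, which are values associated with harmonics of the pitch frequency. The claims do not say how those magnitudes are derived. As one hypothetical approach, the sketch below sums PSD energy in a band centred on each harmonic and takes the square root. The function name, the band edges at half a fundamental on either side, and the expression of the fundamental in units of DFT bins are all assumptions.

```python
import math

def harmonic_magnitudes(psd_half, f0_bins):
    """Reduce a half-spectrum PSD to one magnitude per pitch harmonic.

    psd_half: PSD values for the non-negative frequency bins.
    f0_bins:  fundamental frequency expressed in bins (may be fractional).
    Harmonic l covers bins in [(l - 0.5) * f0_bins, (l + 0.5) * f0_bins).
    """
    mags = []
    harmonic = 1
    while (harmonic + 0.5) * f0_bins <= len(psd_half):
        lo = math.ceil((harmonic - 0.5) * f0_bins)
        hi = math.ceil((harmonic + 0.5) * f0_bins)
        mags.append(math.sqrt(sum(psd_half[lo:hi])))
        harmonic += 1
    return mags
```

Holding the noise model in this per-harmonic form would let it be compared directly against the encoder's own magnitudes, at the cost of tying the model's resolution to the pitch estimate.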

Current Assignee: Ericsson, Inc. (Telefonaktiebolaget LM Ericsson)
Original Assignee: Ericsson, Inc. (Telefonaktiebolaget LM Ericsson)
Inventors: Johnson, Phillip M., Bloebaum, Leland S.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Smits, Talivaldis Ivars
Application Number: US09/003,967
Time in Patent Office: 874 Days
Field of Search: 704/205, 704/226, 704/227, 381/94.3
US Class Current: 704/227
CPC Class Codes:
G10L 19/02   using spectral analysis, e....
G10L 19/10   the excitation function bei...
G10L 21/0208   Noise filtering
