Speech detection with noise suppression based on principal components analysis

US 6,230,122 B1
Filed: 10/21/1998
Issued: 05/08/2001
Est. Priority Date: 09/09/1998
Status: Expired due to Fees

Abstract

A method for effectively suppressing background noise in a speech detection system comprises a filter bank for separating source speech data into discrete frequency sub-bands to generate filtered channel energy, and a noise suppressor for weighting the frequency sub-bands to improve the signal-to-noise ratio of the resultant noise-suppressed channel energy. The noise suppressor preferably includes a subspace module for using a Karhunen-Loeve transformation to create a subspace based on the background noise, a projection module for generating projected channel energy by projecting the filtered channel energy onto the created subspace, and a weighting module for applying calculated weighting values to the projected channel energy to generate the noise-suppressed channel energy.
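The filter-bank stage described in the abstract can be illustrated with a minimal sketch. It assumes NumPy, an FFT-based filter bank, a Hann window, non-overlapping frames, and equal-width bands; none of these choices is stated in the source, and the function name `channel_energies` and its parameters are hypothetical.

```python
import numpy as np

def channel_energies(samples, frame_len=256, n_channels=8):
    """Return per-frame, per-channel energy with shape (n_frames, n_channels).

    Assumption: the filter bank is approximated by grouping FFT bins into
    contiguous, roughly equal-width bands. The source does not specify this.
    """
    samples = np.asarray(samples, dtype=float)
    n_frames = len(samples) // frame_len
    # Cut the signal into non-overlapping frames, dropping any remainder.
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Power spectrum of each Hann-windowed frame.
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    # Group the FFT bins into n_channels contiguous bands and sum each band.
    bands = np.array_split(np.arange(spectrum.shape[1]), n_channels)
    return np.stack([spectrum[:, b].sum(axis=1) for b in bands], axis=1)
```

Calling `channel_energies(x)` on a one-dimensional array of digitized audio yields one row of channel energies per frame, which is the kind of input the later projection and weighting stages would consume.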

16 Claims

1. A system for suppressing background noise in audio data, comprising:
- a detector configured to perform a manipulation process on said audio data, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, a projection module, and a weighting module, said noise suppressor including a subspace module for creating a subspace based upon said background noise, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
  
  a processor coupled to said system to control said detector and thereby suppress said background noise.
2. The system of claim 1 wherein said weighting module calculates a weighting value “w_i” for a channel “i” using a formula [formula not present in source];
3. The system of claim 1 wherein said weighting module calculates a weighting value “w_i” for a channel “i” using a formula [formula not present in source];
4. The system of claim 1 wherein said noise-suppressed channel energy “E_T” equals a summation of said projected channel energy from each of said discrete frequency channels “E_i” multiplied by a corresponding one of said weighting values “w_i”.
5. The system of claim 4 wherein said noise-suppressed channel energy “E_T” is defined by a formula [formula not present in source];
6. The system of claim 1 wherein an endpoint detector analyzes said noise-suppressed channel energy to generate an endpoint signal.
7. The system of claim 6 wherein a recognizer analyzes said endpoint signal and feature vectors from a feature extractor to generate a speech detection result for said speech detector.
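The weighted summation of claim 4 and the endpoint step of claim 6 can be sketched as follows. The exact weighting formulas of claims 2 and 3 are not reproduced in the source, so this sketch only assumes weights proportional to each channel's signal-to-noise ratio (speech amplitude divided by noise amplitude), normalized to sum to one. The fixed-threshold endpoint test and all function names are hypothetical.

```python
def snr_weights(speech_amp, noise_amp):
    """Weights proportional to per-channel SNR, normalized to sum to 1.

    Assumption: the normalization is illustrative; the claimed formulas
    for w_i are not available in the source.
    """
    snr = [s / n for s, n in zip(speech_amp, noise_amp)]
    total = sum(snr)
    return [r / total for r in snr]

def noise_suppressed_energy(projected, weights):
    """E_T: sum over channels i of w_i * E_i (the summation in claim 4)."""
    return sum(w * e for w, e in zip(weights, projected))

def endpoint_signal(frame_energies, threshold):
    """Mark frames whose E_T exceeds a fixed threshold (hypothetical rule)."""
    return [e > threshold for e in frame_energies]
```

Here `projected` holds the projected channel energies E_i for one frame; computing `noise_suppressed_energy` per frame and passing the results to `endpoint_signal` gives a per-frame speech/non-speech flag.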

8. A method for suppressing background noise in audio data, comprising the steps of:
- performing a manipulation process on said audio data using a detector, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, a projection module, and a weighting module, said noise suppressor including a subspace module for creating a subspace based upon said background noise, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
  
  controlling said detector with a processor to thereby suppress said background noise.
9. The method of claim 8 wherein said weighting module calculates a weighting value “w_i” for a channel “i” using a formula [formula not present in source];
10. The method of claim 8 wherein said weighting module calculates a weighting value “w_i” for a channel “i” using a formula [formula not present in source];
11. The method of claim 8 wherein said noise-suppressed channel energy “E_T” equals a summation of said projected channel energy from each of said discrete frequency channels “E_i” multiplied by a corresponding one of said weighting values “w_i”.
12. The method of claim 11 wherein said noise-suppressed channel energy “E_T” is defined by a formula [formula not present in source];
13. The method of claim 8 wherein an endpoint detector analyzes said noise-suppressed channel energy to generate an endpoint signal.
14. The method of claim 13 wherein a recognizer analyzes said endpoint signal and feature vectors from a feature extractor to generate a speech detection result for said speech detector.

15. A system for suppressing background noise in audio data, comprising:
- a detector configured to perform a manipulation process on said audio data, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, said noise suppressor including a subspace module, a projection module, and a weighting module, said subspace module creating a subspace based upon said background noise by using a Karhunen-Loeve transformation, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
  
  a processor coupled to said system to control said detector and thereby suppress said background noise.
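The Karhunen-Loeve step recited in claim 15 can be illustrated with a generic principal-components sketch. It assumes NumPy and assumes the subspace is spanned by the k leading eigenvectors of the covariance of background-noise channel energies; the claims do not say which components are retained, so that choice, along with the names `klt_subspace` and `project`, is hypothetical.

```python
import numpy as np

def klt_subspace(noise_energies, k):
    """Return (mean, basis) estimated from background-noise channel energies.

    noise_energies: array of shape (n_frames, n_channels).
    basis: (n_channels, k) matrix whose columns are the k eigenvectors of the
    noise covariance with the largest eigenvalues (an assumed selection rule).
    """
    noise_energies = np.asarray(noise_energies, dtype=float)
    mean = noise_energies.mean(axis=0)
    cov = np.cov(noise_energies - mean, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]
    return mean, eigvecs[:, order]

def project(channel_energy, mean, basis):
    """Coordinates of mean-removed channel energies in the subspace."""
    return (np.asarray(channel_energy, dtype=float) - mean) @ basis
```

A caller would estimate `mean` and `basis` once from frames known to contain only background noise, then apply `project` to the filtered channel energy of each incoming frame before weighting.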

16. A method for suppressing background noise in audio data, comprising the steps of:
- performing a manipulation process on said audio data using a detector, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, said noise suppressor including a subspace module, a projection module, and a weighting module, said subspace module creating a subspace based upon said background noise by using a Karhunen-Loeve transformation, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
  
  controlling said detector with a processor to thereby suppress said background noise.

Current Assignee
Sony Corporation (Sony Group Corp.), Sony Electronics Inc. (Sony Group Corp.)
Original Assignee
Sony Corporation (Sony Group Corp.), Sony Electronics Inc. (Sony Group Corp.)
Inventors
Amador-Hernandez, Mariscela; Wu, Duanpei; Tanaka, Miyuki
Primary Examiner(s)
Korzuch, William R.
Assistant Examiner(s)
Storm, Donald L.

Application Number
US09/176,178
Time in Patent Office
930 Days
Field of Search
704/233, 704/226, 704/227, 704/204
US Class Current
704/226
CPC Class Codes
G10L 21/0208 Noise filtering
G10L 21/0232 Processing in the frequency...
