Speech detection with noise suppression based on principal components analysis
First Claim
1. A system for suppressing background noise in audio data, comprising:
- a detector configured to perform a manipulation process on said audio data, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, a projection module, and a weighting module, said noise suppressor including a subspace module for creating a subspace based upon said background noise, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
a processor coupled to said system to control said detector and thereby suppress said background noise.
1 Assignment
0 Petitions
Accused Products
Abstract
A method for effectively suppressing background noise in a speech detection system comprises a filter bank for separating source speech data into discrete frequency sub-bands to generate filtered channel energy, and a noise suppressor for weighting the frequency sub-bands to improve the signal-to-noise ratio of the resultant noise-suppressed channel energy. The noise suppressor preferably includes a subspace module for using a Karhunen-Loeve transformation to create a subspace based on the background noise, a projection module for generating projected channel energy by projecting the filtered channel energy onto the created subspace, and a weighting module for applying calculated weighting values to the projected channel energy to generate the noise-suppressed channel energy.
-
Citations
16 Claims
-
1. A system for suppressing background noise in audio data, comprising:
-
a detector configured to perform a manipulation process on said audio data, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, a projection module, and a weighting module, said noise suppressor including a subspace module for creating a subspace based upon said background noise, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
a processor coupled to said system to control said detector and thereby suppress said background noise. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
3. The system of claim 1 wherein said weighting module calculates a weighting value “
- wi”
for a channel “
i”
using a formula;
- wi”
-
4. The system of claim 1 wherein said noise-suppressed channel energy “
- ET”
equals a summation of said projected channel energy from each of said discrete frequency channels “
Ei”
multiplied by a corresponding one of said weighting values “
wi”
.
- ET”
-
5. The system of claim 4 wherein said noise-suppressed channel energy “
- ET”
is defined by a formula;
- ET”
-
6. The system of claim 1 wherein an endpoint detector analyzes said noise-suppressed channel energy to generate an endpoint signal.
-
7. The system of claim 6 wherein a recognizer analyzes said endpoint signal and feature vectors from a feature extractor to generate a speech detection result for said speech detector.
-
8. A method for suppressing background noise in audio data, comprising the steps of:
-
performing a manipulation process on said audio data using a detector, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, a projection module, and a weighting module, said noise suppressor including a subspace module for creating a subspace based upon said background noise, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
controlling said detector with a processor to thereby suppress said background noise. - View Dependent Claims (9, 10, 11, 12, 13, 14)
-
-
10. The method of claim 8 wherein said weighting module calculates a weighting value “
- wi”
for a channel “
i”
using a formula;
- wi”
-
11. The method of claim 8 wherein said noise-suppressed channel energy “
- ET”
equals a summation of said projected channel energy from each of said discrete frequency channels “
Ei”
multiplied by a corresponding one of said weighting values “
wi”
.
- ET”
-
12. The method of claim 11 wherein said noise-suppressed channel energy “
- ET”
is defined by a formula;
- ET”
-
13. The method of claim 8 wherein an endpoint detector analyzes said noise-suppressed channel energy to generate an endpoint signal.
-
14. The method of claim 13 wherein a recognizer analyzes said endpoint signal and feature vectors from a feature extractor to generate a speech detection result for said speech detector.
-
15. A system for suppressing background noise in audio data, comprising:
-
a detector configured to perform a manipulation process on said audio data, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, said noise suppressor including a subspace module, a projection module, and a weighting module, said subspace module creating a subspace based upon said background noise by using a Karhunen-Loeve transformation, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
a processor coupled to said system to control said detector and thereby suppress said background noise.
-
-
16. A method for suppressing background noise in audio data, comprising the steps of:
-
performing a manipulation process on said audio data using a detector, said audio data including speech information, said detector including a speech detector configured to analyze and manipulate said speech information, wherein a first amplitude of said speech information is divided by a second amplitude of said background noise to generate a signal-to-noise ratio for said speech detector, said speech information including digital source speech data that is provided to said speech detector by an analog sound sensor and an analog-to-digital converter, wherein a filter bank generates filtered channel energy by separating said digital source speech data into discrete frequency channels, said speech detector comprising a noise suppressor, said noise suppressor including a subspace module, a projection module, and a weighting module, said subspace module creating a subspace based upon said background noise by using a Karhunen-Loeve transformation, said projection module generating projected channel energy by projecting said filtered channel energy onto said subspace, said weighting module generating noise-suppressed channel energy by applying separate weighting values to each of said discrete frequency channels of said projected channel energy, said separate weighting values being proportional to said signal-to-noise ratios of said discrete frequency channels; and
controlling said detector with a processor to thereby suppress said background noise.
-
Specification