System and method for multi-channel multi-feature speech/noise classification for noise suppression
First Claim
1. A method for noise estimation and filtering based on classifying an audio signal received at a noise suppression module via a plurality of input channels as speech or noise, the method comprising:
measuring signal classification features for a frame of the audio signal input from each of the plurality of input channels;
generating a feature-based speech probability for each of the measured signal classification features of each of the plurality of input channels;
generating, for each of the plurality of input channels, a speech probability for the input channel by combining the feature-based speech probabilities of the input channel using an additive model for a middle layer of a probabilistic layered network model;
generating a combined speech probability over the plurality of input channels using the speech probabilities of the input channels;
classifying the audio signal as speech or noise based on the combined speech probability; and
updating an initial noise estimate for each of the plurality of input channels using the combined speech probability.
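The steps recited in claim 1 can be pictured as a small layered pipeline. The following is a minimal sketch only, not the patented implementation. The sigmoid feature mapping, the thresholds, the normalized weighted sums, the 0.5 decision threshold, and the smoothing factor `gamma` are all assumptions introduced for illustration, as are all function names. The claim specifies only that the middle layer uses an additive model. It does not say how the channel probabilities are combined, so a weighted average is assumed here.

```python
import math


def feature_speech_probability(value, threshold, width=1.0):
    # Assumed mapping: a sigmoid centred on a hypothetical per-feature threshold.
    return 1.0 / (1.0 + math.exp(-width * (value - threshold)))


def channel_speech_probability(feature_probs, weights):
    # Additive model for the middle layer: normalized weighted sum of the
    # feature-based probabilities of one channel.
    total = float(sum(weights))
    return sum(w * p for w, p in zip(weights, feature_probs)) / total


def combined_speech_probability(channel_probs, channel_weights):
    # Assumed top layer: weighted average over the input channels.
    total = float(sum(channel_weights))
    return sum(w * p for w, p in zip(channel_weights, channel_probs)) / total


def classify(speech_prob, decision_threshold=0.5):
    # Hard decision from the combined probability (threshold is an assumption).
    return "speech" if speech_prob >= decision_threshold else "noise"


def update_noise_estimate(noise_prev, magnitude, speech_prob, gamma=0.9):
    # Soft-decision recursive update (assumed form): the estimate follows the
    # observed magnitude only to the extent the frame is likely noise.
    blended = speech_prob * noise_prev + (1.0 - speech_prob) * magnitude
    return gamma * noise_prev + (1.0 - gamma) * blended


def process_frame(channel_features, thresholds, feature_weights,
                  channel_weights, noise_estimates, magnitudes):
    # channel_features: one list of feature measurements per input channel.
    channel_probs = []
    for features in channel_features:
        probs = [feature_speech_probability(v, t)
                 for v, t in zip(features, thresholds)]
        channel_probs.append(channel_speech_probability(probs, feature_weights))
    combined = combined_speech_probability(channel_probs, channel_weights)
    label = classify(combined)
    # Every channel's noise estimate is updated with the same combined probability.
    new_noise = [update_noise_estimate(n, m, combined)
                 for n, m in zip(noise_estimates, magnitudes)]
    return combined, label, new_noise
```

In this sketch a single combined probability drives the update of every channel's noise estimate, mirroring the last step of the claim. The per-channel probabilities are used only as inputs to the top layer.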
Abstract
An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model. Individual feature data acquired at each channel and/or from a beam-formed signal is mapped to a speech probability, which is combined through layers of the model into a final speech/noise classification for use in noise estimation and filtering processes for noise suppression.
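To make the per-bin idea in the abstract concrete, here is a hedged sketch of how a speech probability for each frequency bin might steer a noise spectrum estimate, and how that estimate might then feed a suppression filter. The recursive update form, the smoothing factor `gamma`, the spectral-subtraction-style gain, and the gain `floor` are assumptions chosen for illustration. The abstract does not specify any of them, and the function names are hypothetical.

```python
def update_noise_spectrum(noise, magnitude, speech_prob, gamma=0.9):
    # Per-bin soft-decision update (assumed form): bins judged likely to be
    # speech keep their previous noise value, bins judged likely to be noise
    # move toward the observed magnitude.
    return [
        gamma * n + (1.0 - gamma) * (p * n + (1.0 - p) * m)
        for n, m, p in zip(noise, magnitude, speech_prob)
    ]


def suppression_gain(noise, magnitude, floor=0.1):
    # Assumed spectral-subtraction-style gain with a floor, applied per bin.
    gains = []
    for n, m in zip(noise, magnitude):
        if m <= 0.0:
            gains.append(floor)
        else:
            gains.append(max(floor, 1.0 - n / m))
    return gains


if __name__ == "__main__":
    # Two frequency bins: the first is judged speech, the second noise.
    noise = update_noise_spectrum([1.0, 1.0], [3.0, 3.0], [1.0, 0.0], gamma=0.5)
    print(noise)
    print(suppression_gain(noise, [3.0, 3.0]))
```

The speech probabilities are taken as given inputs here. In the described architecture they would come from the layered fusion of feature data across channels.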
16 Citations
39 Claims
Specification