System and method for multi-channel multi-feature speech/noise classification for noise suppression

US 8,428,946 B1
Filed: 07/06/2012
Issued: 04/23/2013
Est. Priority Date: 07/28/2011
Status: Active Grant

Abstract

An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model. Individual feature data acquired at each channel and/or from a beam-formed signal is mapped to a speech probability, which is combined through layers of the model into a final speech/noise classification for use in noise estimation and filtering processes for noise suppression.
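The abstract says the noise spectrum is estimated per frequency and time bin from a speech/noise probability, but this record does not give the update rule. The sketch below shows one common way such a probability can drive a recursive noise estimate. The function name `update_noise_estimate`, the `smoothing` parameter, and the update formula itself are assumptions for illustration, not taken from the patent text.

```python
def update_noise_estimate(prev_noise, frame_power, speech_prob, smoothing=0.9):
    """Update a per-bin noise power estimate for one frame.

    prev_noise, frame_power and speech_prob hold one value per frequency bin.
    The recursion used here is an assumed, illustrative rule.
    """
    if not (len(prev_noise) == len(frame_power) == len(speech_prob)):
        raise ValueError("all inputs must have one value per frequency bin")

    updated = []
    for n_prev, y_pow, p in zip(prev_noise, frame_power, speech_prob):
        # If the bin is probably speech (p near 1), keep the old noise value;
        # if it is probably noise (p near 0), move toward the observed power.
        target = p * n_prev + (1.0 - p) * y_pow
        # Smooth over time so a single frame cannot swing the estimate.
        updated.append(smoothing * n_prev + (1.0 - smoothing) * target)
    return updated
```

With `speech_prob` equal to 1 in a bin the estimate is left unchanged, so speech energy does not leak into the noise spectrum used for filtering.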

20 Claims

1. A computer-implemented architecture for classifying an audio signal received at a multi-channel noise suppression system as speech or noise, the architecture comprising:
- a first layer for generating a feature-based speech probability for each of a plurality of signal classification features measured for a frame of the signal input from each of a plurality of input channels;
  
  a second layer for generating, for each of the plurality of input channels, a speech probability for the input channel by combining the feature-based speech probabilities of the input channel; and
  
  a third layer for generating a combined speech probability for the frame of the signal using the speech probabilities of the plurality of input channels, wherein the layers comprise a probabilistic layered network model and an additive model or a multiplicative model is used for the third layer of the probabilistic layered network model.
- Dependent claims (2-10):
  - 2. The computer-implemented architecture of claim 1, wherein the probabilistic layered network model is a Bayesian network model.
  - 3. The computer-implemented architecture of claim 1, wherein an additive model is used for the second layer of the probabilistic layered network model.
  - 4. The computer-implemented architecture of claim 1, wherein a multiplicative model is used for the second layer of the probabilistic layered network model.
  - 5. The computer-implemented architecture of claim 1, wherein the speech probability generated for each of the input channels denotes a probability of a class state of speech or noise for a layer of the probabilistic layered network model.
  - 6. The computer-implemented architecture of claim 1, wherein the feature-based speech probability generated for each of the measured signal classification features denotes a probability of a class state of speech or noise for a layer of the probabilistic layered network model.
  - 7. The computer-implemented architecture of claim 1, wherein the plurality of measured signal classification features from the plurality of input channels are input data to the probabilistic layered network model.
  - 8. The computer-implemented architecture of claim 1, wherein the combined speech probability is an output of the probabilistic layered network model.
  - 9. The computer-implemented architecture of claim 1, wherein one or both of the first layer and the second layer includes a set of intermediate states each denoting a class state of speech or noise.
  - 10. The computer-implemented architecture of claim 1, wherein the feature-based speech probability is a function of the measured signal classification feature, and wherein the speech probability for each of the plurality of input channels is a function of the feature-based speech probabilities for the input channel.
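The three layers recited in claim 1 and its dependent claims can be sketched as a small pipeline: feature to probability, features to channel, channels to frame. The claims name additive and multiplicative models but do not give formulas in this record, so everything concrete below is an assumption: the logistic mapping in `feature_speech_prob`, the weighted-average form of `combine_additive`, the normalised-product form of `combine_multiplicative`, and the feature names and parameters passed to `classify_frame`.

```python
import math


def feature_speech_prob(value, threshold, width):
    """Layer 1 (assumed form): logistic map from a feature value to P(speech)."""
    return 1.0 / (1.0 + math.exp(-(value - threshold) / width))


def combine_additive(probs, weights=None):
    """Additive model (assumed form): weighted average of speech probabilities."""
    if weights is None:
        weights = [1.0] * len(probs)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, probs)) / total


def combine_multiplicative(probs):
    """Multiplicative model (assumed form): product of speech probabilities,
    normalised against the product of the matching noise probabilities."""
    speech = math.prod(probs)
    noise = math.prod(1.0 - p for p in probs)
    if speech + noise == 0.0:
        return 0.5  # inputs contradict each other completely; stay undecided
    return speech / (speech + noise)


def classify_frame(channel_features, params, top="additive"):
    """Run all three layers for one frame.

    channel_features: one dict per input channel, feature name -> measured value.
    params: feature name -> (threshold, width), hypothetical tuning values.
    """
    channel_probs = []
    for features in channel_features:
        # Layer 1: one speech probability per measured feature.
        layer1 = [feature_speech_prob(v, *params[name]) for name, v in features.items()]
        # Layer 2: one speech probability per channel (additive, as in claim 3).
        channel_probs.append(combine_additive(layer1))
    # Layer 3: one combined speech probability for the frame.
    if top == "additive":
        return combine_additive(channel_probs)
    return combine_multiplicative(channel_probs)
```

In this sketch the multiplicative form sharpens agreement between channels: two channels at 0.9 combine to roughly 0.99, while the additive form would return 0.9.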

11. A multi-channel noise suppression system comprising:
- a plurality of input channels; and
  
  a noise suppression module configured to:
  
  measure signal classification features for an audio signal frame input from each of the plurality of input channels;
  
  calculate a feature-based speech probability for each of the measured signal classification features of each of the plurality of input channels;
  
  generate a speech probability for each of the plurality of input channels by combining the feature-based speech probabilities of the input channel; and
  
  generate a combined speech probability for the audio signal frame using at least one of the speech probabilities of the plurality of input channels and an additive model for a top layer of a probabilistic layered network model.
- Dependent claims (12-19):
  - 12. The noise suppression system of claim 11, wherein the noise suppression module is further configured to update an initial noise estimate for each of the plurality of input channels using the combined speech probability.
  - 13. The noise suppression system of claim 11, wherein the noise suppression module is further configured to:
    - combine the audio signal frames input from the plurality of input channels;
      
      measure at least one signal classification feature of the combined frames;
      
      calculate a feature-based speech probability for the combined frames using the at least one measured signal classification feature; and
      
      combine the feature-based speech probability for the combined frames with the speech probabilities generated for each of the plurality of input channels.
  - 14. The noise suppression system of claim 13, wherein the noise suppression module is further configured to combine the audio signal frames input from the plurality of input channels using beam-forming on the audio signal frames from the channels.
  - 15. The noise suppression system of claim 11, wherein the noise suppression module is further configured to generate the combined speech probability using a multiplicative model for the top layer of the probabilistic layered network model.
  - 16. The noise suppression system of claim 11, wherein the noise suppression module is further configured to, for each of the plurality of input channels, combine the feature-based speech probabilities of the input channel using an additive model for a middle layer of a probabilistic layered network model.
  - 17. The noise suppression system of claim 11, wherein each of the plurality of input channels is configured to receive either audio signals comprising noise and speech, or audio signals comprising only noise.
  - 18. The noise suppression system of claim 17, wherein the noise suppression module is further configured to generate a combined speech probability using the speech probabilities of the input channels configured to receive audio signals comprising noise and speech.
  - 19. The noise suppression system of claim 11, wherein the noise suppression module is further configured to:
    - assign one or more weighting terms to the speech probabilities of the plurality of input channels, the one or more weighting terms being assigned based on one or more conditions; and
      
      generate the combined speech probability using the speech probabilities of the plurality of input channels with the one or more weighting terms assigned.
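Claims 13, 14, 18 and 19 add two refinements to the top layer: a speech probability from a beam-formed combination of the channel frames, and weighting terms on the channel probabilities. A minimal sketch follows, assuming an integer-delay delay-and-sum beamformer and a weighted average at the top layer. The helper names `delay_and_sum` and `weighted_speech_prob`, the delay values, and the weights are all hypothetical; the claims do not specify the beam-forming method or how weights are chosen.

```python
def delay_and_sum(frames, delays):
    """Combine time-domain frames from several channels (assumed beamformer).

    frames: one list of samples per channel, all the same length.
    delays: integer sample delay applied to each channel before averaging;
            samples shifted in from outside the frame count as zero.
    """
    n = len(frames[0])
    out = [0.0] * n
    for frame, delay in zip(frames, delays):
        for i in range(n):
            j = i - delay
            if 0 <= j < n:
                out[i] += frame[j]
    return [x / len(frames) for x in out]


def weighted_speech_prob(channel_probs, beam_prob, channel_weights, beam_weight):
    """Top layer (assumed additive form) with per-input weighting terms.

    A weight of zero removes a channel from the decision, for example a
    reference microphone that only picks up noise (claims 17 and 18).
    """
    total = sum(channel_weights) + beam_weight
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    weighted = sum(w * p for w, p in zip(channel_weights, channel_probs))
    return (weighted + beam_weight * beam_prob) / total
```

For example, `weighted_speech_prob([0.8, 0.1], 0.9, [1.0, 0.0], 1.0)` ignores the second channel and averages the first channel with the beam-formed input, giving about 0.85.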

20. A method for classifying an audio signal received at a noise suppression module via a plurality of input channels as speech or noise, the method comprising:
- measuring, for each of the plurality of channels, signal classification features for a frame of the signal input from the channel;
  
  determining, for each of the measured signal classification features of each of the plurality of channels, a first classification state for the signal based on the measured signal classification feature;
  
  determining, for each of the plurality of channels, a second classification state for the signal by combining the first classification states of the channel using a probabilistic layered network model with an additive model as a top layer; and
  
  classifying the signal as speech or noise based on the second classification states of the plurality of channels.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Paniconi, Marco
Primary Examiner(s): Smits, Talivaldis Ivars
Assistant Examiner(s): Roberts, Shaun A
Application Number: US13/543,460
Time in Patent Office: 291 Days
Field of Search: 704/202, 704/226, 704/232, 704/233, 381/71.1, 381/94.1
US Class Current: 704/233
CPC Class Codes:
- G10L 2021/02166   Microphone arrays; Beamforming
- G10L 21/0216   characterised by the method...
- G10L 21/0232   Processing in the frequency...
- G10L 25/84   for discriminating voice fr...
