Monaural noise suppression based on computational auditory scene analysis

US 8,447,596 B2
Filed: 08/20/2010
Issued: 05/21/2013
Est. Priority Date: 07/12/2010
Status: Active Grant

Abstract

The present technology provides a robust noise suppression system that may concurrently reduce noise and echo components in an acoustic signal while limiting the level of speech distortion. An acoustic signal may be received and transformed to cochlear-domain sub-band signals. Features, such as pitch, may be identified and tracked within the sub-band signals. Initial speech and noise models may then be estimated, at least in part, from a probability analysis based on the tracked pitch sources. Speech and noise models may be resolved from the initial speech and noise models, and noise reduction may be performed on the sub-band signals. An acoustic signal may be reconstructed from the noise-reduced sub-band signals.
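
The first stage named in the abstract, transforming the acoustic signal into sub-band signals, might be sketched as follows. The cochlear-domain transform itself is not described on this page, so the sketch substitutes a plain framed DFT as a stand-in; `subband_frames`, `frame_len`, and `hop` are hypothetical names chosen for illustration.

```python
import cmath
import math


def subband_frames(signal, frame_len=8, hop=4):
    """Split a real signal into overlapping frames and return, per frame,
    the complex DFT coefficients for bins 0..frame_len/2.

    Assumption: a DFT stands in for the cochlear transform, which is not
    specified in the text above.
    """
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        bands = []
        for k in range(frame_len // 2 + 1):
            acc = 0j
            for n, x in enumerate(frame):
                acc += x * cmath.exp(-2j * math.pi * k * n / frame_len)
            bands.append(acc)
        frames.append(bands)
    return frames


if __name__ == "__main__":
    # A tone with 2 cycles per 8 samples concentrates its energy in band 2.
    tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
    for bands in subband_frames(tone):
        print([round(abs(b), 3) for b in bands])
```

Each inner list holds one frame's sub-band values; later stages (pitch tracking, model estimation, filtering) would operate on these per-frame, per-band samples.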

20 Claims

1. A method for performing noise reduction, the method comprising:
- executing a program stored in a memory to transform a time-domain acoustic signal into a plurality of frequency-domain sub-band signals;
  
  tracking multiple pitched sources within a sub-band signal in the plurality of sub-band signals, the tracking including:
    - calculating transition probabilities for associations of existing pitch tracks to new pitch candidates,
    - determining a largest of the transition probabilities, and
    - forming associations between the existing pitch tracks and the new pitch candidates according to the largest of the transition probabilities;
  
  generating a speech model and one or more noise models based on the tracked pitch sources; and
  
  performing noise reduction on the sub-band signal based on the speech model and the one or more noise models.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The method of claim 1, wherein tracking includes tracking the multiple pitched sources across successive frames of the sub-band signal.
  - 3. The method of claim 1, wherein tracking includes:
    - calculating at least one feature for each pitched source in the multiple pitched sources; and
      
      determining a probability for each pitched source that the pitched source is a speech source.
  - 4. The method of claim 3, wherein the probability is based at least in part on pitch energy level, pitch salience, and pitch stationarity.
  - 5. The method of claim 1, further comprising generating a speech model and a noise model from the multiple pitch tracks.
  - 6. The method of claim 1, wherein generating a speech model and one or more noise models includes combining the multiple models.
  - 7. The method of claim 1, wherein a noise model is not updated for a sub-band in a current frame when speech is dominant in a previous frame or is not updated in the current frame when speech is dominant in the current frame for the sub-band.
  - 8. The method of claim 1, wherein noise reduction is performed using an optimal filter.
  - 9. The method of claim 8, wherein the optimal filter is based on a least squares formulation.
  - 10. The method of claim 1, wherein transforming the acoustic signal includes performing a fast cochlea transformation after delaying the acoustic signal.
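
The tracking step of claim 1 (score the associations of existing pitch tracks to new candidates, take the largest transition probability, form that association) might be sketched as a greedy matcher. The claim does not say how the transition probability is computed, so the Gaussian model over log-frequency distance below is an assumption, as are the names `transition_probability`, `associate`, `sigma_octaves`, and `min_prob`.

```python
import math


def transition_probability(track_hz, candidate_hz, sigma_octaves=0.1):
    """Hypothetical transition model: Gaussian in log2-frequency distance.

    Assumption: the claim does not define this function.
    """
    distance = math.log2(candidate_hz / track_hz)
    return math.exp(-0.5 * (distance / sigma_octaves) ** 2)


def associate(tracks_hz, candidates_hz, min_prob=0.05):
    """Greedily pair tracks with candidates, largest probability first.

    Returns a dict mapping track index -> candidate index. Tracks or
    candidates whose best remaining score is below min_prob stay unpaired.
    """
    scored = [
        (transition_probability(t, c), ti, ci)
        for ti, t in enumerate(tracks_hz)
        for ci, c in enumerate(candidates_hz)
    ]
    scored.sort(reverse=True)  # largest transition probability first

    assigned = {}
    used_candidates = set()
    for prob, ti, ci in scored:
        if prob < min_prob:
            break  # everything after this is less likely still
        if ti in assigned or ci in used_candidates:
            continue
        assigned[ti] = ci
        used_candidates.add(ci)
    return assigned


if __name__ == "__main__":
    # Two existing tracks near 100 Hz and 200 Hz; three new candidates.
    print(associate([100.0, 200.0], [205.0, 98.0, 400.0]))
```

In this sketch an unpaired candidate (400 Hz here) would be a natural seed for a new track, and an unpaired track a candidate for termination; neither policy is stated in the claims.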

11. A system for performing noise reduction in an audio signal, the system comprising:
- a memory;
  
  an analysis module stored in the memory and executed by a processor to transform a time-domain acoustic signal to frequency-domain sub-band signals;
  
  a source inference engine stored in the memory and executed by a processor to track multiple sources of pitch within the sub-band signals and to generate a speech model and one or more noise models based on the tracked pitch sources, the tracking including:
    - calculating transition probabilities for associations of existing pitch tracks to new pitch candidates,
    - determining a largest of the transition probabilities, and
    - forming associations between the existing pitch tracks and the new pitch candidates according to the largest of the transition probabilities; and
  
  a modifier module stored in the memory and executed by a processor to perform noise reduction on the sub-band signals based on the speech model and one or more noise models.
- Dependent claims (12, 13, 14, 15, 16)
  - 12. The system of claim 11, the source inference engine executable to calculate at least one feature for each pitch source and determine a probability for each speech source that the speech source is the speech.
  - 13. The system of claim 11, the source inference engine executable to generate a speech model and a noise model from the pitch tracks.
  - 14. The system of claim 11, the source inference engine executable to not update a noise model for a sub-band in a current frame when speech is dominant in a previous frame or not update a noise model for a sub-band in a current frame when speech is dominant in the current frame for the sub-band.
  - 15. The system of claim 11, the modifier module executable to apply a first-order filter to each sub-band in each frame.
  - 16. The system of claim 11, the analysis module executable to convert the acoustic signal by performing a fast cochlea transformation after delaying the acoustic signal.
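
The update rule in claim 14 (and claim 7) freezes the noise model for a sub-band whenever speech is dominant in that sub-band in the previous or the current frame. A minimal sketch of that gating follows. The exponential smoothing used when the update is allowed is an assumption, since the claims state only when the model is not updated; `update_noise_model` and `alpha` are hypothetical names.

```python
def update_noise_model(noise, frame_energy, speech_now, speech_prev, alpha=0.9):
    """Return the per-sub-band noise estimate for the current frame.

    noise        -- previous noise energy estimate, one value per sub-band
    frame_energy -- observed energy in the current frame, per sub-band
    speech_now   -- True where speech is dominant in the current frame
    speech_prev  -- True where speech was dominant in the previous frame
    alpha        -- smoothing factor (assumption: not given in the claims)
    """
    updated = []
    for n, e, now, prev in zip(noise, frame_energy, speech_now, speech_prev):
        if now or prev:
            # Speech dominant in the current or previous frame: hold the estimate.
            updated.append(n)
        else:
            # No speech dominance: track the observed energy slowly.
            updated.append(alpha * n + (1.0 - alpha) * e)
    return updated


if __name__ == "__main__":
    print(update_noise_model(
        noise=[1.0, 1.0, 1.0],
        frame_energy=[2.0, 2.0, 2.0],
        speech_now=[True, False, False],
        speech_prev=[False, True, False],
        alpha=0.5,
    ))
```

Only the third sub-band moves toward the observed energy; the first two are held because speech dominates in the current or the previous frame.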

17. A non-transitory computer readable storage medium having embodied thereon a program, the program being executable by a processor to perform a method for reducing noise in an audio signal, the method comprising:
- transforming an acoustic signal from a time-domain signal to frequency-domain sub-band signals;
  
  tracking multiple sources of pitch within the sub-band signals, the tracking including:
    - calculating transition probabilities for associations of existing pitch tracks to new pitch candidates,
    - determining a largest of the transition probabilities, and
    - forming associations between the existing pitch tracks and the new pitch candidates according to the largest of the transition probabilities;
  
  generating a speech model and one or more noise models based on the tracked pitch sources; and
  
  performing noise reduction on the sub-band signals based on the speech model and one or more noise models.
- Dependent claims (18, 19, 20)
  - 18. The non-transitory computer readable storage medium of claim 17, wherein tracking includes tracking multiple pitch sources across successive frames of the sub-band signals.
  - 19. The non-transitory computer readable storage medium of claim 17, wherein a noise model is not generated for a sub-band in a current frame when speech is dominant in a previous frame for the sub-band or the noise model is not generated for a sub-band in a current frame when speech is dominant in the current frame for the sub-band.
  - 20. The non-transitory computer readable storage medium of claim 17, wherein performing noise reduction includes applying a first-order filter to each sub-band signal.
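
Claims 15 and 20 recite applying a first-order filter to each sub-band. One way to picture this is a two-tap FIR filter running along the frames of a single sub-band. The claims do not say how the coefficients are derived from the speech and noise models, so the Wiener-style gain below is an assumption used only to produce plausible coefficients; `apply_first_order_filter`, `wiener_gain`, and `floor` are hypothetical names.

```python
def apply_first_order_filter(subband, b0, b1):
    """First-order FIR along time: y[t] = b0 * x[t] + b1 * x[t-1], x[-1] = 0."""
    out = []
    previous = 0j
    for x in subband:
        out.append(b0 * x + b1 * previous)
        previous = x
    return out


def wiener_gain(speech_energy, noise_energy, floor=0.1):
    """Assumed gain rule: speech / (speech + noise), limited from below.

    The floor bounds the attenuation, which is one common way to limit
    speech distortion; it is not taken from the claims.
    """
    total = speech_energy + noise_energy
    gain = speech_energy / total if total > 0 else floor
    return max(gain, floor)


if __name__ == "__main__":
    samples = [1 + 0j, 2 + 0j, 3 + 0j]  # one sub-band over three frames
    gain = wiener_gain(speech_energy=3.0, noise_energy=1.0)
    # Simplest case: put the whole gain on the current sample (b1 = 0).
    print(apply_first_order_filter(samples, b0=gain, b1=0.0))
```

Setting `b1` to a nonzero value lets the filter use the previous frame's sample as well, which is what distinguishes a first-order filter from a plain per-band gain.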

Current Assignee: Samsung Electronics Co. Ltd.
Original Assignee: Audience Corporation
Inventors: Avendano, Carlos; Laroche, Jean; Goodwin, Michael M.; Solbach, Ludger
Primary Examiner(s): Godbold, Douglas
Application Number: US12/860,043
Publication Number: US 20120010881A1
Time in Patent Office: 1,005 Days
Field of Search: 704/224-230
US Class Current: 704/226
CPC Class Codes:
- G10L 21/0208 Noise filtering
- G10L 21/0272 Voice signal separating
