Cross-domain processing for noise and echo suppression

US 9,020,144 B1
Filed: 03/13/2013
Issued: 04/28/2015
Est. Priority Date: 03/13/2013
Status: Active Grant


Abstract

An audio-based system may perform noise and echo suppression by initially processing an audio signal that is subject to acoustic echo or echo resulting from other system characteristics. The audio signal is processed in the time domain using an adaptive echo-cancellation filter. The audio is then further processed in the frequency domain to simultaneously reduce background noise and residual echo.
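The time-domain stage described in the abstract can be illustrated with a small sketch. The abstract only says that an adaptive echo-cancellation filter is used; the normalized LMS (NLMS) update below, along with the function name `nlms_echo_cancel` and the parameters `taps`, `mu`, and `eps`, are assumptions chosen for illustration and are not taken from the patent.

```python
def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Illustrative NLMS echo canceller (assumed algorithm, not from the patent).

    far_end: samples sent to the speaker (the output audio signal)
    mic:     samples captured by the microphone (the input audio signal)
    Returns (estimated_echo, echo_suppressed) as two lists of samples.
    """
    w = [0.0] * taps          # adaptive FIR coefficients
    history = [0.0] * taps    # most recent far-end samples, newest first
    echo_est, residual = [], []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        # Estimate the echo by filtering the far-end signal.
        y = sum(wi * hi for wi, hi in zip(w, history))
        # Subtract the estimate to get the echo-suppressed sample.
        e = d - y
        # Normalized LMS update, driven by the echo-suppressed sample.
        norm = sum(h * h for h in history) + eps
        w = [wi + mu * e * hi / norm for wi, hi in zip(w, history)]
        echo_est.append(y)
        residual.append(e)
    return echo_est, residual
```

Both returned signals matter for the later stage: the echo-suppressed signal is what gets cleaned up further, and the estimated echo is what the frequency-domain stage can use to judge how much residual echo remains.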

19 Claims

1. A computing device, comprising:
- a processor;
  
  a microphone;
  
  a speaker configured to render audio based at least in part on an output audio signal; and
  
  memory, accessible by the processor and storing instructions that are executable by the processor to perform acts comprising:
  
  receiving an input audio signal from the microphone, wherein the input audio signal includes an echo component resulting from the rendered audio;
  
  estimating an echo signal corresponding to the echo component of the input audio signal based at least in part on the output audio signal;
  
  subtracting the estimated echo signal from the input audio signal to produce an echo-suppressed audio signal;
  
  calculating a frequency-domain representation of the echo-suppressed audio signal;
  
  calculating a frequency-domain representation of the estimated echo signal;
  
  estimating noise values corresponding to different frequencies of the echo-suppressed audio signal based at least in part on the calculated frequency-domain representation of the echo-suppressed audio signal;
  
  calculating gain values corresponding respectively to the different frequencies based at least in part on (a) the estimated noise values of the echo-suppressed audio signal and (b) the calculated frequency-domain representation of the estimated echo signal;
  
  adjusting the frequency-domain representation of the echo-suppressed audio signal at the different frequencies in accordance with the calculated gain values corresponding to the different frequencies to produce an adjusted frequency-domain representation of the echo-suppressed audio signal; and
  
  producing a processed audio signal based at least in part on the adjusted frequency-domain representation of the echo-suppressed audio signal.
- - 2. The computing device of claim 1, the acts further comprising:
    - calculating signal-to-noise values corresponding respectively to the different frequencies based at least in part on the calculated frequency-domain representation of the echo-suppressed audio signal and the estimated noise values of the echo-suppressed audio signal;
      
      calculating signal-to-echo values corresponding respectively to the different frequencies based at least in part on the calculated frequency-domain representation of the estimated echo signal and the calculated frequency-domain representation of the echo-suppressed audio signal; and
      
      wherein calculating the gain values is based at least in part on (a) the calculated signal-to-noise values and (b) the calculated signal-to-echo values.
  - 3. The computing device of claim 1, wherein calculating the gain values comprises:
    - calculating a first set of gain values corresponding respectively to the different frequencies based at least in part on the estimated noise values and the calculated frequency-domain representation of the echo-suppressed audio signal;
      
      calculating a second set of gain values corresponding respectively to the different frequencies based at least in part on the calculated frequency-domain representation of the estimated echo signal and the calculated frequency-domain representation of the echo-suppressed audio signal; and
      
      summing the first and second sets of gain values at each of the different frequencies.
  - 4. The computing device of claim 1, wherein estimating the echo signal comprises filtering the output audio signal with an adaptive finite impulse response filter, wherein the adaptive finite impulse response filter has filter coefficients that are dynamically adjusted based at least in part on the echo-suppressed audio signal.
  - 5. The computing device of claim 1, wherein:
    - the calculated frequency-domain representation of the echo-suppressed audio signal indicates values of the echo-suppressed audio signal corresponding to the different frequencies; and
      
      the calculated frequency-domain representation of the estimated echo signal indicates values of the estimated audio signal corresponding respectively to the different frequencies.
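As a rough illustration of the frequency-domain acts in claims 1 and 2, the sketch below computes per-frequency signal-to-noise and signal-to-echo values and turns them into gains. The claims do not state a gain formula; the Wiener-style rule, the gain `floor`, the naive DFT, and all function names here are assumptions made for the example.

```python
import cmath

def dft(frame):
    """Naive O(n^2) discrete Fourier transform of a list of samples."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse of dft(); returns the real part of each time-domain sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def suppression_gains(resid_spec, echo_spec, noise_power, floor=0.1, eps=1e-12):
    """Per-bin gains from SNR and SER (assumed Wiener-style rule)."""
    gains = []
    for e_bin, y_bin, n_pow in zip(resid_spec, echo_spec, noise_power):
        p = abs(e_bin) ** 2 + eps
        snr = p / (n_pow + eps)               # signal-to-noise value
        ser = p / (abs(y_bin) ** 2 + eps)     # signal-to-echo value
        # Assumption: noise and echo are treated as additive interference.
        g = 1.0 / (1.0 + 1.0 / snr + 1.0 / ser)
        gains.append(max(floor, g))
    return gains

def process_frame(resid_frame, echo_frame, noise_power):
    """Apply the gains to one frame and return the processed samples."""
    resid_spec = dft(resid_frame)
    echo_spec = dft(echo_frame)
    gains = suppression_gains(resid_spec, echo_spec, noise_power)
    adjusted = [g * e_bin for g, e_bin in zip(gains, resid_spec)]
    return idft(adjusted)
```

In this sketch a bin dominated by noise or by estimated echo is attenuated down to the floor, while a bin with little interference passes through almost unchanged. A practical system would likely use an FFT with windowed, overlapping frames rather than the naive transform shown here.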

6. A method, comprising:
- estimating an echo signal corresponding to an echo component of an input audio signal;
  
  subtracting the estimated echo signal from the input audio signal to produce an echo-suppressed audio signal;
  
  calculating a frequency-domain representation of the echo-suppressed audio signal;
  
  calculating a frequency-domain representation of the estimated echo signal;
  
  estimating noise values corresponding to different frequencies of the echo-suppressed audio signal based at least in part on the calculated frequency-domain representation of the echo-suppressed audio signal;
  
  calculating gain values corresponding respectively to the different frequencies based at least in part on (a) the estimated noise values of the echo-suppressed audio signal and (b) the calculated frequency-domain representation of the estimated echo;
  
  processing the frequency-domain representation of the echo-suppressed audio signal at the different frequencies in accordance with the calculated gain values to reduce noise and residual echo components of the echo-suppressed audio signal; and
  
  producing a transmit audio signal based at least in part on the processed frequency-domain representation of the echo-suppressed audio signal.
- - 7. The method of claim 6, wherein estimating the echo signal comprises filtering a received output audio signal with an adaptive finite impulse response filter, wherein the adaptive finite impulse response filter has filter coefficients that are dynamically adjusted based at least in part on the echo-suppressed audio signal.
  - 8. The method of claim 6, wherein:
    - the frequency-domain representation of the estimated echo comprises echo values corresponding respectively to the different frequencies; and
      
      calculating the gain values comprises summing the estimated noise values and the echo values in proportion to relative contributions of near-end audio and far-end audio to interference in the echo-suppressed audio signal.
  - 9. The method of claim 6, wherein the processing comprises:
    - calculating signal-to-noise values corresponding respectively to the different frequencies based at least in part on the calculated frequency-domain representation of the echo-suppressed audio signal and the estimated noise values of the echo-suppressed audio signal; and
      
      calculating signal-to-echo values corresponding respectively to the different frequencies based at least in part on the calculated frequency-domain representation of the estimated echo signal and the calculated frequency-domain representation of the echo-suppressed audio signal;
      
      wherein the calculating gain values corresponding respectively to the different frequencies is further based at least in part on (a) the calculated signal-to-noise values and (b) the calculated signal-to-echo values.
  - 10. The method of claim 9, wherein calculating the gain values comprises summing the calculated signal-to-noise values and the calculated signal-to-echo values at each of the different frequencies in proportion to relative contributions of near-end and far-end audio to interference in the echo-suppressed audio signal.
  - 11. The method of claim 6, wherein calculating the gain values comprises:
    - calculating a first set of gain values corresponding respectively to the different frequencies based at least in part on the estimated noise values and the calculated frequency-domain representation of the echo-suppressed audio signal; and
      
      calculating a second set of gain values corresponding respectively to the different frequencies based at least in part on the calculated frequency-domain representation of the estimated echo signal and the calculated frequency-domain representation of the echo-suppressed audio signal.
  - 12. The method of claim 11, wherein calculating the gain values comprises summing the first and second sets of gain values at each of the different frequencies based in proportion to relative contributions of near-end and far-end audio to interference in the echo-suppressed audio signal.
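Claims 11 and 12 describe two sets of gain values that are summed in proportion to the relative contributions of near-end and far-end audio. One possible reading is sketched below. The claims do not say how those contributions are measured, so the weighting by total estimated noise power against total estimated echo power, the per-set gain formula, and the name `combine_gains` are all assumptions.

```python
def combine_gains(resid_power, noise_power, echo_power, eps=1e-12):
    """Weighted sum of a noise-based and an echo-based gain set.

    All inputs are per-frequency power values for one frame.
    The weights are an assumed measure of how much of the interference
    comes from near-end noise versus far-end echo.
    """
    total_noise = sum(noise_power)
    total_echo = sum(echo_power)
    w_noise = total_noise / (total_noise + total_echo + eps)
    w_echo = 1.0 - w_noise
    combined = []
    for p, n, y in zip(resid_power, noise_power, echo_power):
        g_noise = p / (p + n + eps)   # first set: based on noise values
        g_echo = p / (p + y + eps)    # second set: based on echo values
        combined.append(w_noise * g_noise + w_echo * g_echo)
    return combined
```

With this weighting, a frame whose interference is mostly background noise is governed mainly by the noise-based gains, and a frame dominated by far-end echo is governed mainly by the echo-based gains.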

13. One or more non-transitory computer-readable media storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising:
- processing an audio signal by linear adaptive filtering to reduce an echo component of the audio signal; and
  
  processing the audio signal in a frequency domain by applying gain values to a frequency-domain representation of the audio signal, wherein the gain values are based at least in part on noise components of the audio signal over a spectrum of frequencies and estimated echo values of the audio signal over the spectrum of frequencies, the processing the audio signals in the frequency domain to (a) reduce noise in the audio signal and (b) further reduce the echo component in the audio signal.
- - 14. The one or more non-transitory computer-readable media of claim 13, wherein processing the audio signal comprises:
    - applying an adaptive filter to the audio signal to estimate the echo component; and
      
      subtracting the echo component from the audio signal.
  - 15. The one or more non-transitory computer-readable media of claim 13, wherein processing the audio signal in the frequency domain is based at least in part on estimating noise components of the audio signal over a spectrum of frequencies.
  - 16. The one or more non-transitory computer-readable media of claim 13, wherein processing the audio signal in the frequency domain is based at least in part on evaluating power values of the audio signal over a spectrum of frequencies.
  - 17. The one or more non-transitory computer-readable media of claim 13, wherein processing the audio signal in the frequency domain is based at least in part on estimating values of the echo component over a spectrum of frequencies.
  - 18. The one or more non-transitory computer-readable media of claim 13, wherein processing the audio signal in the frequency domain is based at least in part on evaluating noise-to-signal values of the audio signal over a spectrum of frequencies.
  - 19. The one or more non-transitory computer-readable media of claim 13, wherein processing the audio signal in the frequency domain is based at least in part on evaluating signal-to-echo values of the audio signal and the echo component over a spectrum of frequencies.
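Claims 15 and 16 refer to estimating noise components and evaluating power values over a spectrum of frequencies without naming an estimator. The sketch below shows one simple possibility, a per-bin noise floor that follows smoothed power downward at once and upward only slowly. The approach, the name `track_noise`, and the parameters `smooth` and `growth` are assumptions for illustration.

```python
def track_noise(power_frames, smooth=0.8, growth=1.02):
    """Estimate a per-bin noise floor from a sequence of power spectra.

    power_frames: list of frames, each a list of |X[k]|^2 values.
    Returns the noise estimate per bin after the last frame.
    """
    smoothed = list(power_frames[0])
    noise = list(power_frames[0])
    for frame in power_frames[1:]:
        for k, p in enumerate(frame):
            # Recursive smoothing of the power in each bin.
            smoothed[k] = smooth * smoothed[k] + (1.0 - smooth) * p
            if smoothed[k] < noise[k]:
                # Follow decreases immediately.
                noise[k] = smoothed[k]
            else:
                # Let the floor rise slowly, never above the smoothed power.
                noise[k] = min(noise[k] * growth, smoothed[k])
    return noise
```

Because the floor rises slowly, a short burst of speech or echo in a bin raises the estimate only slightly, while a steady background level is tracked closely.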


Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Rawles, LLC (Amazon.com, Inc.)
Inventors: Yang, Jun
Primary Examiner(s): Jamal, Alexander
Application Number: US13/801,714
Time in Patent Office: 776 Days
Field of Search: 379/406.05, 379/406.12-406.14
US Class Current: 379/406.05
CPC Class Codes: H04M 9/082 using echo cancellers echo ...
