Reference signal generation for acoustic echo cancellation

US 9,978,387 B1
Filed: 08/05/2013
Issued: 05/22/2018
Est. Priority Date: 08/05/2013
Status: Active Grant

Abstract

An audio device may have an output speaker that produces audio within the environment of a user and one or more input microphones that capture speech and other sounds from the user environment. The audio device may use acoustic echo cancellation (AEC) to suppress echoed components of the speaker output that may be present in audio captured by the input microphones. The AEC may be implemented using an adaptive filter that estimates echoing based on an output reference signal. The output reference signal may be generated by a reference microphone placed near the speaker of the audio device.
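
The adaptive-filter arrangement described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the text above does not name a specific adaptation algorithm, so normalized least mean squares (NLMS) is assumed here as one common choice. The function name `nlms_echo_cancel`, the parameters `taps`, `mu`, and `eps`, and the synthetic `echo_path` are all hypothetical. For simplicity the reference signal is white noise and no user speech is mixed in.

```python
import random

def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive echo estimate (derived from ref) from mic.

    mic: input-microphone samples (echo only in this demo)
    ref: reference-microphone samples (speaker output)
    Returns the echo-suppressed samples and the final filter weights.
    """
    w = [0.0] * taps      # adaptive filter coefficients
    buf = [0.0] * taps    # recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        echo_est = sum(wi * bi for wi, bi in zip(w, buf))  # estimated echo
        e = d - echo_est                                   # subtraction step
        norm = sum(bi * bi for bi in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]  # NLMS update
        out.append(e)
    return out, w

def power(sig):
    return sum(s * s for s in sig) / len(sig)

# Synthetic demo with an assumed (hypothetical) echo path.
rng = random.Random(0)
ref = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.3, -0.2, 0.1]  # illustrative room response, not from the patent
mic = [sum(h * ref[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(ref))]

out, w = nlms_echo_cancel(mic, ref)
residual_ratio = power(out[-1000:]) / power(mic[-1000:])
```

After convergence the leading weights in `w` approximate `echo_path`, and `residual_ratio` compares the remaining echo power with the original echo power over the last 1000 samples. In a real device the input signal also contains user speech, which typically calls for a smaller step size or some form of double-talk handling.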

24 Claims

1. A speech interface system, comprising:
- a housing;
  
  a speaker positioned at least partly within the housing and configured to generate output audio based at least in part on an output audio signal;
  
  one or more input microphones positioned to produce an input audio signal, the input audio signal representing user speech and one or more echoed components of the output audio from the speaker;
  
  a reference microphone positioned within a compartment of the housing, the reference microphone to produce a reference audio signal, wherein a relative magnitude of the output audio to the user speech is greater in the reference audio signal than in the input audio signal, wherein the compartment is disposed at least partly between the speaker and the one or more input microphones;
  
  an adaptive filter configured to produce an estimated echo signal representing the one or more echoed components of the output audio from the speaker as represented by the input audio signal, based at least in part on the reference audio signal;
  
  a subtraction component configured to subtract the estimated echo signal from the input audio signal to produce an echo-suppressed audio signal; and
  
  one or more speech processing components configured to perform speech recognition on the echo-suppressed audio signal.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11):
  - 2. The speech interface platform of claim 1, wherein the one or more input microphones are positioned behind the speaker.
  - 3. The speech interface platform of claim 1, wherein the reference microphone is positioned closer to the speaker than the one or more input microphones.
  - 4. The speech interface platform of claim 1, wherein the reference microphone is responsive to conduction of the output audio through one or more surfaces of the housing to generate the reference audio signal.
  - 5. The speech interface platform of claim 1, further comprising a beam forming component configured to generate a directional input audio signal from the input audio signal.
  - 6. The speech interface platform of claim 1, wherein the adaptive filter is configured to process the input audio signal and the reference audio signal.
  - 7. The speech interface device of claim 1, wherein the reference microphone is positioned within the compartment such that the reference microphone is isolated from the user speech.
  - 8. The speech interface device of claim 1, wherein a relative magnitude of the user speech to the output audio is greater in the input audio signal than in the reference audio signal.
  - 9. The speech interface device of claim 1, further comprising:
    - an additional reference microphone positioned proximate to the speaker to produce an additional reference audio signal, wherein a relative magnitude of the output audio to the user speech is greater in the additional reference audio signal than in the input audio signal;
      
      an additional adaptive filter configured to produce an additional estimated echo signal representing the one or more echoed components of the output audio from the speaker as represented by the input audio signal, based at least in part on the additional reference audio signal;
      
      an additional subtraction component configured to subtract the additional estimated echo signal from the input audio signal to produce an additional echo-suppressed audio signal; and
      
      a summing component configured to generate a summed echo-suppressed audio signal based at least in part on the echo-suppressed audio signal and the additional echo suppressed audio signal.
  - 10. The speech interface device of claim 1, wherein:
    - the one or more microphones are configured to produce one or more audio signals for speech recognition; and
      
      the reference microphone is configured to produce one or more reference audio signals for echo cancellation.
  - 11. The speech interface device of claim 1, wherein:
    - the housing includes at least a first end and a second end;
      
      the speaker is positioned at least partly within the housing such that the speaker includes a directional output pattern directed towards the second end of the housing; and
      
      the reference microphone is disposed near the second end of the housing.
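
The two-branch arrangement of claim 9 (two reference microphones, two adaptive filters, two subtraction steps, and a summing component) can be sketched as below. This is an illustration under stated assumptions: NLMS is assumed for both adaptive filters, the summing component is assumed to be a plain average, and the names `nlms`, `summed_echo_suppressed`, `ref_a`, and `ref_b`, along with the gains and delays in the demo, are hypothetical rather than taken from the claims.

```python
import random

def nlms(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """One adaptive filter plus subtraction; returns echo-suppressed samples."""
    w, buf, out = [0.0] * taps, [0.0] * taps, []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]                                # newest sample first
        e = d - sum(wi * bi for wi, bi in zip(w, buf))      # subtract echo estimate
        norm = sum(bi * bi for bi in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out

def summed_echo_suppressed(mic, ref_a, ref_b):
    """Run one branch per reference microphone, then combine the outputs."""
    out_a = nlms(mic, ref_a)   # branch using the first reference microphone
    out_b = nlms(mic, ref_b)   # branch using the additional reference microphone
    # Summing component: a plain average is an assumed choice.
    return [0.5 * (a + b) for a, b in zip(out_a, out_b)]

def power(sig):
    return sum(s * s for s in sig) / len(sig)

# Synthetic demo: both reference microphones hear the same speaker output,
# with different (hypothetical) gains and delays.
rng = random.Random(1)
s = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
ref_a = s[:]
ref_b = [0.0] + [0.8 * v for v in s[:-1]]
mic = [0.5 * (s[n - 1] if n >= 1 else 0.0) + 0.25 * (s[n - 2] if n >= 2 else 0.0)
       for n in range(len(s))]

summed = summed_echo_suppressed(mic, ref_a, ref_b)
residual_ratio = power(summed[-500:]) / power(mic[-500:])
```

Each branch subtracts its own echo estimate from the same input signal, so once both filters have converged the averaged output has the echo suppressed as well.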

12. An audio device, comprising:
- a housing;
  
  a speaker positioned at least partly within the housing and configured to generate output audio;
  
  one or more input microphones configured to produce an input audio signal, the input audio signal representing user speech and one or more echoed components of the output audio from the speaker;
  
  a reference microphone positioned within a compartment of the housing such that the reference microphone is isolated from the user speech, the reference microphone configured to produce a reference audio signal, wherein a relative magnitude of the output audio to the user speech is greater in the reference audio signal than in the input audio signal, wherein the compartment is disposed at least partly between the speaker and the one or more input microphones;
  
  an adaptive filter configured to produce, based at least in part on the reference audio signal, an estimated echo signal representing the one or more echoed components of the output audio from the speaker as represented by the input audio signal; and
  
  an echo suppression element configured to suppress the one or more echoed components of the output audio as represented by the input audio signal using the estimated echo signal.
- Dependent claims (13, 14, 15, 16, 17, 18, 19, 20):
  - 13. The audio device of claim 12, wherein the reference microphone is positioned closer to the speaker than the one or more input microphones.
  - 14. The audio device of claim 12, wherein the reference microphone is responsive to conduction of the output audio through one or more surfaces of the housing to generate the reference audio signal.
  - 15. The audio device of claim 12, wherein the compartment includes a sealed chamber of the housing.
  - 16. The audio device of claim 12, further comprising a beam forming component configured to generate a directional input audio signal from the input audio signal.
  - 17. The audio device of claim 12, wherein the reference microphone has a directional sensitivity pattern that is directed toward the speaker.
  - 18. The audio device of claim 12, wherein the echo suppression element is configured to process the input audio signal and the reference audio signal.
  - 19. The audio device of claim 12, wherein the speaker has a directional audio output pattern, the reference microphone is positioned outside of the directional audio output pattern, and the one or more input microphones are positioned outside the directional audio output pattern.
  - 20. The audio device of claim 12, wherein the reference microphone is disposed at least partly on a surface of the housing and is responsive to conduction of the output audio through the surface to generate the reference audio signal.

21. A method, comprising:
- producing, using a speaker located at least partly within a housing, output audio based at least in part on an output audio signal;
  
  receiving an input audio signal from one or more input microphones, wherein the input audio signal represents user speech and one or more components of the output audio from the speaker;
  
  receiving a reference audio signal from a reference microphone that is positioned within a compartment of the housing, wherein a relative magnitude of the output audio to the user speech is greater in the reference audio signal than in the input audio signal, and wherein the compartment is disposed at least partly between the speaker and the one or more input microphones;
  
  generating, based at least in part on the reference audio signal, an estimated echo signal representing the one or more components of the output audio from the speaker as represented by the input audio signal; and
  
  suppressing the one or more components of the output audio as represented by the input audio signal based at least in part on the estimated echo signal.
- Dependent claims (22, 23, 24):
  - 22. The method of claim 21, wherein the one or more input microphones are positioned behind the speaker.
  - 23. The method of claim 21, wherein the reference microphone is positioned closer to the speaker than the one or more input microphones.
  - 24. The method of claim 21, wherein the reference microphone is positioned to reduce reception by the reference microphone of audio other than the output audio.
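
The relative-magnitude condition recited in the independent claims (the output audio is stronger relative to the user speech in the reference signal than in the input signal) can be checked numerically in a simulation where the two components are known separately. The sketch below is a worked example only: the coupling gains, the test tones, and the helper names `rms` and `output_to_speech_ratio` are hypothetical. On a real device the components are mixed in each microphone signal and cannot be read off this directly.

```python
import math

def rms(sig):
    """Root-mean-square magnitude of a sample sequence."""
    return math.sqrt(sum(s * s for s in sig) / len(sig))

def output_to_speech_ratio(output_part, speech_part):
    """Magnitude of the speaker-output component relative to the speech component."""
    return rms(output_part) / rms(speech_part)

n = 1000
output_audio = [math.sin(0.3 * i) for i in range(n)]   # stand-in for speaker output
user_speech = [math.sin(0.05 * i) for i in range(n)]   # stand-in for user speech

# Hypothetical coupling gains: the reference microphone sits near the speaker
# and is largely isolated from the user; the input microphones are the reverse.
ref_ratio = output_to_speech_ratio([1.0 * s for s in output_audio],
                                   [0.05 * u for u in user_speech])
in_ratio = output_to_speech_ratio([0.3 * s for s in output_audio],
                                  [0.8 * u for u in user_speech])

condition_holds = ref_ratio > in_ratio
```

With these assumed gains, `ref_ratio` exceeds `in_ratio` by the factor (1.0 / 0.05) / (0.3 / 0.8), roughly 53, so `condition_holds` is true.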

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Pogue, Michael Alan; Chhetri, Amit Singh
Primary Examiner(s): Patel, Shreyans
Application Number: US13/959,377
Time in Patent Office: 1,751 Days
Field of Search: 704226

CPC Class Codes
- G10L 15/00   Speech recognition G10L17/0...
- G10L 2021/02082   the noise being echo, rever...
- G10L 2021/02165   Two microphones, one receiv...
- G10L 21/02   Speech enhancement, e.g. no...
- G10L 21/0208   Noise filtering
