Acoustic echo cancellation processing based on feedback from speech recognizer

US 9,373,338 B1
Filed: 06/25/2012
Issued: 06/21/2016
Est. Priority Date: 06/25/2012
Status: Active Grant

Abstract

An automatic speech recognition engine receives an acoustic-echo processed signal from an acoustic-echo processing (AEP) module, where said echo processed signal contains mainly the speech from the near-end talker. The automatic speech recognition engine analyzes the content of the acoustic-echo processed signal to determine whether words or keywords are present. Based upon the results of this analysis, the automatic speech recognition engine produces a value reflecting the likelihood that some words or keywords are detected. Said value is provided to the AEP module. Based upon the value, the AEP module determines if there is double talk and processes the incoming signals accordingly to enhance its performance.
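
The feedback path described in the abstract can be sketched as a small control mapping. This is a minimal illustration, not the patented implementation: the `AepControl` type, the `update_control` function, the threshold of 0.6, and the choice to freeze filter adaptation during double talk are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AepControl:
    """Hypothetical control parameters for an acoustic-echo processing module."""
    double_talk: bool = False
    step_size: float = 0.5  # adaptation rate of the echo-path filter


def update_control(confidence: float, threshold: float = 0.6,
                   normal_step: float = 0.5,
                   frozen_step: float = 0.0) -> AepControl:
    """Map an ASR confidence value to AEP control parameters.

    A confidence above the threshold is treated here as evidence that the
    near-end talker is speaking (double talk), so adaptation is frozen.
    The threshold and step sizes are illustrative assumptions.
    """
    if confidence > threshold:
        return AepControl(double_talk=True, step_size=frozen_step)
    return AepControl(double_talk=False, step_size=normal_step)


if __name__ == "__main__":
    for value in (0.2, 0.9):
        print(value, update_control(value))
```

Freezing or slowing adaptation during double talk is a common design choice because near-end speech would otherwise be treated as echo-path error and pull the filter away from the true echo path.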

40 Citations

27 Claims

1. A method of processing signals within an apparatus comprising an acoustic-echo processing module, an automatic speech recognition engine, a loudspeaker and at least one microphone, the method comprising:
- receiving, at the automatic speech recognition engine from the acoustic-echo processing module, an acoustic-echo processed signal, wherein the acoustic-echo processed signal is based at least in part upon processing a first signal received at the microphone that comprises a component from the loudspeaker, the processing to cancel at least a portion of the component from the first signal;
  
  determining, by the automatic speech recognition engine, audio content within the acoustic-echo processed signal;
  
  based at least in part upon the audio content determined to be within the acoustic-echo processed signal, producing, by the automatic speech recognition engine, a value indicative of a confidence that one or more spoken commands are present in the audio content;
  
  determining that the value exceeds a threshold value;
  
  providing the value to the acoustic-echo processing module;
  
  adjusting one or more control parameters of the acoustic-echo processing module based on the value; and
  
  processing a subsequent signal received by the acoustic-echo processing module, the processing comprising canceling at least a portion of a component from the loudspeaker corresponding to the subsequent signal.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1, wherein processing of the subsequent signal by the acoustic-echo processing module comprises indicating a confidence that further signals will comprise an element from the user.
  - 3. The method of claim 1, wherein processing of the subsequent signal by the acoustic-echo processing module comprises applying a filter to adjust at least one of a volume and/or a frequency on an audio signal output to the loudspeaker.
  - 4. The method of claim 1, wherein processing of the subsequent signal by the acoustic-echo processing module comprises adjusting a rate of adaptation of an adaptive filter within the acoustic-echo processing module.
  - 5. The method of claim 1, wherein determining, by the automatic speech recognition engine, audio content within the acoustic-echo processed signal comprises determining that one of a plurality of key words is present within the acoustic-echo processed signal.
  - 6. The method of claim 5, wherein processing of the subsequent signal by the acoustic-echo processing module comprises, based upon the determining that one of a plurality of key words is present, indicating a confidence that the subsequent signal includes an element from the user and that further signals will include an element from the user.
  - 7. The method of claim 1, wherein steps of the method are performed repeatedly.
  - 8. The method of claim 1, wherein the value indicates to the acoustic-echo processing module a confidence that future signals will comprise an element from the user.
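
Claim 4 refers to adjusting the rate of adaptation of an adaptive filter. The claims do not name a filter type; the sketch below assumes a normalized LMS (NLMS) filter, where the adaptation rate is the `step_size` parameter. The function name, tap count, and demo signals are hypothetical.

```python
import random


def nlms_cancel(far_end, mic, taps=4, step_size=0.5, eps=1e-8):
    """Cancel loudspeaker echo from the microphone signal with an NLMS filter.

    far_end:   samples sent to the loudspeaker (reference signal)
    mic:       samples captured at the microphone
    step_size: adaptation rate; 0.0 freezes the filter weights
    Returns (residual_signal, final_weights).
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # echo-cancelled output sample
        power = sum(h * h for h in history) + eps
        # Normalized update; a smaller step_size slows adaptation.
        weights = [w + step_size * e * h / power
                   for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights


if __name__ == "__main__":
    rng = random.Random(0)
    far = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    # Toy echo path (an assumption): one-sample delay, gain 0.5, no near-end speech.
    mic = [0.0] + [0.5 * s for s in far[:-1]]
    adapted, w = nlms_cancel(far, mic, step_size=0.5)
    frozen, w0 = nlms_cancel(far, mic, step_size=0.0)
    print("adapted weights:", [round(v, 3) for v in w])
    print("frozen weights:", w0)
```

In a feedback arrangement like the one claimed, the value from the speech recognizer would select `step_size` for subsequent frames, for example a small or zero value while a spoken command is likely present.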

9. A device comprising:
- an acoustic-echo processing module configured to process a first signal received at a microphone of the device to cancel, at least in part, a component of the first signal corresponding to audio from a loudspeaker of the device, and to output an acoustic-echo processed signal;
  
  an automatic speech recognition engine configured to:
  
  receive the acoustic-echo processed signal from the acoustic-echo processing module,
  
  determine content corresponding to audio in the acoustic-echo processed signal,
  
  based at least in part upon the content determined to be within the acoustic-echo processed signal, determine a value indicative of a confidence that one or more spoken commands are present in the content,
  
  determine that the value exceeds a threshold, and
  
  provide the value to the acoustic-echo processing module;
  
  wherein the acoustic-echo processing module is further configured to:
  
  adjust one or more control parameters of the acoustic-echo processing module based on the value, and
  
  adjust a subsequent signal received at the microphone to cancel, at least in part, a component of the subsequent signal corresponding to audio from the loudspeaker of the device.
- Dependent claims (10, 11, 12, 13, 14, 15):
  - 10. The device of claim 9, wherein the value indicates to the acoustic-echo processing module a confidence that future signals will comprise an element from a user.
  - 11. The device of claim 9, wherein the acoustic-echo processing module is configured to adjust the subsequent signal by applying a filter to adjust at least one of a volume and/or a frequency on an audio signal output to the loudspeaker.
  - 12. The device of claim 9, wherein the acoustic-echo processing module is configured to adjust the subsequent signal by adjusting a rate of adaptation of an adaptive filter within the acoustic-echo processing module.
  - 13. The device of claim 9, wherein the automatic speech recognition engine is configured to determine content corresponding to audio in the acoustic-echo processed signal by determining that one of a plurality of key words is present within the acoustic-echo processed signal.
  - 14. The device of claim 13, wherein the automatic speech recognition engine is further configured to, based upon determination that one of a plurality of key words is present, produce the value to thereby indicate a confidence that the subsequent signal includes an element from a user and that further signals will include an element from the user.
  - 15. The device of claim 9, wherein the device is included within a communication device.
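
Claims 9 and 13 describe the recognizer determining that a key word is present, producing a confidence value, and providing it once it exceeds a threshold. A minimal sketch of that step follows. The `(word, score)` hypothesis format, the function names, and the 0.6 threshold are assumptions; a real recognizer would expose its own scoring interface.

```python
def keyword_confidence(hypotheses, keywords):
    """Return the highest score among recognized words that are key words.

    hypotheses: iterable of (word, score) pairs, score assumed in [0, 1]
    keywords:   collection of command words to look for
    Returns 0.0 when no key word is recognized.
    """
    wanted = {k.lower() for k in keywords}
    scores = [score for word, score in hypotheses if word.lower() in wanted]
    return max(scores, default=0.0)


def value_to_report(hypotheses, keywords, threshold=0.6):
    """Return the confidence value if it exceeds the threshold, else None."""
    value = keyword_confidence(hypotheses, keywords)
    return value if value > threshold else None


if __name__ == "__main__":
    hyps = [("play", 0.9), ("the", 0.99), ("music", 0.7)]
    print(value_to_report(hyps, {"play", "stop"}))
```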

16. An apparatus comprising:
- one or more processors; and
  
  memory accessible by the one or more processors, the memory including instructions that, when executed, cause the one or more processors to:
  
  determine audio content within an acoustic-echo processed signal, wherein the acoustic-echo processed signal is at least based upon processing a first signal received at a microphone that comprises a component from a loudspeaker, the processing to cancel at least a portion of the component from the first signal,
  
  based at least in part upon the audio content determined to be within the acoustic-echo processed signal, produce a value indicative of a confidence that one or more spoken commands are present in the audio content,
  
  determine that the value exceeds a threshold value,
  
  adjust one or more control parameters of the apparatus based on the value, and
  
  process a subsequent signal received by the microphone, the processing comprising canceling at least a portion of a component from the loudspeaker corresponding to the subsequent signal.
- Dependent claims (17, 18, 19, 20, 21):
  - 17. The apparatus of claim 16, wherein the instructions cause the one or more processors to process the subsequent signal by indicating a confidence that future signals will comprise an element from a user.
  - 18. The apparatus of claim 16, wherein the instructions cause the one or more processors to process the subsequent signal by applying a filter to adjust at least one of a volume and/or a frequency of an audio signal output to the loudspeaker.
  - 19. The apparatus of claim 16, wherein the instructions cause the one or more processors to process the subsequent signal by adjusting a rate of adaptation of an adaptive filter within the acoustic-echo processing module.
  - 20. The apparatus of claim 16, wherein the instructions cause the one or more processors to determine content within the acoustic-echo processed signal by determining that one of a plurality of key words is present within the acoustic-echo processed signal.
  - 21. The apparatus of claim 20, wherein the instructions cause the one or more processors to process the subsequent signal by, based upon the determining that one of a plurality of key words is present, indicating a confidence that the subsequent signal includes an element from the user and that future signals will include an element from the user.
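
Claim 18 mentions applying a filter that adjusts the volume of the audio signal output to the loudspeaker. One simple reading, shown here only as an assumption, is to attenuate the loudspeaker samples while a spoken command is likely present. The function name, threshold, and gain value are hypothetical.

```python
def duck_loudspeaker(samples, confidence, threshold=0.6, ducked_gain=0.25):
    """Attenuate loudspeaker output when a spoken command is likely present.

    samples:     loudspeaker output samples for the next frame
    confidence:  value provided by the speech recognizer
    ducked_gain: gain applied when confidence exceeds the threshold
    """
    gain = ducked_gain if confidence > threshold else 1.0
    return [gain * s for s in samples]


if __name__ == "__main__":
    print(duck_loudspeaker([1.0, -0.5], confidence=0.9))
```

A production system would likely ramp the gain over several frames to avoid audible clicks; the abrupt switch here keeps the sketch short.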

22. A communication device comprising:
- a loudspeaker;
  
  a microphone; and
  
  a signal processing module comprising:
  
  an acoustic-echo processing module, wherein the acoustic-echo processing module is configured to process a first signal received at the microphone to cancel, at least in part, a component of the first signal received from a loudspeaker and to output an acoustic-echo processed signal;
  
  an automatic speech recognition engine, wherein the automatic speech recognition engine is configured to:
  
  receive the acoustic-echo processed signal from the acoustic-echo processing module,
  
  determine content corresponding to audio in the acoustic-echo processed signal,
  
  based at least in part upon the content determined to be within the acoustic-echo processed signal, determine a value indicative of a confidence that one or more spoken commands are present in the content,
  
  determine that the value exceeds a threshold value, and
  
  provide the value to the acoustic-echo processing module;
  
  wherein the acoustic-echo processing module is further configured to:
  
  adjust one or more control parameters of the acoustic-echo processing module based on the value, and
  
  adjust a subsequent signal received at the microphone to cancel, at least in part, a component of the subsequent signal corresponding to audio from the loudspeaker of the device.
- Dependent claims (23, 24, 25, 26, 27):
  - 23. The communication device of claim 22, wherein the value indicates to the acoustic-echo processing module a confidence that future signals will comprise an element from a user.
  - 24. The communication device of claim 22, wherein the acoustic-echo processing module is configured to adjust the subsequent signal by applying a filter to adjust at least one of a volume and/or a frequency on an audio signal output to the loudspeaker.
  - 25. The communication device of claim 22, wherein the acoustic-echo processing module is configured to adjust the subsequent signal by adjusting a rate of adaptation of an adaptive filter within the acoustic-echo processing module.
  - 26. The communication device of claim 22, wherein the automatic speech recognition engine is configured to determine content corresponding to audio in the acoustic-echo processed signal by determining that one of a plurality of key words is present within the acoustic-echo processed signal.
  - 27. The communication device of claim 26, wherein the automatic speech recognition engine is further configured to, based upon determination that one of a plurality of key words is present, produce the value to thereby indicate a confidence that the subsequent signal includes an element from a user and that future signals will include an element from the user.

Current Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors
Gopalan, Ramya; Velusamy, Kavitha; Chu, Wai C.; Chhetri, Amit S.
Primary Examiner(s)
Sirjani, Fariba

Application Number

US13/532,649
Time in Patent Office

1,457 Days
Field of Search

704/226
US Class Current

1/1
CPC Class Codes

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 21/02   Speech enhancement, e.g. no...

H04M 9/082   using echo cancellers echo ...
