Acoustic echo cancellation processing based on feedback from speech recognizer
First Claim
1. A method of processing signals within an apparatus comprising an acoustic-echo processing module, an automatic speech recognition engine, a loudspeaker and at least one microphone, the method comprising:
receiving, at the automatic speech recognition engine from the acoustic-echo processing module, an acoustic-echo processed signal, wherein the acoustic-echo processed signal is based at least in part upon processing a first signal received at the microphone that comprises a component from the loudspeaker, the processing to cancel at least a portion of the component from the first signal;
determining, by the automatic speech recognition engine, audio content within the acoustic-echo processed signal;
based at least in part upon the audio content determined to be within the acoustic-echo processed signal, producing, by the automatic speech recognition engine, a value indicative of a confidence that one or more spoken commands are present in the audio content;
determining that the value exceeds a threshold value;
providing the value to the acoustic-echo processing module;
adjusting one or more control parameters of the acoustic-echo processing module based on the value; and
processing a subsequent signal received by the acoustic-echo processing module, the processing comprising canceling at least a portion of a component from the loudspeaker corresponding to the subsequent signal.
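The steps of claim 1 can be read as a feedback loop between the two modules. The following is a minimal Python sketch of that loop, not the patented implementation. Everything named here is hypothetical: the `EchoProcessor` class, its one-tap echo estimate, the choice of adaptation step size as the "control parameter", the `THRESHOLD` value, and the `asr_confidence` callable standing in for the speech recognition engine.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

THRESHOLD = 0.6   # assumed value; the claim only requires that a threshold exist
BASE_STEP = 0.5   # assumed default adaptation step size


@dataclass
class EchoProcessor:
    """Hypothetical stand-in for the acoustic-echo processing module."""
    step_size: float = BASE_STEP   # assumed control parameter
    echo_gain: float = 0.0         # one-tap estimate of the loudspeaker-to-mic path

    def process(self, mic: Sequence[float], ref: Sequence[float]) -> List[float]:
        """Cancel the estimated loudspeaker component from each mic sample."""
        out = []
        for m, r in zip(mic, ref):
            e = m - self.echo_gain * r                      # echo-processed sample
            # Normalized update of the echo estimate, scaled by the step size.
            self.echo_gain += self.step_size * e * r / (r * r + 1e-9)
            out.append(e)
        return out

    def adjust(self, confidence: float) -> None:
        """Assumed policy: adapt more slowly as recognizer confidence rises."""
        self.step_size = BASE_STEP * (1.0 - confidence)


def run_frame(aep: EchoProcessor,
              asr_confidence: Callable[[Sequence[float]], float],
              mic: Sequence[float],
              ref: Sequence[float]) -> Tuple[List[float], float]:
    """One pass of the loop: process, recognize, threshold, feed back."""
    processed = aep.process(mic, ref)      # acoustic-echo processed signal
    value = asr_confidence(processed)      # confidence that a command is present
    if value > THRESHOLD:                  # value exceeds the threshold
        aep.adjust(value)                  # value provided to the module, parameter adjusted
    return processed, value
```

Calling `run_frame` again on the next frame corresponds to the final step of the claim: the subsequent signal is processed with whatever control parameter the previous pass left in place.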
Abstract
An automatic speech recognition engine receives an acoustic-echo processed signal from an acoustic-echo processing (AEP) module; the echo-processed signal contains mainly the speech of the near-end talker. The engine analyzes the content of the signal to determine whether words or keywords are present and, based on that analysis, produces a value reflecting the likelihood that words or keywords have been detected. The value is provided to the AEP module. Based on the value, the AEP module determines whether double talk is present and processes incoming signals accordingly to improve its performance.
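To illustrate why a double-talk decision matters to an echo canceller, the sketch below simulates a two-tap NLMS adaptive filter and freezes adaptation whenever the recognizer's value exceeds a threshold. The abstract does not name NLMS, the freeze policy, or any numeric values; those, along with `ECHO_PATH`, `THRESHOLD`, and the simulated confidence values, are assumptions made for illustration. The far end is active throughout the simulation, so a high value is treated as double talk.

```python
import random
from typing import List, Sequence, Tuple

ECHO_PATH = [0.6, 0.3]   # assumed two-tap loudspeaker-to-microphone response
THRESHOLD = 0.6          # assumed confidence threshold


def nlms_cancel(far: Sequence[float], mic: Sequence[float],
                confidence: Sequence[float], use_feedback: bool,
                mu: float = 0.5) -> List[float]:
    """Run a two-tap NLMS canceller and return the final filter weights."""
    w = [0.0, 0.0]
    prev = 0.0                                   # previous far-end sample
    for x, d, c in zip(far, mic, confidence):
        taps = [x, prev]
        err = d - (w[0] * taps[0] + w[1] * taps[1])   # echo-cancelled output
        double_talk = use_feedback and c > THRESHOLD
        if not double_talk:                      # skip adaptation while double talk is flagged
            norm = taps[0] ** 2 + taps[1] ** 2 + 1e-9
            w = [wi + mu * err * t / norm for wi, t in zip(w, taps)]
        prev = x
    return w


def misalignment(w: Sequence[float]) -> float:
    """Squared distance between the filter weights and the true echo path."""
    return sum((wi - hi) ** 2 for wi, hi in zip(w, ECHO_PATH))


def simulate(n: int = 1000, seed: int = 0) -> Tuple[List[float], List[float], List[float]]:
    """Far-end noise throughout; near-end speech only in the second half."""
    rng = random.Random(seed)
    far = [rng.uniform(-1, 1) for _ in range(n)]
    near = [rng.uniform(-1, 1) if i >= n // 2 else 0.0 for i in range(n)]
    # Simulated recognizer values: high only while the near-end talker speaks.
    conf = [0.9 if i >= n // 2 else 0.1 for i in range(n)]
    mic = [ECHO_PATH[0] * far[i]
           + ECHO_PATH[1] * (far[i - 1] if i else 0.0)
           + near[i]
           for i in range(n)]
    return far, mic, conf


if __name__ == "__main__":
    far, mic, conf = simulate()
    print("with feedback:   ", misalignment(nlms_cancel(far, mic, conf, True)))
    print("without feedback:", misalignment(nlms_cancel(far, mic, conf, False)))
```

Without the feedback, the near-end speech acts as noise in the error signal and keeps perturbing the filter weights. With the feedback, the weights learned during the far-end-only segment are held fixed while the near-end talker speaks.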
27 Claims
1. A method of processing signals within an apparatus comprising an acoustic-echo processing module, an automatic speech recognition engine, a loudspeaker and at least one microphone (recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A device comprising:
an acoustic-echo processing module configured to process a first signal received at a microphone of the device to cancel, at least in part, a component of the first signal corresponding to audio from a loudspeaker of the device, and to output an acoustic-echo processed signal;
an automatic speech recognition engine configured to:
  receive the acoustic-echo processed signal from the acoustic-echo processing module,
  determine content corresponding to audio in the acoustic-echo processed signal,
  based at least in part upon the content determined to be within the acoustic-echo processed signal, determine a value indicative of a confidence that one or more spoken commands are present in the content,
  determine that the value exceeds a threshold, and
  provide the value to the acoustic-echo processing module;
wherein the acoustic-echo processing module is further configured to:
  adjust one or more control parameters of the acoustic-echo processing module based on the value, and
  adjust a subsequent signal received at the microphone to cancel, at least in part, a component of the subsequent signal corresponding to audio from the loudspeaker of the device.
Dependent claims: 10, 11, 12, 13, 14, 15
16. An apparatus comprising:
one or more processors; and
memory accessible by the one or more processors, the memory including instructions that, when executed, cause the one or more processors to:
  determine audio content within an acoustic-echo processed signal, wherein the acoustic-echo processed signal is at least based upon processing a first signal received at a microphone that comprises a component from a loudspeaker, the processing to cancel at least a portion of the component from the first signal,
  based at least in part upon the audio content determined to be within the acoustic-echo processed signal, produce a value indicative of a confidence that one or more spoken commands are present in the audio content,
  determine that the value exceeds a threshold value,
  adjust one or more control parameters of the apparatus based on the value, and
  process a subsequent signal received by the microphone, the processing comprising canceling at least a portion of a component from the loudspeaker corresponding to the subsequent signal.
Dependent claims: 17, 18, 19, 20, 21
22. A communication device comprising:
a loudspeaker;
a microphone; and
a signal processing module comprising:
  an acoustic-echo processing module, wherein the acoustic-echo processing module is configured to process a first signal received at the microphone to cancel, at least in part, a component of the first signal received from a loudspeaker and to output an acoustic-echo processed signal;
  an automatic speech recognition engine, wherein the automatic speech recognition engine is configured to:
    receive the acoustic-echo processed signal from the acoustic-echo processing module,
    determine content corresponding to audio in the acoustic-echo processed signal,
    based at least in part upon the content determined to be within the acoustic-echo processed signal, determine a value indicative of a confidence that one or more spoken commands are present in the content,
    determine that the value exceeds a threshold value, and
    provide the value to the acoustic-echo processing module;
  wherein the acoustic-echo processing module is further configured to:
    adjust one or more control parameters of the acoustic-echo processing module based on the value, and
    adjust a subsequent signal received at the microphone to cancel, at least in part, a component of the subsequent signal corresponding to audio from the loudspeaker of the device.
Dependent claims: 23, 24, 25, 26, 27
Specification