Audio echo cancellation with robust double-talk detection in a conferencing environment
First Claim
1. A method of preventing false positives by a double-talk detection unit at a conferencing endpoint, the method comprising:
receiving a first signal;
determining an energy value of the first signal;
emitting audio at a loudspeaker, the audio based on the first signal;
collecting audio at a first microphone, the audio including a first linear component corresponding to the first signal, and a first non-linear component corresponding to distortion of the first signal within the emitted audio;
emitting, by the first microphone, a first microphone signal, the first microphone signal comprising a first linear portion corresponding to the first linear component of the collected audio and a non-linear portion corresponding to the first non-linear component of the collected audio;
determining an energy value associated with the non-linear portion of the first microphone signal;
transmitting an energy signal to a double-talk detection unit of a second microphone, the energy signal corresponding to the energy value of the non-linear portion of the first microphone signal multiplied by a scaling factor;
capturing audio at the second microphone, the audio including a second linear component corresponding to the first signal, and a second non-linear component corresponding to distortion of the first signal within the emitted audio, wherein the second linear component is attenuated relative the first linear component, and the second non-linear component is attenuated relative the first non-linear component;
determining an energy value of the audio captured at the second microphone;
receiving the transmitted energy signal at the double-talk detection unit;
calculating, by the double-talk detection unit, a sum of the energy value of the non-linear portion of the first microphone signal multiplied by the scaling factor with the energy value of the first signal; and
comparing, by the double-talk detection unit, the sum with the energy value of the audio captured at the second microphone, whereby the double-talk detection unit is prevented from falsely detecting double-talk.
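The final two steps of the claim (the sum and the comparison) can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not say how an "energy value" is computed, so mean-square energy over a frame is assumed here, and the function names `frame_energy`, `combined_reference_energy`, and `sum_exceeds_captured` are hypothetical. How the non-linear portion is separated from the first microphone signal is also left unspecified by the claim, so its energy is taken as an input.

```python
from typing import Sequence


def frame_energy(samples: Sequence[float]) -> float:
    """Mean-square energy of one frame (an assumed definition of 'energy value')."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def combined_reference_energy(first_signal_energy: float,
                              nonlinear_energy: float,
                              scaling_factor: float) -> float:
    """Sum recited in the claim: scaled non-linear energy plus first-signal energy."""
    return scaling_factor * nonlinear_energy + first_signal_energy


def sum_exceeds_captured(first_signal_energy: float,
                         nonlinear_energy: float,
                         scaling_factor: float,
                         captured_energy: float) -> bool:
    """Compare the sum with the energy of the audio captured at the second microphone."""
    total = combined_reference_energy(first_signal_energy, nonlinear_energy, scaling_factor)
    return total > captured_energy


if __name__ == "__main__":
    # Illustrative frames only; the values are made up for the example.
    first_signal = [0.5, -0.5, 0.5, -0.5]
    nonlinear_portion = [0.2, -0.2, 0.2, -0.2]
    e_first = frame_energy(first_signal)
    e_nonlinear = frame_energy(nonlinear_portion)
    captured = 0.26  # assumed energy of the audio captured at the second microphone

    # Without the distortion term the comparison goes one way...
    print(sum_exceeds_captured(e_first, 0.0, 0.5, captured))
    # ...and with the scaled distortion energy added it goes the other.
    print(sum_exceeds_captured(e_first, e_nonlinear, 0.5, captured))
```

The example only shows that adding the scaled distortion energy can change the outcome of the comparison. Which outcome the detection unit treats as double-talk is not spelled out in claim 1, so the sketch returns the raw comparison result rather than a double-talk flag.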
Abstract
A conferencing endpoint includes a loudspeaker, a base microphone, and a double-talk detection module which allows two-way communication between the conferencing endpoint and a remote endpoint only when participants at both endpoints are speaking at the same time, so as to minimize echo due to feedback. The double-talk detection module adds the energy of any distortion from the loudspeaker to the energy of the signal coming from the remote endpoint, and compares this combined energy with the energy of the base microphone to determine whether double-talk is present. The double-talk detection module is thus prevented from mistaking the feedback for near-end talk at the endpoint.
21 Claims
1. A method of preventing false positives by a double-talk detection unit at a conferencing endpoint, the method comprising:
receiving a first signal;
determining an energy value of the first signal;
emitting audio at a loudspeaker, the audio based on the first signal;
collecting audio at a first microphone, the audio including a first linear component corresponding to the first signal, and a first non-linear component corresponding to distortion of the first signal within the emitted audio;
emitting, by the first microphone, a first microphone signal, the first microphone signal comprising a first linear portion corresponding to the first linear component of the collected audio and a non-linear portion corresponding to the first non-linear component of the collected audio;
determining an energy value associated with the non-linear portion of the first microphone signal;
transmitting an energy signal to a double-talk detection unit of a second microphone, the energy signal corresponding to the energy value of the non-linear portion of the first microphone signal multiplied by a scaling factor;
capturing audio at the second microphone, the audio including a second linear component corresponding to the first signal, and a second non-linear component corresponding to distortion of the first signal within the emitted audio, wherein the second linear component is attenuated relative the first linear component, and the second non-linear component is attenuated relative the first non-linear component;
determining an energy value of the audio captured at the second microphone;
receiving the transmitted energy signal at the double-talk detection unit;
calculating, by the double-talk detection unit, a sum of the energy value of the non-linear portion of the first microphone signal multiplied by the scaling factor with the energy value of the first signal; and
comparing, by the double-talk detection unit, the sum with the energy value of the audio captured at the second microphone, whereby the double-talk detection unit is prevented from falsely detecting double-talk.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A memory storing instructions executable by at least one processor, the instructions comprising instructions to:
receive a first signal at an endpoint;
determine an energy value of the first signal;
emit audio at a loudspeaker, the audio based on the first signal;
collect audio at a first microphone, the audio including a first linear component corresponding to the first signal, and a first non-linear component corresponding to distortion of the first signal within the emitted audio;
emit, by the first microphone, a first microphone signal, the first microphone signal comprising a first linear portion corresponding to the first linear component of the collected audio and a non-linear portion corresponding to the first non-linear component of the collected audio;
determine an energy value associated with the non-linear portion of the first microphone signal;
transmit an energy signal to an echo canceller of a second microphone, the energy signal corresponding to the energy value of the non-linear portion of the first microphone signal multiplied by a scaling factor;
capture audio at the second microphone, the captured audio including a second linear component corresponding to the first signal, and a second non-linear component corresponding to distortion of the first signal within the emitted audio, wherein the second linear component is attenuated relative the first linear component, and the second non-linear component is attenuated relative the first non-linear component;
determine an energy value of the audio captured at the second microphone;
receive the transmitted energy signal at the echo canceller;
determine, at the echo canceller, a sum of the energy value of the non-linear portion of the first microphone signal multiplied by the scaling factor with the energy value of the first signal;
determine, at the echo canceller, that the sum exceeds the energy value of the audio captured at the second microphone by a predetermined value; and
responsive to the determination that the sum exceeds the energy value of the audio captured at the second microphone by the predetermined value, allow transmission of the audio captured at the second microphone.
Dependent claims: 9, 10, 11, 12, 13, 14
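The last two instructions of claim 8 add a margin to the comparison and tie the result to a transmission decision. A minimal sketch of that gate follows, with several assumptions: `GateConfig` and `allow_transmission` are hypothetical names, the energies are taken as already computed, and "exceeds by a predetermined value" is read here as "exceeds by at least that value", which the claim does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateConfig:
    scaling_factor: float       # applied to the non-linear energy (value is illustrative)
    predetermined_value: float  # margin by which the sum must exceed the captured energy


def allow_transmission(first_signal_energy: float,
                       nonlinear_energy: float,
                       captured_energy: float,
                       config: GateConfig) -> bool:
    """Return True when the sum exceeds the captured energy by the configured margin.

    Assumption: 'exceeds by a predetermined value' is treated as
    (sum - captured_energy) >= predetermined_value.
    """
    total = config.scaling_factor * nonlinear_energy + first_signal_energy
    return total - captured_energy >= config.predetermined_value


if __name__ == "__main__":
    cfg = GateConfig(scaling_factor=0.5, predetermined_value=1.0)
    # Sum is 0.5 * 2.0 + 4.0 = 5.0 in both calls; only the captured energy differs.
    print(allow_transmission(4.0, 2.0, 3.0, cfg))
    print(allow_transmission(4.0, 2.0, 4.5, cfg))
```

Keeping the scaling factor and the margin in one immutable configuration object is a design choice of this sketch only; the claim says nothing about how either value is stored or tuned.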
15. A conferencing endpoint, the conferencing endpoint comprising:
at least one input, the input configured to receive a first signal, the first signal having an energy value;
at least one loudspeaker coupled to the input, the loudspeaker configured to emit audio, the audio based on the first signal;
at least one distortion detection module proximate the loudspeaker, the distortion detection module configured to collect audio, the collected audio including a first linear component corresponding to the first signal, and a first non-linear component corresponding to distortion of the first signal within the emitted audio, and further configured to emit a detection signal, the detection signal comprising a first linear portion corresponding to the first linear component of the collected audio and a non-linear portion corresponding to the first non-linear component of the collected audio;
at least one microphone configured to capture audio, the captured audio including a second linear component corresponding to the first signal, and a second non-linear component corresponding to distortion of the first signal within the captured audio, wherein the second linear component is attenuated relative the first linear component, and the second non-linear component is attenuated relative the first non-linear component;
at least one processing unit coupled to the input, the loudspeaker, the microphone, and the distortion detection module, the processing unit configured to:
determine an energy value associated with the non-linear portion of the detection signal;
apply a scaling factor to the energy value associated with the non-linear portion of the detection signal;
determine a sum of the scaled energy value of the non-linear portion of the detection signal with the energy value of the first signal;
compare the sum with an energy value of the captured audio; and
transmit the captured audio when the sum exceeds the energy value of the captured audio.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification