Multichannel acoustic echo cancellation
First Claim
1. A computer-implemented method for cancelling an echo from an audio signal to isolate received speech, the method comprising:
sending first playback audio data to a first wireless speaker;
receiving first input audio data from a first microphone of a microphone array, the first input audio data including a first representation of audible sound output by the first wireless speaker and speech input;
receiving second input audio data from a second microphone of the microphone array, the second input audio data including a second representation of the audible sound output by the first wireless speaker and the speech input;
determining a first portion of combined input audio data, the combined input audio data comprising at least the first input audio data and the second input audio data, the first portion of the combined input audio data comprising a first portion of the first input audio data corresponding to a first direction and a first portion of the second input audio data corresponding to the first direction;
determining a second portion of the combined input audio data, the second portion of the combined input audio data comprising a second portion of the first input audio data corresponding to a second direction and a second portion of the second input audio data corresponding to the second direction;
selecting at least the first portion of the combined input audio data as a first target signal on which to perform echo cancellation;
generating a first reference signal using the first playback audio data;
removing the first reference signal from the first target signal to generate a first output audio signal that includes the speech input;
selecting at least the first portion of the combined input audio data as a second target signal on which to perform echo cancellation;
generating a second reference signal using the second portion of the combined input audio data;
removing the second reference signal from the second target signal to generate a second output audio signal that includes the speech input;
performing speech recognition processing on one of the first output audio signal or the second output audio signal to determine a command; and
executing the command.
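By way of illustration only, the steps of determining a first portion and a second portion of the combined input audio data, each corresponding to a direction, may be sketched with a delay-and-sum beamformer. The claim does not name a beamforming technique, so delay-and-sum is an assumption, as are the function name `delay_and_sum`, the integer sample delays, and the two-microphone test signals.

```python
def delay_and_sum(mic_signals, delays):
    """Average the microphone signals after delaying each by an integer
    number of samples, which emphasizes sound from one direction."""
    length = len(mic_signals[0])
    beam = []
    for n in range(length):
        total = 0.0
        for signal, delay in zip(mic_signals, delays):
            idx = n - delay
            if 0 <= idx < length:  # samples outside the buffer count as silence
                total += signal[idx]
        beam.append(total / len(mic_signals))
    return beam


# Hypothetical input: one click reaches microphone 1 at sample 10
# and microphone 2 at sample 12.
first_input = [0.0] * 32
second_input = [0.0] * 32
first_input[10] = 1.0
second_input[12] = 1.0
combined = [first_input, second_input]

# Steering delays that align the click (first direction) and
# delays that misalign it (second direction). Both are assumed values.
first_portion = delay_and_sum(combined, delays=[2, 0])
second_portion = delay_and_sum(combined, delays=[0, 2])
```

With these assumed delays, the click adds coherently in `first_portion` and is spread across two samples at half amplitude in `second_portion`, which is the sense in which each portion corresponds to a direction.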
Abstract
An echo cancellation system performs audio beamforming to separate audio input into multiple directions (e.g., target signals) and generates multiple audio outputs using two acoustic echo cancellation (AEC) circuits. A first AEC removes a playback reference signal (generated from a signal sent to a loudspeaker) to isolate speech included in the target signals. A second AEC removes an adaptive reference signal (generated from microphone inputs corresponding to audio received from the loudspeaker) to isolate speech included in the target signals. A beam selector receives the multiple audio outputs and selects the first AEC or the second AEC based on a linearity of the system. When the system is linear (e.g., no distortion or variable delay between microphone input and playback signal), the beam selector selects an output from the first AEC based on signal-to-noise ratios (SNRs). When the system is nonlinear, the beam selector selects an output from the second AEC.
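As a non-limiting sketch, the beam selector behavior summarized above may be expressed as follows. The function name `select_output`, the dictionary layout, and the SNR values are hypothetical. The abstract states that SNR is used when selecting from the first AEC; applying the same SNR ranking to the second AEC is an assumption made here for illustration.

```python
def select_output(first_aec, second_aec, system_is_linear):
    """Return (path, beam, audio) for the preferred echo-cancelled output.

    first_aec and second_aec each map a beam label to (audio, snr_db).
    """
    if system_is_linear:
        path, candidates = "first", first_aec    # playback-reference AEC
    else:
        path, candidates = "second", second_aec  # adaptive-reference AEC
    # Assumption: rank candidates by SNR on both paths.
    beam = max(candidates, key=lambda label: candidates[label][1])
    return path, beam, candidates[beam][0]


# Hypothetical outputs of the two AECs for two beams.
first_aec = {"beam_0": ([0.1, 0.2], 12.0), "beam_1": ([0.3, 0.1], 18.5)}
second_aec = {"beam_0": ([0.1, 0.1], 9.0), "beam_1": ([0.2, 0.1], 7.5)}

linear_choice = select_output(first_aec, second_aec, system_is_linear=True)
nonlinear_choice = select_output(first_aec, second_aec, system_is_linear=False)
```

How linearity is detected is not shown; the sketch takes it as a boolean input.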
20 Claims
1. A computer-implemented method for cancelling an echo from an audio signal to isolate received speech, recited in full under First Claim above.
Dependent claims: 2, 3, 4
5. A computer-implemented method, comprising:
sending first playback audio data to a first wireless speaker;
receiving first input audio data from a first microphone of a microphone array, the first input audio data including a first representation of sound output by the first wireless speaker and speech input;
receiving second input audio data from a second microphone of the microphone array, the second input audio data including a second representation of the audible sound output by the first wireless speaker and the speech input;
determining a first portion of combined input audio data, the combined input audio data comprising at least the first input audio data and the second input audio data, the first portion of the combined input audio data comprising a first portion of the first input audio data corresponding to a first direction and a first portion of the second input audio data corresponding to the first direction;
determining a second portion of the combined input audio data, the second portion of the combined input audio data comprising a second portion of the first input audio data corresponding to a second direction and a second portion of the second input audio data corresponding to the second direction;
selecting at least the first portion of the combined input audio data as a first target signal on which to perform echo cancellation;
generating a first reference signal using the first playback audio data;
removing the first reference signal from the first target signal to generate first output audio data that includes the speech input;
selecting at least the first portion of the combined input audio data as a second target signal;
generating a second reference signal using the second portion of the combined input audio data;
removing the second reference signal from the second target signal to generate second output audio data that includes the speech input; and
selecting one of the first output audio data or the second output audio data.
Dependent claims: 6, 7, 8, 9, 10, 11, 12
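For illustration only, the two "removing" steps may be sketched with a normalized least-mean-squares (NLMS) adaptive filter, run once with the playback audio data as the reference and once with the second portion of the combined input audio data as the reference. The claims do not name an adaptation algorithm, so NLMS is an assumption, as are the names `nlms_cancel` and `tail_power`, the filter parameters, and the synthetic signals (including the idealization that the second portion contains only echo).

```python
import math
import random


def nlms_cancel(target, reference, taps=8, mu=0.05, eps=1e-8):
    """Subtract from `target` the component predictable from `reference`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    output = []
    for d, x in zip(target, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # residual after removing the estimated echo
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        output.append(error)
    return output


def tail_power(signal, count=1000):
    """Mean power over the last `count` samples (after adaptation settles)."""
    tail = signal[-count:]
    return sum(s * s for s in tail) / len(tail)


# Synthetic, hypothetical signals.
random.seed(0)
n = 6000
playback = [random.uniform(-1.0, 1.0) for _ in range(n)]
speech = [0.3 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
echo = [0.6 * playback[i] + (0.3 * playback[i - 1] if i else 0.0)
        for i in range(n)]

first_portion = [s + e for s, e in zip(speech, echo)]  # beam toward the talker
second_portion = echo  # beam toward the speaker, idealized as echo only

# First path: reference generated from the playback audio data.
first_output = nlms_cancel(first_portion, playback)
# Second path: reference generated from the second portion.
second_output = nlms_cancel(first_portion, second_portion)
```

Comparing `tail_power` of each output minus `speech` against `tail_power(echo)` gives a rough measure of how much echo each path removes on this synthetic data.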
13. A device, comprising:
at least one processor;
a memory device including instructions operable to be executed by the at least one processor to configure the device to:
send first playback audio data to a first wireless speaker;
receive first input audio data from a first microphone of a microphone array, the first input audio data including a first representation of sound output by the first wireless speaker and speech input;
receive second input audio data from a second microphone of the microphone array, the second input audio data including a second representation of the audible sound output by the first wireless speaker and the speech input;
determine a first portion of combined input audio data, the combined input audio data comprising at least the first input audio data and the second input audio data, the first portion of the combined input audio data comprising a first portion of the first input audio data corresponding to a first direction and a first portion of the second input audio data corresponding to the first direction;
determine a second portion of the combined input audio data, the second portion of the combined input audio data comprising a second portion of the first input audio data corresponding to a second direction and a second portion of the second input audio data corresponding to the second direction;
select at least the first portion of the combined input audio data as a first target signal on which to perform echo cancellation;
generate a first reference signal using the first playback audio data;
remove the first reference signal from the first target signal to generate first output audio data that includes the speech input;
select at least the first portion of the combined input audio data as a second target signal;
generate a second reference signal using the second portion of the combined input audio data;
remove the second reference signal from the second target signal to generate second output audio data that includes the speech input; and
select one of the first output audio data or the second output audio data.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification