Adaptive beamforming to create reference channels
First Claim
1. A computer-implemented method for cancelling an echo from an audio signal to isolate received speech, the method comprising:
sending a first output audio signal to a first wireless speaker;
receiving a first input audio signal from a first microphone of a microphone array, the first input audio signal including a first representation of audible sound output by the first wireless speaker and a first representation of speech input;
receiving a second input audio signal from a second microphone of the microphone array, the second input audio signal including a second representation of the audible sound output by the first wireless speaker and a second representation of the speech input;
performing first audio beamforming to determine a first portion of combined input audio data comprising a first portion of the first input audio signal corresponding to a first direction and a first portion of the second input audio signal corresponding to the first direction;
performing second audio beamforming to determine a second portion of the combined input audio data comprising a second portion of the first input audio signal corresponding to a second direction and a second portion of the second input audio signal corresponding to the second direction;
selecting at least the first portion as a target signal on which to perform echo cancellation;
selecting at least the second portion as a reference signal to remove from the target signal;
removing the reference signal from the target signal to generate a second output audio signal including a third representation of the speech input;
performing speech recognition processing on the second output audio signal to determine a command; and
executing the command.
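As an illustration only, the two beamforming steps recited above can be pictured with a simple delay-and-sum beamformer. The claim does not name a beamforming technique, so delay-and-sum, the function name `delay_and_sum`, the integer sample delays, and the toy pulse signals are all assumptions made for this sketch.

```python
from typing import List, Sequence


def delay_and_sum(mics: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Hypothetical helper: delay each microphone by an integer number of
    samples, then average, so sound from one direction adds coherently."""
    n = len(mics[0])
    out = []
    for t in range(n):
        acc = 0.0
        for signal, delay in zip(mics, delays):
            idx = t - delay
            if 0 <= idx < n:  # samples outside the buffer count as silence
                acc += signal[idx]
        out.append(acc / len(mics))
    return out


# Toy two-microphone capture (assumed geometry):
# source A reaches mic 1 two samples before mic 2,
# source B reaches mic 2 two samples before mic 1.
mic1 = [0.0] * 12
mic2 = [0.0] * 12
mic1[3] = 1.0  # source A at mic 1
mic2[5] = 1.0  # source A at mic 2
mic2[6] = 1.0  # source B at mic 2
mic1[8] = 1.0  # source B at mic 1

# First direction: align source A by delaying mic 1 by two samples.
beam_a = delay_and_sum([mic1, mic2], [2, 0])
# Second direction: align source B by delaying mic 2 by two samples.
beam_b = delay_and_sum([mic1, mic2], [0, 2])
```

In this sketch each beam contains a portion of both microphone signals. The source in the steered direction adds up to full amplitude, while the other source is smeared into two half-amplitude copies.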
Abstract
An echo cancellation system that performs audio beamforming to separate audio input into multiple directions and determines a target signal and a reference signal from the multiple directions. For example, the system may detect a strong signal associated with a speaker and select the strong signal as a reference signal, selecting another direction as a target signal. The system may determine a speech position and may select the speech position as a target signal and an opposite direction as a reference signal. The system may create pairwise combinations of opposite directions, with an individual direction being selected as a target signal and a reference signal. The system may select a fixed beamformer output for the target signal and an adaptive beamformer output for the reference signal, or vice versa. The system may remove the reference signal (e.g., audio output by the loudspeaker) to isolate speech included in the target signal.
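The selection and removal described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes the beams have already been formed, picks the highest-energy beam as the reference, and removes it from the target with a normalized least-mean-squares (NLMS) adaptive filter. The abstract does not name a filter, so NLMS, the function names `strongest_beam` and `nlms_cancel`, the tap count, the step size, and the synthetic signals are all assumptions.

```python
import math
import random
from typing import List, Sequence


def energy(x: Sequence[float]) -> float:
    return sum(v * v for v in x)


def strongest_beam(beams: Sequence[Sequence[float]]) -> int:
    """Hypothetical helper: index of the beam with the highest energy."""
    return max(range(len(beams)), key=lambda i: energy(beams[i]))


def nlms_cancel(target: Sequence[float], reference: Sequence[float],
                taps: int = 4, mu: float = 0.5, eps: float = 1e-8) -> List[float]:
    """Hypothetical helper: subtract an adaptively filtered copy of
    `reference` from `target` and return the residual (NLMS update)."""
    w = [0.0] * taps
    out = []
    for n in range(len(target)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        estimate = sum(wi * xi for wi, xi in zip(w, x))
        err = target[n] - estimate
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * err * xi / norm for wi, xi in zip(w, x)]
        out.append(err)
    return out


# Synthetic stand-ins for two beamformer outputs (assumed data).
rng = random.Random(0)
loudspeaker = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
speech = [0.1 * math.sin(0.05 * n) for n in range(2000)]
# Echo: the loudspeaker signal delayed by one sample and attenuated.
echo = [0.6 * (loudspeaker[n - 1] if n else 0.0) for n in range(2000)]

speech_beam = [s + e for s, e in zip(speech, echo)]  # speech plus echo
speaker_beam = loudspeaker                           # mostly playback
beams = [speech_beam, speaker_beam]

ref_idx = strongest_beam(beams)   # strong signal becomes the reference
target_idx = 1 - ref_idx          # the other direction becomes the target
cleaned = nlms_cancel(beams[target_idx], beams[ref_idx])
```

Choosing the reference by energy mirrors the abstract's first example (a strong signal associated with the loudspeaker). The other strategies it lists, such as pairing opposite directions or mixing fixed and adaptive beamformer outputs, would change only how `ref_idx` and `target_idx` are chosen.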
20 Claims
1. A computer-implemented method for cancelling an echo from an audio signal to isolate received speech (recited in full under First Claim above).
Dependent claims: 2, 3, 4.
5. A computer-implemented method, comprising:
receiving first input audio data from a first microphone of a microphone array, the first input audio data including a first representation of sound output by a first wireless speaker and a first representation of speech input;
receiving second input audio data from a second microphone of the microphone array, the second input audio data including a second representation of the audible sound output by the first wireless speaker and a second representation of the speech input;
performing first audio beamforming to determine a first portion of combined input audio data comprising a first portion of the first input audio signal corresponding to a first direction and a first portion of the second input audio signal corresponding to the first direction;
performing second audio beamforming to determine a second portion of the combined input audio data comprising a second portion of the first input audio signal corresponding to a second direction and a second portion of the second input audio signal corresponding to the second direction;
selecting at least the first portion as a target signal;
selecting at least the second portion as a reference signal; and
removing the reference signal from the target signal to generate first output audio data including a third representation of the speech input.
Dependent claims: 6, 7, 8, 9, 10, 11, 12.
13. A device, comprising:
at least one processor;
a memory device including instructions operable to be executed by the at least one processor to configure the device to:
receive first input audio data from a first microphone of a microphone array, the first input audio data including a first representation of sound output by a first wireless speaker and a first representation of speech input;
receive second input audio data from a second microphone of the microphone array, the second input audio data including a second representation of the audible sound output by the first wireless speaker and a second representation of the speech input;
perform first audio beamforming to determine a first portion of combined input audio data comprising a first portion of the first input audio signal corresponding to a first direction and a first portion of the second input audio signal corresponding to the first direction;
perform second audio beamforming to determine a second portion of the combined input audio data comprising a second portion of the first input audio signal corresponding to a second direction and a second portion of the second input audio signal corresponding to the second direction;
select at least the first portion as a target signal;
select at least the second portion as a reference signal; and
remove the reference signal from the target signal to generate first output audio data including a third representation of the speech input.
Dependent claims: 14, 15, 16, 17, 18, 19, 20.
Specification