Adaptive ambient sound suppression and speech tracking
Abstract
A device for suppressing ambient sounds from speech received by a microphone array is provided. One embodiment of the device comprises a microphone array, a processor, an analog-to-digital converter, and memory comprising instructions stored therein that are executable by the processor. The instructions stored in the memory are configured to receive a plurality of digital sound signals, each digital sound signal based on an analog sound signal originating at the microphone array, receive a multi-channel speaker signal, generate a monophonic approximation signal of the multi-channel speaker signal, apply a linear acoustic echo canceller to suppress a first ambient sound portion of each digital sound signal, generate a combined directionally-adaptive sound signal from a combination of each digital sound signal by a combination of time-invariant and adaptive beamforming techniques, and apply one or more nonlinear noise suppression techniques to suppress a second ambient sound portion of the combined directionally-adaptive sound signal.
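The "monophonic approximation signal" in the abstract can be pictured as a per-microphone mix of the speaker channels. The following is a minimal sketch only, assuming that each speaker reaches a given microphone with a single gain and an integer sample delay; the function name `mono_approximation` and the `gains` and `delays` parameters are hypothetical and do not come from the patent text.

```python
from typing import List, Sequence


def mono_approximation(
    speaker_channels: Sequence[Sequence[float]],
    gains: Sequence[float],
    delays: Sequence[int],
) -> List[float]:
    """Mix all speaker channels into one reference signal for one microphone.

    gains[k] and delays[k] (in samples) are assumed, illustrative values
    describing how speaker k is heard at this particular microphone.
    """
    n = len(speaker_channels[0])
    mono = [0.0] * n
    for channel, gain, delay in zip(speaker_channels, gains, delays):
        for t in range(delay, n):
            # Sample t at the microphone carries the speaker sample
            # emitted `delay` samples earlier, scaled by `gain`.
            mono[t] += gain * channel[t - delay]
    return mono


# Two-channel (stereo) speaker signal, four samples long.
left = [1.0, 0.0, 0.0, 0.0]
right = [0.0, 2.0, 0.0, 0.0]
ref = mono_approximation([left, right], gains=[0.5, 0.25], delays=[1, 0])
```

One such reference would be computed per microphone, since each microphone hears the speakers with different gains and delays.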
20 Claims
1. A computing device configured to receive speech inputs, the computing device comprising:
a microphone array having a plurality of microphones;
a processor in operative communication with the microphone array;
an analog-to-digital converter in operative communication with the microphone array and with the processor; and
memory comprising instructions stored therein that are executable by the processor to:
  receive a plurality of digital sound signals from the analog-to-digital converter, each digital sound signal being based on an analog sound signal originating at the microphone array,
  receive a multi-channel speaker signal from a speaker signal source,
  for each digital sound signal, generate a monophonic approximation signal of the multi-channel speaker signal that approximates speaker sounds as received by the corresponding microphone,
  apply a linear acoustic echo canceller to suppress a first ambient sound portion of each digital sound signal based at least in part on the monophonic approximation signal,
  generate a combined directionally-adaptive sound signal from a combination of each digital sound signal based at least in part on a combination of time-invariant and adaptive beamforming techniques, and
  apply one or more nonlinear noise suppression techniques to suppress a second ambient sound portion of the combined directionally-adaptive sound signal based at least in part on a directional characteristic of the combined directionally-adaptive sound signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
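Claim 1 recites a linear acoustic echo canceller driven by the monophonic approximation signal but does not name an algorithm. As an illustration only, the sketch below uses a normalized least-mean-squares (NLMS) adaptive filter, a common choice for linear echo cancellation that is swapped in here as an assumption. The function name `nlms_echo_cancel`, the tap count, and the step size `mu` are hypothetical, and the test signal is synthetic.

```python
import random
from typing import List, Sequence


def nlms_echo_cancel(
    mic: Sequence[float],
    reference: Sequence[float],
    taps: int = 4,
    mu: float = 0.5,
    eps: float = 1e-8,
) -> List[float]:
    """Subtract an adaptively estimated echo of `reference` from `mic`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, reference):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # what remains after removing the echo estimate
        norm = sum(h * h for h in history) + eps
        # NLMS update: step along the reference history, scaled by its energy.
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual


# Synthetic check: the microphone hears only a two-tap echo of the reference.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    0.6 * reference[t] + (0.3 * reference[t - 1] if t > 0 else 0.0)
    for t in range(2000)
]
residual = nlms_echo_cancel(mic, reference)
early = sum(e * e for e in residual[:50])
late = sum(e * e for e in residual[-50:])
```

In this echo-only test the residual energy falls toward zero as the filter converges. With speech present, the residual would carry the speech plus whatever echo the linear model cannot explain.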
10. A method for suppressing ambient sounds from speech received by a microphone array, comprising, at memory including instructions stored therein that are executable by a processor:
receiving a plurality of digital sound signals from an analog-to-digital converter, each digital sound signal based on an analog sound signal originating at the microphone array;
receiving a multi-channel speaker signal from a speaker signal source;
generating a monophonic approximation signal of the multi-channel speaker signal for each digital sound signal that approximates speaker sounds as received by the corresponding microphone;
applying a linear acoustic echo canceller to suppress a first ambient sound portion of each digital sound signal based at least in part on the monophonic approximation signal;
generating a combined directionally-adaptive sound signal from a combination of each digital sound signal based at least in part on a combination of time-invariant and adaptive beamforming techniques for tracking a speech source;
applying one or more nonlinear noise suppression techniques to suppress a second ambient sound portion of the combined directionally-adaptive sound signal based at least in part on a directional characteristic of the combined directionally-adaptive sound signal; and
outputting a resulting sound signal.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
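The beamforming step in claim 10 combines the per-microphone signals into one directional signal. The claim does not say how, so the sketch below shows only the simplest fixed-weight case, a delay-and-sum beamformer for a uniform linear array, with the steering angle supplied from outside (for example by a tracker). The array geometry, the function names `steering_delays` and `delay_and_sum`, and the integer-sample delays are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def steering_delays(
    num_mics: int, spacing_m: float, angle_deg: float, sample_rate: float
) -> List[int]:
    """Arrival delay of each microphone relative to microphone 0, in samples.

    Assumes a uniform linear array and a far-field source at `angle_deg`
    measured from broadside (0 degrees means straight ahead).
    """
    step = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND * sample_rate
    return [round(m * step) for m in range(num_mics)]


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Time-align the channels using `delays`, then average them."""
    n = len(channels[0])
    base = min(delays)
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        shift = delay - base  # non-negative advance that undoes the arrival delay
        for t in range(n - shift):
            out[t] += channel[t + shift]
    return [value / len(channels) for value in out]


# A pulse that reaches three microphones 0, 2, and 4 samples late.
ch0 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
ch1 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
ch2 = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
steered = delay_and_sum([ch0, ch1, ch2], delays=[0, 2, 4])
unsteered = delay_and_sum([ch0, ch1, ch2], delays=[0, 0, 0])
broadside = steering_delays(3, 0.05, 0.0, 16000.0)
```

When the delays match the source direction, the pulse adds coherently; with mismatched delays it is smeared and attenuated, which is the directional selectivity the claim relies on.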
18. A method for suppressing ambient sounds from speech received by a microphone array, at memory including instructions stored therein that are executable by a processor:
receiving an analog sound signal generated at each microphone of a microphone array comprising a plurality of microphones, each analog sound signal being separately received at least in part from a speech source;
converting each analog sound signal to a corresponding first digital sound signal having a first, higher bit depth at an analog-to-digital converter;
receiving a multi-channel speaker signal for a plurality of speakers from a speaker signal source;
synchronizing the multi-channel speaker signal to each first digital sound signal via a clock signal received from a remote computing device;
determining a calibration signal for each microphone by emitting a calibration audio signal from each of the plurality of speakers;
detecting the calibration audio signal at each microphone of the microphone array;
generating a monophonic approximation signal of the multi-channel speaker signal for each first digital sound signal that approximates speaker sounds as received by the corresponding microphone based at least in part on the calibration signal for each microphone;
applying a linear acoustic echo canceller to suppress a first ambient sound portion of each first digital sound signal based at least in part on the monophonic approximation signal;
converting each first digital sound signal to a second digital sound signal having a second, lower bit depth after applying the linear acoustic echo canceller to each digital sound signal;
applying a linear stationary tone remover to each second digital sound signal;
generating a combined directionally-adaptive sound signal from a combination of each second digital sound signal by applying a series of predetermined weighting coefficients to each second digital sound signal, each predetermined weighting coefficient being calculated based at least in part on an isotropic ambient noise distribution within a predefined sound reception zone of the microphone array, and by applying a sound source localizer to determine a reception angle of the speech source with respect to the microphone array and to track the speech source based at least in part on the reception angle as the speech source moves in real time;
applying one or more nonlinear noise suppression techniques to suppress a second ambient sound portion of the combined directionally-adaptive sound signal based at least in part on a directional characteristic of the combined directionally-adaptive sound signal; and
outputting a resulting sound signal.
Dependent claims: 19, 20
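Claim 18 refers to a sound source localizer that determines a reception angle, without specifying how. One textbook way to estimate such an angle, shown here purely as an assumed illustration, is to find the time difference of arrival between two microphones by cross-correlation and convert it to an angle under a far-field model. The names `best_lag` and `reception_angle_deg`, the two-microphone setup, and the spacing and sample rate are hypothetical.

```python
import math
from typing import Sequence

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def best_lag(a: Sequence[float], b: Sequence[float], max_lag: int) -> int:
    """Lag in samples by which `b` trails `a`, chosen to maximize cross-correlation."""

    def corr(lag: int) -> float:
        return sum(
            a[t] * b[t + lag] for t in range(len(a)) if 0 <= t + lag < len(b)
        )

    return max(range(-max_lag, max_lag + 1), key=corr)


def reception_angle_deg(lag: int, spacing_m: float, sample_rate: float) -> float:
    """Far-field arrival angle from broadside for a two-microphone pair."""
    ratio = lag / sample_rate * SPEED_OF_SOUND / spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against lags beyond the physical limit
    return math.degrees(math.asin(ratio))


# The second microphone hears the same short burst two samples later.
mic_a = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
mic_b = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
lag = best_lag(mic_a, mic_b, max_lag=3)
angle = reception_angle_deg(lag, spacing_m=0.2, sample_rate=16000.0)
```

Repeating this estimate on successive frames would give an angle track that could steer the beamformer as the speaker moves.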
Specification