Multiple-stage adaptive filtering of audio signals
First Claim
1. A system comprising:
memory;
one or more processors; and
one or more computer-executable instructions stored in the memory and executable by the one or more processors to:
cause a first microphone to detect a target voice associated with a user within an environment and to cause a second microphone to detect noise within the environment;
implement a delay with respect to a first audio signal that represents the noise and refrain from delaying a second audio signal that represents the target voice;
terminate the delay based at least in part on detecting the noise;
process, by a first adaptive filter, the target voice to generate a target voice estimate, the target voice estimate representing a first estimate of the target voice of the user;
process, by the first adaptive filter, the noise to generate a noise estimate, the noise estimate representing a second estimate of the noise within the environment; and
generate, by a second adaptive filter different from the first adaptive filter, an enhanced target voice based at least in part on the target voice estimate and the noise estimate, and based at least in part on a suppression of the noise.
Abstract
The systems, devices, and processes described herein may include a first microphone that detects a target voice of a user within an environment and a second microphone that detects other noise within the environment. A target voice estimate and/or a noise estimate may be generated based at least in part on one or more adaptive filters. Based at least in part on the target voice estimate and/or the noise estimate, an enhanced target voice and an enhanced interference, respectively, may be determined. One or more words that correspond to the target voice may be determined based at least in part on the enhanced target voice and/or the enhanced interference. In some instances, the one or more words may be determined by suppressing or canceling the detected noise.
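The two-stage arrangement summarized in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the use of normalized LMS (NLMS) as the adaptive algorithm, the class name `NLMSFilter`, the function names `two_stage_enhance` and `demo`, the tap count, the step sizes, and the synthetic sine "voice" and uniform random "noise". The first filter yields a noise estimate and a target voice estimate; the second filter uses both to suppress residual noise.

```python
import math
import random


class NLMSFilter:
    """Normalized LMS adaptive FIR filter (assumed algorithm, not from the source)."""

    def __init__(self, taps, mu, eps=1e-3):
        self.w = [0.0] * taps   # adaptive weights
        self.x = [0.0] * taps   # recent reference samples, newest first
        self.mu = mu            # step size, 0 < mu < 2
        self.eps = eps          # regularizer that avoids division by ~0

    def step(self, reference, primary):
        """Return (part of primary predicted from reference, residual)."""
        self.x = [reference] + self.x[:-1]
        y = sum(wi * xi for wi, xi in zip(self.w, self.x))
        e = primary - y
        norm = sum(xi * xi for xi in self.x) + self.eps
        gain = self.mu * e / norm
        self.w = [wi + gain * xi for wi, xi in zip(self.w, self.x)]
        return y, e


def two_stage_enhance(voice_mic, noise_mic, taps=4, mu1=0.1, mu2=0.05):
    """Hypothetical two-stage pipeline: stage 1 estimates, stage 2 enhances."""
    stage1 = NLMSFilter(taps, mu1)
    stage2 = NLMSFilter(taps, mu2)
    enhanced = []
    for primary, reference in zip(voice_mic, noise_mic):
        # Stage 1: noise estimate and target voice estimate.
        noise_est, voice_est = stage1.step(reference, primary)
        # Stage 2: suppress whatever in voice_est still tracks noise_est.
        _, out = stage2.step(noise_est, voice_est)
        enhanced.append(out)
    return enhanced


def power(samples):
    return sum(s * s for s in samples) / len(samples)


def demo(n=4000, seed=0):
    """Return (noise power at the voice mic, residual error power after filtering)."""
    rng = random.Random(seed)
    voice = [math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Assumed acoustic path: the noise reaches the voice mic through two taps.
    leak = [0.8 * noise[i] + (0.3 * noise[i - 1] if i else 0.0) for i in range(n)]
    voice_mic = [v + l for v, l in zip(voice, leak)]
    enhanced = two_stage_enhance(voice_mic, noise)
    half = n // 2  # measure only after the filters have had time to adapt
    before = power(leak[half:])
    after = power([e - v for e, v in zip(enhanced[half:], voice[half:])])
    return before, after
```

Calling `demo()` returns the noise power present at the voice microphone and the residual error power after both stages, measured over the second half of the signal. A real system would likely operate on framed microphone audio and tune the tap count and step sizes to the room.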
188 Citations
20 Claims
1. A system comprising:
memory;
one or more processors; and
one or more computer-executable instructions stored in the memory and executable by the one or more processors to:
cause a first microphone to detect a target voice associated with a user within an environment and to cause a second microphone to detect noise within the environment;
implement a delay with respect to a first audio signal that represents the noise and refrain from delaying a second audio signal that represents the target voice;
terminate the delay based at least in part on detecting the noise;
process, by a first adaptive filter, the target voice to generate a target voice estimate, the target voice estimate representing a first estimate of the target voice of the user;
process, by the first adaptive filter, the noise to generate a noise estimate, the noise estimate representing a second estimate of the noise within the environment; and
generate, by a second adaptive filter different from the first adaptive filter, an enhanced target voice based at least in part on the target voice estimate and the noise estimate, and based at least in part on a suppression of the noise.
Dependent claims: 2, 3, 4
5. A system comprising:
a first microphone to detect a first sound;
a second microphone to detect a second sound;
memory;
one or more processors; and
one or more computer-executable instructions stored in the memory and executable by the one or more processors to perform operations comprising:
determining that the first sound is representative of at least a portion of a target voice;
determining that the second sound is representative of at least a portion of noise;
implementing a delay with respect to a first audio signal that represents the noise and refraining from delaying a second audio signal that represents the target voice;
terminating the delay based at least in part on detecting the noise;
processing, by a first adaptive filter, the target voice to generate a target voice estimate, the target voice estimate representing a first estimate of the target voice of a user associated with the first sound;
processing, by the first adaptive filter, the noise to generate a noise estimate, the noise estimate representing a second estimate of the noise within an environment associated with the user; and
generating, by a second adaptive filter different from the first adaptive filter, an enhanced target voice based at least in part on the target voice estimate and the noise estimate.
Dependent claims: 6, 7, 8, 9, 10, 11
12. A method comprising:
determining that a first sound captured by a first microphone is representative of at least a portion of a target voice;
determining that a second sound captured by a second microphone is representative of at least a portion of noise;
implementing a delay with respect to a first audio signal that represents the noise and refraining from delaying a second audio signal that represents the target voice;
terminating the delay based at least in part on detecting the noise;
processing, by a first adaptive filter, the target voice to generate a target voice estimate, the target voice estimate representing a first estimate of the target voice of a user associated with the first sound;
processing, by the first adaptive filter, the noise to generate a noise estimate, the noise estimate representing a second estimate of the noise within an environment associated with the user; and
generating, by a second adaptive filter different from the first adaptive filter, an enhanced target voice based at least in part on at least one of the target voice estimate or the noise estimate.
Dependent claims: 13, 14, 15
16. A method comprising:
detecting a first sound representative of a target voice and a second sound representative of noise, the first sound being captured by a first microphone and the second sound being captured by a second microphone;
implementing a delay with respect to a first audio signal that represents the noise and refraining from delaying a second audio signal that represents the target voice;
terminating the delay based at least in part on detecting the noise;
processing, by a first adaptive filter, the target voice to generate a target voice estimate, the target voice estimate representing a first estimate of the target voice of a user associated with the first sound;
processing, by the first adaptive filter, the noise to generate a noise estimate, the noise estimate representing a second estimate of the noise within an environment associated with the user; and
generating, by a second adaptive filter different from the first adaptive filter, an enhanced target voice based at least in part on at least one of the target voice estimate or the noise estimate.
Dependent claims: 17, 18, 19, 20
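Each independent claim recites delaying the audio signal that represents the noise, leaving the target voice signal undelayed, and terminating the delay once the noise is detected. The sketch below shows one possible reading of that step and is not the patented implementation. The helper names `GatedDelay` and `align_channels`, the fixed sample delay, the amplitude threshold used as a stand-in noise detector, and the choice to discard buffered samples when the delay ends are all assumptions made for illustration.

```python
from collections import deque


class GatedDelay:
    """Delays samples by a fixed count until stop() is called (hypothetical helper)."""

    def __init__(self, delay):
        self.buffer = deque([0.0] * delay)  # pre-filled so early outputs are silence
        self.active = delay > 0

    def push(self, sample):
        if not self.active:
            return sample                   # delay terminated: pass through
        self.buffer.append(sample)
        return self.buffer.popleft()        # sample from `delay` steps ago

    def stop(self):
        self.active = False
        self.buffer.clear()                 # assumption: pending samples are dropped


def align_channels(voice_channel, noise_channel, delay=3, threshold=0.5):
    """Pair each undelayed voice sample with a (possibly delayed) noise sample."""
    gate = GatedDelay(delay)
    pairs = []
    for v, n in zip(voice_channel, noise_channel):
        # Crude amplitude-based noise detector, assumed purely for illustration.
        if gate.active and abs(n) >= threshold:
            gate.stop()
        pairs.append((v, gate.push(n)))
    return pairs
```

The voice channel is never touched by the delay, matching the "refrain from delaying" language; only the noise channel passes through `GatedDelay`.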
Specification