APPARATUSES AND METHODS FOR ENHANCED SPEECH RECOGNITION IN VARIABLE ENVIRONMENTS
Abstract
Systems, apparatuses, and methods are described to increase a signal-to-noise ratio difference between a main channel and reference channel. The increased signal-to-noise ratio difference is accomplished with an adaptive threshold for a desired voice activity detector (DVAD) and shaping filters. The DVAD includes averaging an output signal of a reference microphone channel to provide an estimated average background noise level. A threshold value is selected from a plurality of threshold values based on the estimated average background noise level. The threshold value is used to detect desired voice activity on a main microphone channel.
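The threshold selection described in the abstract can be sketched as a table lookup. This is only an illustration under stated assumptions: the band edges, the threshold values, the decibel units, and the names `NOISE_BAND_EDGES_DB`, `THRESHOLDS_DB`, `select_threshold`, and `desired_voice_active` are hypothetical and do not come from the patent, which does not state here how many thresholds are used or whether the threshold rises or falls with noise.

```python
from bisect import bisect_right

# Hypothetical values, not taken from the patent.
# Two band edges split the estimated noise level into three bands.
NOISE_BAND_EDGES_DB = [50.0, 70.0]
# One DVAD threshold per band (quiet, moderate, loud).
THRESHOLDS_DB = [6.0, 9.0, 12.0]


def select_threshold(noise_level_db):
    """Pick the threshold whose noise band contains the estimated level."""
    band = bisect_right(NOISE_BAND_EDGES_DB, noise_level_db)
    return THRESHOLDS_DB[band]


def desired_voice_active(main_level_db, reference_level_db, noise_level_db):
    """Flag desired voice when the main channel exceeds the reference
    channel by more than the threshold chosen for the current noise level."""
    difference_db = main_level_db - reference_level_db
    return difference_db > select_threshold(noise_level_db)
```

With these illustrative numbers, a 10 dB main-to-reference difference counts as desired voice activity in the quiet band but not in the loud band, which is the adaptive behavior the abstract describes.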
30 Claims
1. An integrated circuit device, comprising:
a background noise estimation module, the background noise estimation module to receive an input signal from a reference microphone, when voice activity is not detected the background noise estimation module to average the input signal from the reference microphone to form an estimated average background noise level;
at least two threshold values, each of the at least two threshold values to correspond to a different estimated average background noise level; and
selection logic, the selection logic to assign a particular estimated average background noise level to a threshold value from the at least two threshold values, wherein the threshold value is adapted to the particular estimated average background noise level, the threshold value is to be used by the desired voice activity detector (DVAD) to detect when desired voice activity is present.
Dependent claims: 2, 3, 4, 5, 6, 7
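The background noise estimation module of claim 1 averages the reference microphone signal only while voice activity is not detected. A minimal sketch of that gating follows. The claim says only "average", so the exponential moving average, the `smoothing` parameter, and the class name `BackgroundNoiseEstimator` are assumptions made for illustration.

```python
class BackgroundNoiseEstimator:
    """Running average of the reference level, frozen while voice is detected.

    Assumption: an exponential moving average stands in for the
    unspecified averaging named in the claim.
    """

    def __init__(self, smoothing=0.1):
        self.smoothing = smoothing  # weight given to each new frame
        self.level = None           # no estimate until the first noise-only frame

    def update(self, reference_level, voice_detected):
        # Skip frames that contain voice so speech does not inflate the estimate.
        if voice_detected:
            return self.level
        if self.level is None:
            self.level = reference_level
        else:
            self.level += self.smoothing * (reference_level - self.level)
        return self.level
```

The estimate returned by `update` would then be passed to the selection logic, which assigns it to one of the threshold values.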
8. An apparatus, comprising:
an adaptive threshold module;
the adaptive threshold module comprising:
a background noise estimation module, the background noise estimation module to receive an input signal from a reference microphone, when voice activity is not detected the background noise estimation module to average the input signal from the reference microphone to form an estimated average background noise level;
logic, the logic to assign an estimated background noise level to a threshold value;
a first shaping filter, the first shaping filter to filter the reference signal to remove a noise component to provide a filtered reference signal with enhanced signal-to-noise ratio;
a second shaping filter, the second shaping filter to filter a main signal from a main microphone, to remove the noise component to provide a filtered main signal with enhanced signal-to-noise ratio;
a desired voice activity detector, the desired voice activity detector utilizes the filtered main signal, normalized by the filtered reference signal, and the threshold value to obtain a desired voice activity signal with enhanced signal-to-noise ratio difference; and
a noise cancellation module, the noise cancellation module is electrically coupled to the desired voice activity detector, the desired voice activity signal is to be used by the noise cancellation module to identify desired speech during noise cancellation.
Dependent claims: 9, 10
11. A method, comprising:
averaging an output signal of a reference microphone channel to provide an estimated average background noise level;
selecting a threshold value from a plurality of threshold values based on the estimated average background noise level; and
using the threshold value to detect desired voice activity on a main microphone channel.
Dependent claims: 12, 13, 14, 15, 16, 17
18. An apparatus, comprising:
a first signal path configured to receive a main microphone signal;
a first shaping filter coupled to the first signal path, the first shaping filter to filter the main microphone signal, wherein the first shaping filter filters a noise component from the main microphone signal to increase a signal-to-noise ratio of the main microphone signal;
a second signal path configured to receive a reference microphone signal;
a second shaping filter coupled to the second signal path, the second shaping filter to filter the reference microphone signal, wherein the second shaping filter to increase a signal-to-noise ratio of the reference microphone signal and the second shaping filter to provide substantially the same filtering as the first shaping filter;
a desired voice activity detector (DVAD), the DVAD is coupled to an output of the first shaping filter and an output of the second shaping filter, the DVAD to form a normalized main signal with increased signal-to-noise ratio, the normalized main signal is to be used during identification of desired voice activity.
Dependent claims: 19, 20, 21, 22
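The signal flow of claim 18 (the same shaping filter on both channels, followed by normalization of the main signal by the reference signal) can be sketched as below. The claim does not specify the filter, so the first-order high-pass filter, its `alpha` coefficient, the power-ratio form of the normalization, and all function names are assumptions chosen only to make the example concrete.

```python
import math


def shaping_filter(samples, alpha=0.95):
    """Assumed shaping filter: first-order high-pass,
    y[n] = alpha * (y[n-1] + x[n] - x[n-1]), which attenuates
    low-frequency noise such as a constant offset."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def mean_power(samples):
    """Average squared amplitude of a frame."""
    return sum(s * s for s in samples) / len(samples)


def normalized_main_db(main, reference, eps=1e-12):
    """Filter both channels identically, then express the main-channel
    power relative to the reference-channel power in decibels."""
    filtered_main = shaping_filter(main)
    filtered_reference = shaping_filter(reference)
    ratio = (mean_power(filtered_main) + eps) / (mean_power(filtered_reference) + eps)
    return 10.0 * math.log10(ratio)
```

Because both channels pass through the same filter, any gain the filter applies cancels in the ratio, so the normalized value reflects the difference between the channels rather than the filter itself. A DVAD along these lines would compare the result against its threshold.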
23. A system, comprising:
a data processing system, the data processing system is configured to process acoustic signals; and
a computer readable medium containing executable computer program instructions, which when executed by the data processing system, cause the data processing system to perform a method comprising:
averaging an output signal of a reference microphone channel to provide an estimated average background noise level;
selecting a threshold value from a plurality of threshold values based on the estimated average background noise level; and
using the threshold value to detect desired voice activity on a main microphone channel.
Dependent claims: 24, 25, 26, 27, 28, 29, 30
Specification