Low-power noise characterization over a distributed speech recognition channel
Abstract
A distributed speech recognition system includes a noise floor estimator to provide a noise floor estimate to a feature extractor which provides a parametric representation of the noise floor estimate. An encoder is included to generate an encoded parametric representation of the noise floor estimate. A front-end controller is also included to determine when at least one of the noise floor estimator, the feature extractor, and the encoder is to be turned on or off and to determine when the noise floor estimator is to provide the noise floor estimate to the feature extractor. Additionally, a decoder is included to generate a decoded parametric representation of the noise floor estimate. A noise model generator creates a statistical model of noise feature vectors based on the decoded parametric representation of the noise floor estimate.
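The abstract does not say what form the statistical model of noise feature vectors takes. As a minimal sketch, assuming a single diagonal Gaussian (a per-dimension mean and variance) is an acceptable stand-in, the noise model generator could be illustrated as below. The names `DiagonalGaussianNoiseModel` and `fit_noise_model` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DiagonalGaussianNoiseModel:
    """Assumed model form: per-dimension mean and variance of noise features."""
    mean: List[float]
    variance: List[float]


def fit_noise_model(
    noise_vectors: Sequence[Sequence[float]],
) -> DiagonalGaussianNoiseModel:
    """Estimate a diagonal Gaussian from decoded noise feature vectors.

    Each vector stands for one decoded parametric representation of the
    noise floor estimate (for example, one frame of cepstral coefficients).
    """
    if not noise_vectors:
        raise ValueError("need at least one noise feature vector")
    dim = len(noise_vectors[0])
    if any(len(v) != dim for v in noise_vectors):
        raise ValueError("all noise feature vectors must have the same length")

    n = len(noise_vectors)
    # Per-dimension sample mean.
    mean = [sum(v[d] for v in noise_vectors) / n for d in range(dim)]
    # Per-dimension population variance (divides by n, not n - 1).
    variance = [
        sum((v[d] - mean[d]) ** 2 for v in noise_vectors) / n
        for d in range(dim)
    ]
    return DiagonalGaussianNoiseModel(mean=mean, variance=variance)
```

A recognizer could then use the fitted mean and variance to adapt its acoustic models; how that adaptation is done is outside this sketch.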
25 Claims
1. A method of creating a statistical model of noise in a distributed speech recognition system, comprising:
selecting one of a first power mode, a second power mode, and a third power mode to determine an amount of power to be drawn from a power source;
determining when to provide a noise floor estimate based at least in part on the selected power mode;
generating a parametric representation of the noise floor estimate when the noise floor estimate is provided;
determining whether received data includes a parametric representation of noise; and
creating a statistical model of noise feature vectors based on the parametric representation of the noise floor estimate;
wherein the first power mode involves activating noise estimation and feature extraction components upon assertion of speech activity, the second power mode involves deactivating the noise estimation and feature extraction components after the speech activity ends, and a third power mode involves activating noise estimation and feature extraction components upon assertion of speech activity and allowing the noise estimation and feature extraction components to remain active as long as a speech-enabled application remains active.
(Dependent claims: 2, 3, 4, 5, 6, 7)
8. An article comprising:
a computer-readable storage medium having stored thereon computer-executable instructions that when executed by a machine result in the following:
selecting one of a first power mode, a second power mode, and a third power mode to determine an amount of power to be drawn from a power source;
determining when to provide a noise floor estimate based at least in part on the selected power mode;
generating a parametric representation of the noise floor estimate when the noise floor estimate is provided;
determining whether received data includes a parametric representation of noise; and
creating a statistical model of noise feature vectors based on the parametric representation of the noise floor estimate;
wherein the first power mode involves activating noise estimation and feature extraction components upon assertion of speech activity, the second power mode involves deactivating the noise estimation and feature extraction components after the speech activity ends, and a third power mode involves activating noise estimation and feature extraction components upon assertion of speech activity and allowing the noise estimation and feature extraction components to remain active as long as a speech-enabled application remains active.
(Dependent claims: 9, 10, 11, 12, 13, 14)
15. A distributed speech recognition system, comprising:
a first processing device, including:
a transform module to receive input speech,
a noise floor estimator to provide a noise floor estimate for the input speech,
a feature extractor to provide a parametric representation of the noise floor estimate and the input speech, and
a front-end controller to select one of a first power mode, a second power mode, and a third power mode to determine an amount of power to be drawn from a power source, and to determine when the noise floor estimator provides a noise floor estimate based at least in part on the selected power mode;
a transmitter to transmit the parametric representation of the noise floor estimate and the input speech;
a receiver to receive the parametric representation of the noise floor estimate and the input speech from the transmitter; and
a second processing device, including:
a noise model generator to create a statistical noise model based on the parametric representation of the noise floor estimate, and
a speech recognizer to recognize the input speech based on acoustic models, the acoustic models adapted based at least in part on the statistical noise model.
(Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25)
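The three power modes recited in the independent claims can be read as rules for when the noise estimation and feature extraction components are switched on. The sketch below is one possible reading, not the patented implementation. The names `PowerMode` and `components_active` are hypothetical, and where the claim language is silent the code fills the gap with an assumption that is flagged in a comment.

```python
from enum import Enum


class PowerMode(Enum):
    FIRST = 1   # activate components upon assertion of speech activity
    SECOND = 2  # deactivate components after the speech activity ends
    THIRD = 3   # activate on speech, stay on while the application is active


def components_active(
    mode: PowerMode,
    was_active: bool,
    speech_active: bool,
    app_active: bool,
) -> bool:
    """Return whether noise estimation and feature extraction should be on.

    was_active    -- component state before this decision
    speech_active -- whether speech activity is currently asserted
    app_active    -- whether a speech-enabled application is running
    """
    if mode is PowerMode.THIRD:
        # Turn on at speech onset, then remain on for as long as the
        # speech-enabled application stays active.
        return app_active and (was_active or speech_active)
    if mode is PowerMode.SECOND:
        # Turn off once speech activity has ended.
        # Assumption: the components are on while speech is present;
        # the claim only states the deactivation.
        return speech_active
    # FIRST: turn on at speech onset.
    # Assumption: the claim does not say when to turn off in this mode,
    # so the sketch leaves the previous state unchanged.
    return was_active or speech_active
```

Under these assumptions, a front-end controller would call `components_active` on each change in speech activity or application state and allow the noise floor estimator to provide an estimate only while the result is true.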
Specification