Low-power noise characterization over a distributed speech recognition channel

US 7,171,356 B2
Filed: 06/28/2002
Issued: 01/30/2007
Est. Priority Date: 06/28/2002
Status: Expired due to Fees

Abstract

A distributed speech recognition system includes a noise floor estimator to provide a noise floor estimate to a feature extractor, which provides a parametric representation of the noise floor estimate. An encoder is included to generate an encoded parametric representation of the noise floor estimate. A front-end controller is also included to determine when at least one of the noise floor estimator, the feature extractor, and the encoder is to be turned on or off and to determine when the noise floor estimator is to provide the noise floor estimate to the feature extractor. Additionally, a decoder is included to generate a decoded parametric representation of the noise floor estimate. A noise model generator creates a statistical model of noise feature vectors based on the decoded parametric representation of the noise floor estimate.

25 Claims

1. A method of creating a statistical model of noise in a distributed speech recognition system, comprising:
- selecting one of a first power mode, a second power mode, and a third power mode to determine an amount of power to be drawn from a power source;
  
  determining when to provide a noise floor estimate based at least in part on the selected power mode;
  
  generating a parametric representation of the noise floor estimate when the noise floor estimate is provided;
  
  determining whether received data includes a parametric representation of noise; and
  
  creating a statistical model of noise feature vectors based on the parametric representation of the noise floor estimate;
  
  wherein the first power mode involves activating noise estimation and feature extraction components upon assertion of speech activity, the second power mode involves deactivating the noise estimation and feature extraction components after the speech activity ends, and a third power mode involves activating noise estimation and feature extraction components upon assertion of speech activity and allowing the noise estimation and feature extraction components to remain active as long as a speech-enabled application remains active.
- 2. The method according to claim 1, wherein determining whether the received data includes the parametric representation of noise comprises determining whether the received data includes a packet with a start sync sequence and an end sync sequence.
- 3. The method according to claim 1, further comprising calculating the noise floor estimate, based on an output from a transform module, and providing the noise floor estimate to an analysis module.
- 4. The method according to claim 1, wherein the received data includes the parametric representation of the noise floor estimate.
- 5. The method according to claim 1, wherein the statistical model of noise is used for acoustic model adaptation.
- 6. The method according to claim 1, wherein the second power mode further involves enabling the noise estimation and feature extraction components during intervals when speech is not present.
- 7. The method according to claim 1, wherein creating the statistical model of the noise feature vectors includes providing a mean and a variance of a Mel-cepstrum vector.
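
The power-mode language of claim 1 can be read as a small state machine over the on/off state of the noise estimation and feature extraction components. The following is a minimal sketch of that reading, not the patented implementation. The names `PowerMode`, `Event`, `TRANSITIONS`, and `next_state` are hypothetical. The claim only states some transitions; any mode and event pair it does not mention leaves the state unchanged here, and powering down when the application closes in the third mode is an assumption drawn from "as long as a speech-enabled application remains active".

```python
from enum import Enum


class PowerMode(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


class Event(Enum):
    SPEECH_START = "speech activity asserted"
    SPEECH_END = "speech activity ended"
    APP_EXIT = "speech-enabled application closed"


# (mode, event) -> new on/off state of the noise estimation and
# feature extraction components. Only transitions named in claim 1
# are listed; the APP_EXIT entry is an assumption (see lead-in).
TRANSITIONS = {
    (PowerMode.FIRST, Event.SPEECH_START): True,
    (PowerMode.SECOND, Event.SPEECH_END): False,
    (PowerMode.THIRD, Event.SPEECH_START): True,
    (PowerMode.THIRD, Event.APP_EXIT): False,  # assumption
}


def next_state(mode: PowerMode, event: Event, components_on: bool) -> bool:
    """Return the new on/off state; unlisted pairs keep the current state."""
    return TRANSITIONS.get((mode, event), components_on)


if __name__ == "__main__":
    on = False
    for ev in (Event.SPEECH_START, Event.SPEECH_END):
        on = next_state(PowerMode.THIRD, ev, on)
    # In the third mode the components stay on after speech ends.
    print(on)
```

Keeping the transitions in a table makes it easy to see which behavior comes from the claim text and which is assumed.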

8. An article comprising:
- a computer-readable storage medium having stored thereon computer-executable instructions that when executed by a machine result in the following:
  
  selecting one of a first power mode, a second power mode, and a third power mode to determine an amount of power to be drawn from a power source;
  
  determining when to provide a noise floor estimate based at least in part on the selected power mode;
  
  generating a parametric representation of the noise floor estimate when the noise floor estimate is provided;
  
  determining whether received data includes a parametric representation of noise; and
  
  creating a statistical model of noise feature vectors based on the parametric representation of the noise floor estimate;
  
  wherein the first power mode involves activating noise estimation and feature extraction components upon assertion of speech activity, the second power mode involves deactivating the noise estimation and feature extraction components after the speech activity ends, and a third power mode involves activating noise estimation and feature extraction components upon assertion of speech activity and allowing the noise estimation and feature extraction components to remain active as long as a speech-enabled application remains active.
- 9. The article according to claim 8, wherein determining whether the received data includes the parametric representation of noise comprises determining whether the received data includes a packet with a start sync sequence and an end sync sequence.
- 10. The article according to claim 8, wherein the instructions further result in calculating the noise floor estimate, based on an output from a transform module, and providing the noise floor estimate to an analysis module.
- 11. The article according to claim 8, wherein the received data includes the parametric representation of the noise floor estimate.
- 12. The article according to claim 8, wherein the statistical model of noise is used for acoustic model adaptation.
- 13. The article according to claim 8, wherein the second power mode further involves enabling the noise estimation and feature extraction components during intervals when speech is not present.
- 14. The article according to claim 8, wherein creating the statistical model of the noise feature vectors includes providing a mean and a variance of a Mel-cepstrum vector.
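
Claims 2 and 9 test for noise data by looking for a packet bracketed by a start sync sequence and an end sync sequence. A minimal sketch of such a check follows. The byte values of `START_SYNC` and `END_SYNC` and the function name `find_noise_packet` are hypothetical, since the claims do not give the sync patterns or the packet layout.

```python
from typing import Optional

# Hypothetical sync patterns; the claims do not specify the actual bytes.
START_SYNC = b"\xAA\x55"
END_SYNC = b"\x55\xAA"


def find_noise_packet(data: bytes) -> Optional[bytes]:
    """Return the payload between the first start sync and the next
    end sync, or None if the data holds no complete packet."""
    start = data.find(START_SYNC)
    if start < 0:
        return None
    body_start = start + len(START_SYNC)
    end = data.find(END_SYNC, body_start)
    if end < 0:
        return None
    return data[body_start:end]


if __name__ == "__main__":
    stream = b"\x01\x02" + START_SYNC + b"noise-params" + END_SYNC + b"\x03"
    print(find_noise_packet(stream))
```

This sketch does no escaping, so a payload that happens to contain the end sync bytes would be cut short. A real channel format would need to rule that out.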

15. A distributed speech recognition system, comprising:
- a first processing device, including:
  
  a transform module to receive input speech,
  
  a noise floor estimator to provide a noise floor estimate for the input speech,
  
  a feature extractor to provide a parametric representation of the noise floor estimate and the input speech, and
  
  a front-end controller to select one of a first power mode, a second power mode, and a third power mode to determine an amount of power to be drawn from a power source, and to determine when the noise floor estimator provides a noise floor estimate based at least in part on the selected power mode;
  
  a transmitter to transmit the parametric representation of the noise floor estimate and the input speech;
  
  a receiver to receive the parametric representation of the noise floor estimate and the input speech from the transmitter; and
  
  a second processing device, including:
  
  a noise model generator to create a statistical noise model based on the parametric representation of the noise floor estimate, and
  
  a speech recognizer to recognize the input speech based on acoustic models, the acoustic models adapted based at least in part on the statistical noise model.
- 16. The system according to claim 15, wherein the transmitter and the first processing device form a single device.
- 17. The system according to claim 15, wherein the receiver and the second processing device form a single device.
- 18. The system according to claim 15, wherein the first processing device comprises a handheld computer.
- 19. The system according to claim 15, wherein the second processing device comprises a server computer.
- 20. The system according to claim 15, wherein the first processing device further comprises an encoder to compress the parametric representation of the noise floor estimate and the input speech and to generate an encoded representation thereof before the transmitter transmits the parametric representation of the noise floor estimate and the input speech to the receiver.
- 21. The system according to claim 20, wherein the second processing device further comprises a decoder to decompress the encoded parametric representation of the noise floor estimate and the input speech and to generate a decoded representation thereof.
- 22. The system according to claim 21, wherein the second processing device further comprises a speech/noise de-multiplexer to receive data from the decoder and to determine whether the received data represents noise.
- 23. The system according to claim 21, wherein the decoder is adapted to decode a packet having a start sync sequence and an end sync sequence, the packet including the encoded parametric representation of the noise floor estimate.
- 24. The system according to claim 15, wherein the noise floor estimator is selectively coupled between a transform module and an analysis module, the transform module filtering an input signal, and the analysis module performing a data reduction transform.
- 25. The system according to claim 15, wherein the second processing device further comprises an acoustic model adapter to adapt the acoustic models using the statistical noise model.
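
The noise model generator of claim 15 builds a statistical noise model, and claims 7 and 14 describe that model as a mean and a variance of a Mel-cepstrum vector. One way this could look is sketched below, under stated assumptions: the function name `noise_model` is hypothetical, the variance is taken per dimension (diagonal), and it is the population variance (divide by N). The claims do not specify any of these choices, and the adaptation of the acoustic models is not shown.

```python
from typing import List, Sequence, Tuple


def noise_model(
    vectors: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return (means, variances) per cepstral dimension for a set of
    decoded noise feature vectors of equal length."""
    if not vectors:
        raise ValueError("at least one noise feature vector is required")
    n = len(vectors)
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must have the same length")
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Population variance (divide by n); an assumption of this sketch.
    variances = [
        sum((v[d] - means[d]) ** 2 for v in vectors) / n for d in range(dim)
    ]
    return means, variances


if __name__ == "__main__":
    # Two toy 2-dimensional vectors standing in for Mel-cepstrum vectors.
    means, variances = noise_model([[1.0, 2.0], [3.0, 6.0]])
    print(means, variances)  # [2.0, 4.0] [1.0, 4.0]
```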

Current Assignee: Intel Corporation
Original Assignee: Intel Corporation
Inventors: Morris, Robert W; Deisher, Michael E
Primary Examiner(s): Storm; Donald L.
Application Number: US10/185,576
Publication Number: US 20040002860A1
Time in Patent Office: 1,677 Days
Field of Search: None
US Class Current: 704/228
CPC Class Codes:
G10L 21/0208 Noise filtering
G10L 25/48 specially adapted for parti...
