Dynamic speech recognition pattern switching for enhanced speech recognition accuracy

US 6,631,348 B1
Filed: 08/08/2000
Issued: 10/07/2003
Est. Priority Date: 08/08/2000
Status: Expired due to Fees

Abstract

A speech recognition system and method that dynamically switches between reference patterns based on training information produced under different ambient noise levels to enhance speech recognition accuracy is presented herein. In accordance with an embodiment of the invention, the speech recognition system includes a speech capturing device configured to capture an input utterance and a speech recognition processing mechanism configured to process the input utterance and to generate an identified utterance signal representing a recognized utterance. The system further includes a sensor configured to detect a plurality of ambient noise levels and to supply a detected ambient noise level to the speech recognition processing mechanism, and a speech model containing a plurality of stored reference pattern sets representing utterances to be recognized. Each of the stored reference pattern sets is based on training information corresponding to a particular ambient noise level. As such, in response to receiving the input utterance and the detected ambient noise level, the speech recognition processing mechanism switches to the stored reference pattern set corresponding to the detected ambient noise level and determines a recognized utterance by comparing the input utterance to the utterances contained in the corresponding stored reference pattern set. The speech recognition processing mechanism then generates a corresponding identified utterance signal, indicating a recognized utterance, which is applied to related applications to execute predetermined tasks.
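
The switching behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `NoiseSwitchedRecognizer`, the decibel band edges, the dictionary layout of a pattern set, and the nearest-pattern comparison by squared Euclidean distance are all assumptions made for illustration.

```python
from bisect import bisect_right


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NoiseSwitchedRecognizer:
    """Hypothetical recognizer that picks a reference pattern set by noise level."""

    def __init__(self, band_edges_db, pattern_sets):
        # band_edges_db: ascending noise levels that separate adjacent bands
        # (illustrative values, not taken from the patent).
        # pattern_sets: one {word: feature_vector} dict per band.
        if len(pattern_sets) != len(band_edges_db) + 1:
            raise ValueError("need exactly one pattern set per noise band")
        self.band_edges_db = list(band_edges_db)
        self.pattern_sets = list(pattern_sets)

    def select_set(self, noise_db):
        # Switch to the set trained for the band containing the detected level.
        return self.pattern_sets[bisect_right(self.band_edges_db, noise_db)]

    def recognize(self, features, noise_db):
        # Compare the input only against patterns from the selected set.
        patterns = self.select_set(noise_db)
        return min(patterns, key=lambda word: squared_distance(features, patterns[word]))
```

With a single band edge at 60 dB and two pattern sets, a call such as `recognize(features, 80.0)` would consult only the set trained under the louder condition.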

25 Claims

1. A speech recognition system comprising:
- a speech capturing device configured to capture an input utterance;
  
  a speech recognition processing mechanism configured to process said input utterance captured by said speech capturing device and to generate an identified utterance signal representing a recognized utterance;
  
  a sensor configured to detect a plurality of ambient noise levels and to supply a detected ambient noise level to said speech recognition processing mechanism; and
  
  a speech model containing a plurality of stored reference pattern sets representing utterances to be recognized, each of said stored reference pattern sets based on training information corresponding to a select one of said ambient noise levels;
  
  wherein, in response to receiving said input utterance and said detected ambient noise level, said speech recognition processing mechanism switches to a stored reference pattern set corresponding to said detected ambient noise level, determines a recognized utterance based on said corresponding stored reference pattern set, and generates a corresponding identified utterance signal.
- 2. The speech recognition system of claim 1, wherein said training information is generated by articulating a plurality of training utterances for each of said ambient noise levels during a training mode.
- 3. The speech recognition system of claim 2, wherein each of said stored reference pattern sets are constructed by applying at least one of statistical pattern matching techniques and word representation models based on said training information for each of said ambient noise levels.
- 4. The speech recognition system of claim 3, wherein said speech recognition processing mechanism determines said recognized utterance by,
- 5. The speech recognition system of claim 4, further including an application configured to receive said identified utterance signal and to execute a predetermined task based on said identified utterance signal.
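
The training mode recited in claims 2 and 3 can be sketched as follows. This is an illustrative assumption rather than the patent's method: per-word averaging of feature vectors stands in for the "statistical pattern matching techniques" the claims mention, and the function name `build_reference_sets` and the tuple layout of the training samples are hypothetical.

```python
from collections import defaultdict


def build_reference_sets(training_samples):
    """Build one reference pattern set per ambient noise level.

    training_samples: iterable of (noise_level, word, feature_vector) tuples,
    one per training utterance (hypothetical layout).
    Returns {noise_level: {word: mean_feature_vector}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for noise_level, word, vector in training_samples:
        grouped[noise_level][word].append(vector)

    reference_sets = {}
    for noise_level, words in grouped.items():
        reference_sets[noise_level] = {
            # Average each feature dimension across the word's training utterances.
            word: [sum(column) / len(vectors) for column in zip(*vectors)]
            for word, vectors in words.items()
        }
    return reference_sets
```

Each noise level ends up with its own set of word templates, so the same word may be represented differently depending on the conditions under which it was trained.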

6. A speech recognition method comprising:
- capturing an input utterance and supplying said input utterance to a speech recognition processing mechanism;
  
  detecting an ambient noise level and supplying said detected ambient noise level to said speech recognition processing mechanism; and
  
  constructing a plurality of stored reference pattern sets representing utterances to be recognized, each of said stored reference pattern sets based on training information corresponding to a particular ambient noise level, wherein, in response to receiving said input utterance and said detected ambient noise level, said speech recognition processing mechanism switches to a stored reference pattern set corresponding to said detected ambient noise level, determines a recognized utterance based on said corresponding stored reference pattern set, and generates a corresponding identified utterance signal.
- 7. The speech recognition method of claim 6, wherein said training information is generated by articulating a plurality of training utterances for each of said ambient noise levels during a training mode.
- 8. The speech recognition method of claim 7, wherein each of said stored reference pattern sets are constructed by applying at least one of statistical pattern matching techniques and word representation models based on said training information for each of said ambient noise levels.
- 9. The speech recognition method of claim 8, wherein said speech recognition processing mechanism determines said recognized utterance by,
- 10. The speech recognition method of claim 9, further including, executing, by an application, a predetermined task based on said identified utterance signal received from said speech recognition processing mechanism.
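
The detecting step of claim 6 supplies an ambient noise level to the recognizer. The sketch below shows one way such a level might be computed and mapped to a discrete band. It assumes the sensor exposes raw audio samples normalized to a full scale of 1.0, which the claims do not state; `rms_dbfs`, `quantize_level`, and the boundary values are hypothetical names and numbers.

```python
import math


def rms_dbfs(samples, full_scale=1.0):
    """Root-mean-square level of the samples in decibels relative to full scale."""
    if not samples:
        raise ValueError("samples must not be empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(rms / full_scale)


def quantize_level(level_db, boundaries_db):
    """Return the index of the noise band containing level_db.

    boundaries_db: noise levels separating adjacent bands (assumed values).
    Band 0 lies below every boundary; each boundary reached raises the index by one.
    """
    return sum(1 for boundary in boundaries_db if level_db >= boundary)
```

The resulting band index could then serve as the key for choosing among stored reference pattern sets.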

11. A computer-readable medium encoded with a plurality of processor-executable instruction sequences for:
- capturing an input utterance and supplying said input utterance to a speech recognition processing mechanism;
  
  detecting an ambient noise level and supplying said detected ambient noise level to said speech recognition processing mechanism; and
  
  constructing a plurality of stored reference pattern sets representing utterances to be recognized, each of said stored reference pattern sets based on training information corresponding to a particular ambient noise level, wherein, in response to receiving said input utterance and said detected ambient noise level, said speech recognition processing mechanism switches to a stored reference pattern set corresponding to said detected ambient noise level, determines a recognized utterance based on said corresponding stored reference pattern set, and generates a corresponding identified utterance signal.
- 12. The computer-readable medium of claim 11, wherein said training information is generated by articulating a plurality of training utterances for each of said ambient noise levels during a training mode.
- 13. The computer-readable medium of claim 12, wherein each of said stored reference pattern sets are constructed by applying at least one of statistical pattern matching techniques and word representation models based on said training information for each of said ambient noise levels.
- 14. The computer-readable medium of claim 13, wherein said speech recognition processing mechanism determines said recognized utterance by,
- 15. The computer-readable medium of claim 14, further including, executing, by an application, a predetermined task based on said recognized utterance signal received from said speech recognition processing mechanism.

16. A speech recognition system comprising:
- a speech capturing device configured to capture an input utterance;
  
  a speech recognition processing mechanism configured to digitize said input utterance captured by said speech capturing device, to assemble said digitized input utterance into frames, to extract acoustical information from said frames, and to generate an identified utterance signal representing a recognized utterance;
  
  a sensor configured to detect a plurality of ambient noise levels and to supply a detected ambient noise level to said speech recognition processing mechanism; and
  
  a speech model containing a plurality of stored reference pattern sets representing utterances to be recognized, each of said stored reference pattern sets based on training information corresponding to a select one of said ambient noise levels;
  
  wherein, in response to receiving said input utterance and said detected ambient noise level, said speech recognition processing mechanism switches to a stored reference pattern set corresponding to said detected ambient noise level, determines a recognized utterance based on said corresponding stored reference pattern set, and generates a corresponding identified utterance signal.
- 17. The speech recognition system of claim 16, wherein said training information is generated by articulating a plurality of training utterances for each of said ambient noise levels during a training mode.
- 18. The speech recognition system of claim 17, wherein each of said stored reference pattern sets are constructed by applying at least one of statistical pattern matching techniques and word representation models based on said training information for each of said ambient noise levels.
- 19. The speech recognition system of claim 18, wherein said speech recognition processing mechanism determines said recognized utterance by,
- 20. The speech recognition system of claim 19, further including an application configured to receive said identified utterance signal and to execute a predetermined task based on said identified utterance signal.
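
Claim 16 recites assembling the digitized utterance into frames and extracting acoustical information from them. A minimal sketch of those two stages follows. The frame length, hop size, and the choice of mean energy and zero-crossing count as the extracted features are assumptions made for illustration; the claims do not specify which acoustical information is used.

```python
def frame_signal(samples, frame_len, hop):
    """Split digitized samples into fixed-length frames, advancing by hop samples.

    Trailing samples that do not fill a whole frame are dropped.
    """
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def frame_features(frame):
    """Return (mean energy, zero-crossing count) for one frame.

    Both features are illustrative stand-ins for the acoustical information.
    """
    energy = sum(s * s for s in frame) / len(frame)
    zero_crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )
    return energy, zero_crossings


def extract_features(samples, frame_len, hop):
    """Frame the signal and compute features for every frame."""
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]
```

The per-frame feature tuples produced here are the kind of input a pattern-matching stage would compare against a stored reference pattern set.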

21. A speech recognition method comprising:
- capturing an input utterance and supplying said input utterance to a speech recognition processing mechanism;
  
  digitizing said input utterance by said speech recognition processing mechanism;
  
  assembling said digitized input utterance into frames by said speech recognition processing mechanism;
  
  extracting acoustical information from said frames by said speech recognition processing mechanism;
  
  detecting an ambient noise level and supplying said detected ambient noise level to said speech recognition processing mechanism; and
  
  constructing a plurality of stored reference pattern sets representing utterances to be recognized, each of said stored reference pattern sets based on training information corresponding to a particular ambient noise level, wherein, in response to receiving said input utterance and said detected ambient noise level, said speech recognition processing mechanism switches to a stored reference pattern set corresponding to said detected ambient noise level, determines a recognized utterance based on said corresponding stored reference pattern set, and generates a corresponding identified utterance signal.
- 22. The speech recognition method of claim 21, wherein said training information is generated by articulating a plurality of training utterances for each of said ambient noise levels during a training mode.
- 23. The speech recognition method of claim 22, wherein each of said stored reference pattern sets are constructed by applying at least one of statistical pattern matching techniques and word representation models based on said training information for each of said ambient noise levels.
- 24. The speech recognition method of claim 23, wherein said speech recognition processing mechanism determines said recognized utterance by,
- 25. The speech recognition method of claim 21, further including, executing, by an application, a predetermined task based on said identified utterance signal received from said speech recognition processing mechanism.

Current Assignee: Intel Corporation
Original Assignee: Intel Corporation
Inventors: Wymore, Ben S.
Primary Examiner(s): Knepper, David D.
Application Number: US09/634,843
Time in Patent Office: 1,155 Days
Field of Search: 704/226-228, 704/231, 704/233, 704/250
US Class Current: 704/233
CPC Class Codes:
G10L 15/20 Speech recognition techniqu...
G10L 21/0216 characterised by the method...
