System and method for modeless large vocabulary speech recognition

US 6,292,779 B1
Filed: 03/09/1999
Issued: 09/18/2001
Est. Priority Date: 03/09/1998
Status: Expired due to Term

Abstract

A modeless large vocabulary continuous speech recognition system is provided that represents an input utterance as a sequence of input vectors. The system includes a common library of acoustic model states for arrangement in sequences that form acoustic models. Each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states. An input processor compares each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set, reflecting the likelihood that a state is represented by a vector. The system also includes a plurality of recognition modules and associated recognition grammars. The recognition modules operate in parallel and use the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules. The recognition modules include a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result. An arbitrator uses an arbitration algorithm and a score ordered queue of recognition results, together with their associated recognition modules, to compare the recognition results of the recognition modules to select at least one system recognition result.
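
The data flow described in the abstract (score each input vector against the common state library once, then let several recognition modules consume the same scores in parallel) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: the toy `STATE_LIBRARY`, the negative squared distance used as a match score, the one-state-per-vector alignment, and the use of a thread pool are not taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical common library: state id -> mean vector (toy 2-D values).
STATE_LIBRARY = {"s0": (0.0, 0.0), "s1": (1.0, 1.0), "s2": (2.0, 0.0)}


def match_scores(vectors, library):
    """Score every input vector against every library state exactly once.

    The score is a negative squared Euclidean distance (an assumption),
    so a higher score means a closer match. Returns one dict per vector.
    """
    table = []
    for vec in vectors:
        row = {}
        for state_id, mean in library.items():
            row[state_id] = -sum((x - m) ** 2 for x, m in zip(vec, mean))
        table.append(row)
    return table


def make_module(name, models):
    """Build a toy recognition module over {label: [state ids]} models."""
    def recognize(table):
        best_label, best_score = None, float("-inf")
        for label, states in models.items():
            if len(states) != len(table):
                continue  # toy alignment: one state per input vector
            score = sum(row[s] for row, s in zip(table, states))
            if score > best_score:
                best_label, best_score = label, score
        return name, best_label, best_score
    return recognize


# Two hypothetical modules whose models reuse the same library states.
modules = [
    make_module("dictation", {"hello": ["s0", "s1"], "world": ["s1", "s2"]}),
    make_module("command", {"delete": ["s2", "s0"], "bold": ["s0", "s2"]}),
]

vectors = [(0.1, 0.0), (0.9, 1.1)]
table = match_scores(vectors, STATE_LIBRARY)  # computed once, shared by all

# Each module reads the shared score table; none recomputes state scores.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda module: module(table), modules))
```

In this sketch the cost of comparing vectors to states is paid once per utterance, regardless of how many modules run, which is the apparent motivation for a common library.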

24 Claims

1. A method for operating a modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the method comprising:
- (a) providing, for speech processing in the speech recognition system, a common library of acoustic model states for arrangement in sequences that form acoustic models;
  
  (b) comparing, for speech processing in the speech recognition system, each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector; and
  
  (c) using, for speech processing in the speech recognition system, in a plurality of recognition modules operating in parallel, the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules.
2. A method according to claim 1, wherein each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states.
3. A method according to claim 1, wherein the match score is a probability calculation or a distance measure calculation.
4. A method according to claim 1, wherein each recognition module includes a recognition grammar used with the acoustic models to determine the at least one recognition result.
5. A method according to claim 4, wherein the recognition grammar is a context-free grammar, a natural language grammar, or a dynamic command grammar.
6. A method according to claim 1, further including:
7. A method according to claim 6, wherein the step of comparing uses an arbitration algorithm and a score ordered queue of recognition results and associated recognition modules.
8. A method according to claim 1, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result.
9. A method according to claim 1, wherein the plurality of recognition modules includes a command module for producing at least one probable command recognition result.
10. A method according to claim 1, wherein the plurality of recognition modules includes a select module for recognizing a portion of visually displayed text for processing with a command.
11. A method according to claim 1, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result.
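
Claim 3 allows the match score to be either a probability calculation or a distance measure calculation. As a rough illustration, the two flavors might look like the following Python sketch. The specific choices (Euclidean distance, and a diagonal-covariance Gaussian log density for the probability case) are assumptions; the claim text does not name a particular formula, and the function names are hypothetical.

```python
import math


def distance_score(vec, mean):
    """Euclidean distance from an input vector to a state's mean.

    Lower is better for a distance-style match score.
    """
    return math.sqrt(sum((x - m) ** 2 for x, m in zip(vec, mean)))


def log_probability_score(vec, mean, var):
    """Log density of a diagonal-covariance Gaussian state model.

    Higher is better for a probability-style match score. `var` holds
    the per-dimension variances and must be strictly positive.
    """
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(vec, mean, var)
    )
```

Note that the two conventions rank in opposite directions, so any downstream comparison has to know which kind of score it is handling.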

12. A method for operating a modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the method comprising:
- (a) providing a common library of acoustic model states for arrangement in sequences that form acoustic models, wherein each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states;
  
  (b) comparing each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector;
  
  (c) in a plurality of recognition modules operating in parallel, each having an associated recognition grammar, using the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result; and
  
  (d) comparing the recognition results of the recognition modules with an arbitration algorithm and a score ordered queue of recognition results and associated recognition modules to select at least one system recognition result.
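
Step (d) compares the per-module results using an arbitration algorithm and a score ordered queue. The claim does not spell out the algorithm, so the following is only a sketch of one plausible shape: a priority queue keyed on score, with each entry carrying its originating module. The `module_bias` parameter and the "higher score wins" convention are assumptions introduced for illustration.

```python
import heapq


def arbitrate(results, module_bias=None):
    """Select one system result from per-module recognition results.

    results: iterable of (module_name, text, score); higher is better.
    module_bias: hypothetical per-module score offsets (an assumption,
    not something the claim text specifies).
    Returns (module_name, text, adjusted_score), or None if empty.
    """
    module_bias = module_bias or {}
    queue = []
    for order, (module, text, score) in enumerate(results):
        adjusted = score + module_bias.get(module, 0.0)
        # heapq is a min-heap, so the score is negated to pop the best
        # entry first; `order` keeps ties in arrival order.
        heapq.heappush(queue, (-adjusted, order, module, text))
    if not queue:
        return None
    neg_score, _, module, text = heapq.heappop(queue)
    return module, text, -neg_score
```

Because each queue entry keeps its module name, the caller can tell whether the winning hypothesis should be inserted as dictated text, treated as a selection, or executed as a command.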

13. A modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the system comprising:
- a common library of acoustic model states for arrangement in sequences that form acoustic models for speech processing in the speech recognition system;
  
  an input processor, for speech processing in the speech recognition system, that compares each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector; and
  
  a plurality of recognition modules operating in parallel, for speech processing in the speech recognition system, that use the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules.
14. A system according to claim 13, wherein each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states.
15. A system according to claim 13, wherein the match score is a probability calculation or a distance measure calculation.
16. A system according to claim 13, wherein each recognition module includes a recognition grammar used with the acoustic models to determine the at least one recognition result.
17. A system according to claim 16, wherein the recognition grammar is a context-free grammar, a natural language grammar, or a dynamic command grammar.
18. A system according to claim 13, further including:
19. A system according to claim 18, wherein the arbitrator includes an arbitration algorithm and a score ordered queue of recognition results and associated recognition modules.
20. A system according to claim 13, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result.
21. A system according to claim 13, wherein the plurality of recognition modules includes a command module for producing at least one probable command recognition result.
22. A system according to claim 13, wherein the plurality of recognition modules includes a select module for recognizing a portion of visually displayed text for processing with a command.
23. A system according to claim 13, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result.
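
Claim 14 describes a two-level composition: an acoustic model is a sequence of segment models, and a segment model is a sequence of model states drawn from the common library. One way to picture that hierarchy is the Python sketch below. The class names, the state ids, and the phoneme-like segment names are all hypothetical; the claims do not say what a segment corresponds to.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SegmentModel:
    """A sequence of state ids that index into the common state library."""
    name: str
    state_ids: Tuple[str, ...]


@dataclass(frozen=True)
class AcousticModel:
    """A sequence of segment models forming one recognizable unit."""
    label: str
    segments: Tuple[SegmentModel, ...]

    def state_sequence(self) -> List[str]:
        """Flatten the model into the ordered list of library state ids."""
        return [sid for seg in self.segments for sid in seg.state_ids]


# Hypothetical segments; "ae" is reused by both models below, so its
# states only need to be scored once per input vector.
k = SegmentModel("k", ("k_1", "k_2"))
ae = SegmentModel("ae", ("ae_1", "ae_2", "ae_3"))
t = SegmentModel("t", ("t_1",))
b = SegmentModel("b", ("b_1", "b_2"))

cat = AcousticModel("cat", (k, ae, t))
bat = AcousticModel("bat", (b, ae, t))

shared_states = set(cat.state_sequence()) & set(bat.state_sequence())
```

The overlap in `shared_states` is what makes a common library worthwhile: models in different recognition modules can reference the same states rather than holding private copies.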

24. A modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the system comprising:
- a common library of acoustic model states for arrangement in sequences that form acoustic models, wherein each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states;
  
  an input processor that compares each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector;
  
  a plurality of recognition modules and associated recognition grammars, the modules operating in parallel and using the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result; and
  
  an arbitrator that uses an arbitration algorithm and a score ordered queue of recognition results together with their associated recognition modules to compare the recognition results of the recognition modules to select at least one system recognition result.
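
The select module is described as recognizing a portion of visually displayed text for processing with a command. The claims do not say how candidate portions are formed, so the following Python sketch simply assumes that every contiguous run of displayed words (up to a hypothetical `max_words` limit) is a candidate, and that a recognized phrase is matched against those candidates case-insensitively. Both function names are invented for this illustration.

```python
def displayed_text_spans(text, max_words=4):
    """List every contiguous word span of the displayed text.

    Returns (start, end, phrase) tuples, where start and end are word
    indices with end exclusive. max_words is a hypothetical cap.
    """
    words = text.split()
    spans = []
    for start in range(len(words)):
        last = min(start + max_words, len(words))
        for end in range(start + 1, last + 1):
            spans.append((start, end, " ".join(words[start:end])))
    return spans


def find_selection(text, spoken, max_words=4):
    """Return the word-index range of the first span matching `spoken`.

    Matching is a plain case-insensitive string comparison (an
    assumption); returns None when no displayed span matches.
    """
    target = spoken.lower()
    for start, end, phrase in displayed_text_spans(text, max_words):
        if phrase.lower() == target:
            return start, end
    return None
```

For example, with the displayed text "the quick brown fox", the spoken phrase "quick brown" maps to word indices 1 through 2, which a command could then act on.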

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Lernout & Hauspie Speech Products NV (Intel Corporation)
Inventors: Grabherr, Manfred; Wilson, Brian; Ganong, William F. III; Sarukkai, Ramesh
Primary Examiner(s): Dorvil, Richemond
Application Number: US09/267,925
Time in Patent Office: 924 Days
Field of Search: 704/200, 704/238, 704/231, 704/256, 704/257, 704/255, 704/251, 704/239, 704/240
US Class Current: 704/257
CPC Class Codes:
- G10L 15/193 Formal grammars, e.g. finit...
- G10L 15/26 Speech to text systems G10L...
- G10L 2015/223 Execution procedure of a sp...
- G10L 2015/228 of application context
