System and method for modeless large vocabulary speech recognition
Abstract
A modeless large vocabulary continuous speech recognition system is provided that represents an input utterance as a sequence of input vectors. The system includes a common library of acoustic model states for arrangement in sequences that form acoustic models. Each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states. An input processor compares each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set, reflecting the likelihood that a state is represented by a vector. The system also includes a plurality of recognition modules and associated recognition grammars. The recognition modules operate in parallel and use the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules. The recognition modules include a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result. An arbitrator uses an arbitration algorithm and a score ordered queue of recognition results, together with their associated recognition modules, to compare the recognition results of the recognition modules to select at least one system recognition result.
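The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the two-dimensional state means, the use of negative squared distance as a stand-in for a likelihood-based match score, the one-state-per-vector alignment, and every name (`STATE_LIBRARY`, `match_scores`, `make_module`, `recognize_utterance`) are assumptions made for illustration. What it shows is that match scores are computed once per input vector against the common library and then shared by several recognition modules running in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical common library: state id -> mean vector (toy 2-D values).
STATE_LIBRARY = {"s0": (0.0, 0.0), "s1": (1.0, 1.0), "s2": (2.0, 0.0)}

def match_scores(vector, library):
    """Score one input vector against every state; higher means a closer match.
    Negative squared distance stands in for a real log-likelihood (assumption)."""
    return {sid: -sum((v - m) ** 2 for v, m in zip(vector, mean))
            for sid, mean in library.items()}

def make_module(name, models):
    """Build a toy recognition module. `models` maps a result string to a
    state sequence drawn from the common library (one state per input vector)."""
    def recognize(score_table):
        # Sum the shared per-vector scores along each model's state sequence.
        best_score, best_text = max(
            (sum(frame[s] for frame, s in zip(score_table, seq)), text)
            for text, seq in models.items()
        )
        return name, best_text, best_score
    return recognize

def recognize_utterance(vectors, modules):
    # Scores are computed once per input vector and shared by all modules.
    score_table = [match_scores(v, STATE_LIBRARY) for v in vectors]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda module: module(score_table), modules))

dictation = make_module("dictation", {"hello": ["s0", "s1"], "yellow": ["s2", "s1"]})
command = make_module("command", {"delete": ["s1", "s2"], "undo": ["s0", "s0"]})
results = recognize_utterance([(0.1, 0.0), (0.9, 1.0)], [dictation, command])
```

Each module returns its own best result, so `results` holds one entry per module; choosing among them is left to a separate arbitration step.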
24 Claims
1. A method for operating a modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the method comprising:
(a) providing, for speech processing in the speech recognition system, a common library of acoustic model states for arrangement in sequences that form acoustic models;
(b) comparing, for speech processing in the speech recognition system, each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector; and
(c) using, for speech processing in the speech recognition system, in a plurality of recognition modules operating in parallel, the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules.
(d) comparing the recognition results of the recognition modules to select at least one system recognition result.
-
7. A method according to claim 6, wherein the step of comparing uses an arbitration algorithm and a score ordered queue of recognition results and associated recognition modules.
-
8. A method according to claim 1, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result.
-
9. A method according to claim 1, wherein the plurality of recognition modules includes a command module for producing at least one probable command recognition result.
-
10. A method according to claim 1, wherein the plurality of recognition modules includes a select module for recognizing a portion of visually displayed text for processing with a command.
-
11. A method according to claim 1, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result.
-
12. A method for operating a modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the method comprising:
(a) providing a common library of acoustic model states for arrangement in sequences that form acoustic models, wherein each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states;
(b) comparing each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector;
(c) in a plurality of recognition modules operating in parallel, each having an associated recognition grammar, using the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result; and
(d) comparing the recognition results of the recognition modules with an arbitration algorithm and a score ordered queue of recognition results and associated recognition modules to select at least one system recognition result.
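The model hierarchy recited in element (a), in which acoustic models are sequences of segment models and segment models are sequences of states from the common library, might be represented as in the following Python sketch. The class names, segment names, and state ids are hypothetical and chosen only to show how two acoustic models can reuse the same library states.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SegmentModel:
    name: str
    state_ids: tuple  # ordered ids into the common state library

@dataclass(frozen=True)
class AcousticModel:
    label: str
    segments: tuple  # ordered SegmentModel instances

    def state_sequence(self):
        """Flatten the segments into the full ordered list of state ids."""
        return [sid for seg in self.segments for sid in seg.state_ids]

# Hypothetical segments; names and state ids are illustrative only.
seg_k = SegmentModel("k", ("s10", "s11"))
seg_ae = SegmentModel("ae", ("s20", "s21", "s22"))
seg_t = SegmentModel("t", ("s30",))
seg_p = SegmentModel("p", ("s40",))

cat = AcousticModel("cat", (seg_k, seg_ae, seg_t))
cap = AcousticModel("cap", (seg_k, seg_ae, seg_p))

# States that both models draw from the common library.
shared = set(cat.state_sequence()) & set(cap.state_sequence())
```

Because both models reference the same state ids, a match score computed for one of those states can serve every model, and every recognition module, that uses it.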
-
13. A modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the system comprising:
a common library of acoustic model states for arrangement in sequences that form acoustic models for speech processing in the speech recognition system;
an input processor, for speech processing in the speech recognition system, that compares each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector; and
a plurality of recognition modules operating in parallel, for speech processing in the speech recognition system, that use the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules.
(d) an arbitrator that compares the recognition results of the recognition modules to select at least one system recognition result.
-
19. A system according to claim 18, wherein the arbitrator includes an arbitration algorithm and a score ordered queue of recognition results and associated recognition modules.
-
20. A system according to claim 13, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result.
-
21. A system according to claim 13, wherein the plurality of recognition modules includes a command module for producing at least one probable command recognition result.
-
22. A system according to claim 13, wherein the plurality of recognition modules includes a select module for recognizing a portion of visually displayed text for processing with a command.
-
23. A system according to claim 13, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result.
-
24. A modeless large vocabulary continuous speech recognition system that represents an input utterance as a sequence of input vectors, the system comprising:
a common library of acoustic model states for arrangement in sequences that form acoustic models, wherein each acoustic model is composed of a sequence of segment models and each segment model is composed of a sequence of model states;
an input processor that compares each vector in a sequence of input vectors to a set of model states in the common library to produce a match score for each model state in the set reflecting the likelihood that such state is represented by such vector;
a plurality of recognition modules and associated recognition grammars, the modules operating in parallel and using the match scores with the acoustic models to determine at least one recognition result in each of the recognition modules, wherein the plurality of recognition modules includes a dictation module for producing at least one probable dictation recognition result, a select module for recognizing a portion of visually displayed text for processing with a command, and a command module for producing at least one probable command recognition result; and
an arbitrator that uses an arbitration algorithm and a score ordered queue of recognition results together with their associated recognition modules to compare the recognition results of the recognition modules to select at least one system recognition result.
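One way to picture the arbitrator's score ordered queue is the Python sketch below, which uses the standard `heapq` module. The claims do not spell out the arbitration algorithm, so the policy here (simply take the highest-scoring result, with arrival order as a tie-breaker) is an assumption, as are the `Arbitrator` class, its method names, and the example scores.

```python
import heapq
from itertools import count

class Arbitrator:
    """Toy arbitrator: keeps results from all modules in one score ordered queue."""

    def __init__(self):
        self._queue = []
        self._order = count()  # tie-breaker so equal scores pop in arrival order

    def submit(self, module, text, score):
        # heapq is a min-heap, so the score is negated to pop the best first.
        heapq.heappush(self._queue, (-score, next(self._order), module, text))

    def select(self, n=1):
        """Return up to n (module, text, score) results, best score first."""
        picked = []
        while self._queue and len(picked) < n:
            neg_score, _, module, text = heapq.heappop(self._queue)
            picked.append((module, text, -neg_score))
        return picked

arb = Arbitrator()
arb.submit("dictation", "delete that", -4.5)
arb.submit("command", "delete that", -2.0)
arb.submit("select", "that", -3.0)
best = arb.select()
```

Each queue entry keeps the originating module alongside the result, so the selected entry tells the system whether to treat the utterance as dictation, a selection, or a command.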
-
Specification