Speech recognition system distinguishing dictation from commands by arbitration between continuous speech and isolated word modules
Abstract
In the speech recognition system disclosed herein, an input utterance is submitted to both a large vocabulary isolated word speech recognition module and a small vocabulary continuous speech recognition module. The small vocabulary contains command words which can be combined in sequences to define commands to an application program. The two recognition modules generate respective scores for identified large vocabulary models and for sequences of small vocabulary models. The score provided by the continuous speech recognizer is normalized on the basis of the length of the speech input utterance and an arbitration algorithm selects among the candidates identified by the recognition modules. Without requiring the user to switch modes, text is output if a score from the isolated word recognizer is selected and a command is output if a score from the continuous speech recognizer is selected.
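The arbitration flow described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say whether higher or lower scores are better (lower is assumed here), it says only that the continuous score is normalized "on the basis of the length" of the utterance (division by a frame count is assumed), and the isolated word scores are assumed to already be on a comparable scale. All names (`Candidate`, `normalize_by_length`, `arbitrate`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    source: str   # "isolated" (text word) or "continuous" (command sequence)
    label: str    # recognized word, or the recognized command word sequence
    score: float  # assumption: a lower score means a better match


def normalize_by_length(raw_score: float, num_frames: int) -> float:
    """Assumed normalization: divide the raw score by the utterance length in frames."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return raw_score / num_frames


def arbitrate(
    word_hyps: List[Tuple[str, float]],
    command_hyp: Tuple[str, float],
    num_frames: int,
) -> Tuple[str, str]:
    """Pick the best candidate across both recognizers.

    Returns ("text", word) if an isolated word candidate wins,
    or ("command", sequence) if the continuous candidate wins.
    """
    candidates = [Candidate("isolated", word, score) for word, score in word_hyps]
    cmd_label, cmd_raw = command_hyp
    candidates.append(
        Candidate("continuous", cmd_label, normalize_by_length(cmd_raw, num_frames))
    )
    # min() keeps the first of equal scores, so ties go to the isolated word candidate.
    best = min(candidates, key=lambda c: c.score)
    kind = "text" if best.source == "isolated" else "command"
    return kind, best.label


# Hypothetical scores: the command's raw score of 380.0 over 100 frames
# normalizes to 3.8, which beats the best word score of 4.2.
result = arbitrate([("right", 4.2), ("write", 4.5)], ("move cursor right", 380.0), 100)
```

Because the caller receives a tagged result rather than switching recognizers up front, the user never has to select a dictation or command mode explicitly, which is the behavior the abstract emphasizes.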
10 Claims
1. A speech recognition system which separately outputs text and commands comprising:
an isolated word speech recognizer;
accessible by said isolated word speech recognizer, a first vocabulary of respective text word models, said isolated word speech recognizer operating to compare speech input with at least a selected portion of said first vocabulary and to provide a plurality of scores indicating the degree of match of said speech input with an identified one or more of said models;
a continuous speech recognizer;
accessible by said continuous speech recognizer, a second vocabulary of respective command word models, said continuous speech recognizer operating to compare speech input to said second vocabulary of command word models and to provide a score indicating the degree of match of said speech input with at least one identified sequence of the respective models;
an arbitration algorithm for selecting from among the models identified by said isolated word speech recognizer and the sequence of models identified by said continuous speech recognizer and for outputting corresponding text if a score from said isolated word recognizer is selected and outputting a respective command if a score from said continuous speech recognizer is selected.
Dependent claims: 2, 3, 4, 5, 6
7. A speech recognition system which separately outputs text and commands comprising:
an isolated word speech recognizer;
accessible by said isolated word speech recognizer, a first vocabulary of respective text word models numbering in excess of 5000, said isolated word speech recognizer operating to compare speech input with at least a selected portion of said first vocabulary and to provide a plurality of scores indicating the degree of match of said speech input with identified ones of said models;
a continuous speech recognizer;
accessible by said continuous speech recognizer, a second vocabulary of respective command word models numbering less than 2000, said continuous speech recognizer operating to compare speech input to said second vocabulary of models and to provide a score indicating the degree of match of said speech input with an identified sequence of the respective models;
means for normalizing the score provided by said continuous speech recognizer on the basis of the length of the speech input;
an arbitration algorithm for selecting from among the models identified by said isolated word speech recognizer and the sequence of models identified by said continuous speech recognizer and for outputting corresponding text if a score from said isolated word recognizer is selected and outputting a respective command if a score from said continuous speech recognizer is selected.
Dependent claims: 8
9. A speech recognition system which separately outputs text and commands comprising:
an isolated word speech recognizer;
accessible by said isolated word speech recognizer, a vocabulary of respective text word models numbering in excess of 5000, said isolated word speech recognizer operating to compare speech input with at least a selected portion of said first vocabulary and to provide a plurality of scores indicating the degree of match of said speech input with identified ones of said models;
a continuous speech recognizer;
accessible by said continuous speech recognizer, a second vocabulary of respective command word models numbering less than 2000, said continuous speech recognizer operating to compare speech input to said second vocabulary of models and to provide a score indicating the degree of match of said speech input with an identified sequence of the respective models;
means for normalizing the score provided by said continuous speech recognizer on the basis of the length of the speech input;
means for applying a relative scaling to said recognizer scores by a factor empirically trained to minimize incursions by each of said vocabularies on correct results from the other vocabulary; and
an arbitration algorithm for selecting from among the models identified by said isolated word speech recognizer and the sequence of models identified by said continuous speech recognizer and for outputting corresponding text if a score from said isolated word recognizer is selected and outputting a respective command if a score from said continuous speech recognizer is selected.
Dependent claims: 10
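The empirically trained relative scaling recited in claim 9 could be pictured as a simple search over candidate factors. The sketch below is an assumption-laden illustration, not the claimed means: it assumes lower scores are better, applies the factor as a multiplier on the length-normalized continuous score, treats an "incursion" as any labeled utterance on which the wrong recognizer wins, and uses a plain grid search. The names `Sample`, `count_incursions`, and `train_scale_factor` are hypothetical, and the sample scores are made up for demonstration.

```python
from typing import Iterable, List, Tuple

# (best isolated word score, length-normalized continuous score, true kind),
# where the true kind is "text" or "command".
Sample = Tuple[float, float, str]


def count_incursions(samples: List[Sample], factor: float) -> int:
    """Count utterances on which the wrong vocabulary wins at this scale factor."""
    errors = 0
    for iso_score, cont_score, truth in samples:
        # Assumption: lower is better; the factor scales the continuous score.
        chosen = "text" if iso_score <= factor * cont_score else "command"
        if chosen != truth:
            errors += 1
    return errors


def train_scale_factor(samples: List[Sample], candidates: Iterable[float]) -> float:
    """Return the candidate factor with the fewest incursions (first one on ties)."""
    return min(candidates, key=lambda f: count_incursions(samples, f))


# Made-up labeled data for illustration only.
samples = [
    (4.0, 3.0, "text"),
    (5.0, 2.0, "command"),
    (3.0, 2.5, "text"),
    (6.0, 4.0, "command"),
]
best_factor = train_scale_factor(samples, [1.0, 1.4, 1.5, 2.0])
```

On this toy data a factor of 1.0 lets commands intrude on two dictated words, while a factor of 1.4 separates all four utterances correctly, so the search settles on 1.4.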
Specification