Correcting substitution errors during automatic speech recognition by accepting a second best when first best is confusable
First Claim
1. A speech recognition method comprising the steps of:
(a) receiving input speech containing vocabulary via a microphone associated with an automatic speech recognition system;
(b) processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values using at least one processor associated with the automatic speech recognition system;
(c) cross-referencing a first-best hypothesis of the N-best hypotheses against a list of known confusable vocabulary to determine whether the first-best hypothesis of the N-best hypotheses is confusable with any of the known confusable vocabulary;
(d) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the first-best hypothesis is not determined to be confusable with any of the known confusable vocabulary;
(e) comparing at least one parameter value of the first-best hypothesis to at least one threshold value, if the first-best hypothesis is determined to be confusable with any of the known confusable vocabulary;
(f) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the at least one parameter value of the first-best hypothesis is greater than the at least one threshold value;
(g) determining if a second-best hypothesis of the N-best hypotheses is confusable with the first-best hypothesis, if the at least one parameter value of the first-best hypothesis is not greater than the at least one threshold value;
and (h) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the second-best hypothesis is determined to be confusable with the first-best hypothesis;
(h1) determining if a confidence score of the second-best hypothesis is between lower and upper threshold values;
and (i) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the confidence score is determined to be within the lower and upper threshold values.
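The decision flow recited in steps (c) through (i) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `Hypothesis` and `select_hypothesis` are hypothetical, the confusable list is modeled as a dictionary mapping each word to the set of words it is known to be confused with, a single confidence score stands in for the "at least one parameter value", the band check is treated as inclusive, and the `None` result covers cases the claim does not address.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Hypothesis:
    text: str
    confidence: float  # assumption: one score stands in for the claim's "parameter value"


def select_hypothesis(
    n_best: List[Hypothesis],
    confusable: Dict[str, Set[str]],
    accept_threshold: float,
    lower: float,
    upper: float,
) -> Optional[Hypothesis]:
    """Return the accepted hypothesis, or None when no recited step accepts one."""
    first = n_best[0]

    # Steps (c)/(d): accept the first-best outright if it is not on the confusable list.
    partners = confusable.get(first.text)
    if not partners:
        return first

    # Steps (e)/(f): a confusable first-best is still accepted when its score
    # is greater than the threshold.
    if first.confidence > accept_threshold:
        return first

    if len(n_best) < 2:
        return None  # assumption: the claim does not cover a missing second-best

    second = n_best[1]

    # Steps (g)/(h): accept the second-best if it is confusable with the first-best.
    if second.text in partners:
        return second

    # Steps (h1)/(i): otherwise accept the second-best if its confidence lies
    # between the lower and upper thresholds (inclusive bounds are an assumption).
    if lower <= second.confidence <= upper:
        return second

    return None  # assumption: outcome not specified by the claim
```

A caller would pass the N-best list ordered best first. A `None` result signals that none of the recited acceptance conditions was met, which a real system might handle by re-prompting the user.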
Abstract
A speech recognition method includes the steps of receiving input speech containing vocabulary, processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values, and determining whether a first-best hypothesis of the N-best hypotheses is confusable with any vocabulary within the grammar. The first-best hypothesis is accepted as recognized speech corresponding to the received input speech if the first-best hypothesis is not determined to be confusable with any vocabulary within the grammar. Where the first-best hypothesis is determined to be confusable, at least one parameter value of the first-best hypothesis can be compared to at least one threshold value, and the second-best hypothesis can be accepted as the recognized speech if its confidence score is within certain lower and upper threshold values and it is not confusable with the first-best hypothesis. The first-best hypothesis can be accepted as recognized speech corresponding to the received input speech if the parameter value of the first-best hypothesis is greater than the threshold value.
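The confusability determination described above amounts to a lookup against known confusable vocabulary. The sketch below shows one way such a lookup could be organized. It is an assumption for illustration only: the source does not say how the list is stored, the helper names `build_confusable_index`, `is_confusable`, and `are_confusable` are hypothetical, and the word pairs in the usage note are made-up examples.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_confusable_index(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Turn (word, word) pairs into a symmetric, case-insensitive lookup table."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for a, b in pairs:
        a, b = a.lower(), b.lower()
        if a == b:
            continue  # a word is not treated as confusable with itself
        index[a].add(b)
        index[b].add(a)
    return dict(index)


def is_confusable(word: str, index: Dict[str, Set[str]]) -> bool:
    """True if the word appears in the table with at least one partner."""
    return bool(index.get(word.lower()))


def are_confusable(a: str, b: str, index: Dict[str, Set[str]]) -> bool:
    """True if the two words are listed as confusable with each other."""
    return b.lower() in index.get(a.lower(), set())
```

For example, with the hypothetical pairs `[("nine", "five")]`, `is_confusable("nine", index)` answers the first-best check and `are_confusable("nine", "five", index)` answers the first-best versus second-best check. Storing both directions keeps the second check independent of which word was ranked first.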
18 Citations
12 Claims
1. A speech recognition method comprising the steps of:
(a) receiving input speech containing vocabulary via a microphone associated with an automatic speech recognition system;
(b) processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values using at least one processor associated with the automatic speech recognition system;
(c) cross-referencing a first-best hypothesis of the N-best hypotheses against a list of known confusable vocabulary to determine whether the first-best hypothesis of the N-best hypotheses is confusable with any of the known confusable vocabulary;
(d) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the first-best hypothesis is not determined to be confusable with any of the known confusable vocabulary;
(e) comparing at least one parameter value of the first-best hypothesis to at least one threshold value, if the first-best hypothesis is determined to be confusable with any of the known confusable vocabulary;
(f) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the at least one parameter value of the first-best hypothesis is greater than the at least one threshold value;
(g) determining if a second-best hypothesis of the N-best hypotheses is confusable with the first-best hypothesis, if the at least one parameter value of the first-best hypothesis is not greater than the at least one threshold value;
and (h) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the second-best hypothesis is determined to be confusable with the first-best hypothesis;
(h1) determining if a confidence score of the second-best hypothesis is between lower and upper threshold values;
and (i) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the confidence score is determined to be within the lower and upper threshold values.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. A speech recognition method comprising the steps of:
(a) receiving input speech containing vocabulary via a microphone associated with an automatic speech recognition system;
(b) processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values using at least one processor associated with the automatic speech recognition system;
(c) cross-referencing a first-best hypothesis of the N-best hypotheses against a list of known confusable vocabulary to determine whether the first-best hypothesis of the N-best hypotheses is confusable with any of the known confusable vocabulary;
(d) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the first-best hypothesis is not determined to be confusable with any of the known confusable vocabulary;
(e) comparing at least one parameter value of the first-best hypothesis to at least one threshold value, if the first-best hypothesis is determined to be confusable with any of the known confusable vocabulary;
and (f) accepting the first-best hypothesis as recognized speech corresponding to the received input speech, if the at least one parameter value of the first-best hypothesis is greater than the at least one threshold value;
(g) determining if a second-best hypothesis of the N-best hypotheses is confusable with the first-best hypothesis, if the at least one parameter value of the first-best hypothesis is not greater than the at least one threshold value;
(h) determining if a confidence score of the second-best hypothesis is between lower and upper threshold values;
and (i) accepting the second-best hypothesis as recognized speech corresponding to the received input speech, if the confidence score is determined to be within the lower and upper threshold values.
- View Dependent Claims (10, 11, 12)
Specification