Speech recognition accuracy with multi-confidence thresholds
Abstract
A speech recognition system uses multiple confidence thresholds to improve the quality of speech recognition results. The choice of which confidence threshold to use for a particular utterance may be based on one or more features relating to the utterance. In one particular implementation, the speech recognition system includes a speech recognition engine that provides speech recognition results and a confidence score for an input utterance. The system also includes a threshold selection component that determines, based on the received input utterance, a threshold value corresponding to the input utterance. The system further includes a threshold component that accepts the recognition results based on a comparison of the confidence score to the threshold value.
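The run-time flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature used for classification (utterance duration), the partition names, the threshold values, and every identifier (`Recognition`, `classify`, `select_threshold`, `accept`) are assumptions introduced here. Treating a score exactly equal to the threshold as a rejection is also an assumption, since the text only addresses scores above or below the threshold.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    text: str          # recognition result from the engine
    confidence: float  # confidence score, assumed to lie in [0, 1]


# Hypothetical per-partition thresholds; the values are illustrative only.
THRESHOLDS = {"short": 0.80, "long": 0.55}


def classify(duration_s: float) -> str:
    """Classify an utterance into a partition using one assumed feature: duration."""
    return "short" if duration_s < 1.5 else "long"


def select_threshold(duration_s: float) -> float:
    """Threshold selection component: return the threshold tied to the partition."""
    return THRESHOLDS[classify(duration_s)]


def accept(result: Recognition, duration_s: float) -> bool:
    """Threshold component: accept only when the score is above the threshold."""
    return result.confidence > select_threshold(duration_s)
```

With these assumed values, the same confidence score of 0.7 is rejected for a 0.8-second utterance (threshold 0.80) and accepted for a 3-second utterance (threshold 0.55), which is the effect of using more than one threshold.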
26 Claims
1. A system for processing an input utterance, comprising:
a processor;
a speech recognition engine that causes the processor to provide a recognition result corresponding to the input utterance and a confidence score corresponding to a confidence level in the recognition result;
a threshold selection component that selects, based on the input utterance, a threshold value corresponding to the input utterance;
wherein, the selected threshold value is determined based on classification of the input utterance into a partition of multiple partitions in a set of training data;
wherein, each of the multiple partitions is associated with a threshold value;
wherein, the selected threshold value corresponding to the input utterance is the threshold value associated with the partition into which the input utterance is classified.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.

10. A computer-readable medium having stored thereon a set of instructions which when executed causes a processor to perform a method of processing input information, the method, comprising:
generating a recognition result corresponding to the input information;
determining a confidence score corresponding to a confidence level in the accuracy of the speech recognition result;
classifying the input information into one of a plurality of partitions defined from training data based on a feature relating to the input information or to the user;
wherein, each of the multiple partitions is associated with each of a plurality of threshold values;
determining a threshold value from the plurality of threshold values;
wherein, the determined threshold value is one of the plurality of threshold values that is associated with the partition into which the input utterance is classified;
determining whether to accept or reject the recognition result based on the determined threshold value and the confidence score.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18.

19. A computer-readable medium having stored thereon a set of instructions which when executed causes a processor to perform a method comprising:
obtaining training data;
defining partitions for the training data based on a feature associated with the training data; and
determining a confidence threshold for each partition based on the feature, wherein, in run-time operation, input information is converted into recognition results and the input information detected as having the feature is classified into one of the partitions of the training data defined using the feature and accepted or rejected as valid recognition results based on a comparison of the confidence threshold corresponding to the one of the partitions defined using the feature.
Dependent claims: 20, 21, 22, 23.

24. A computer-readable medium having stored thereon a set of instructions which when executed causes a processor to perform a method of generating a recognition result from input information, the method, comprising:
defining multiple partitions for training data based on features associated with the training data;
generating multiple threshold values each corresponding to each of the multiple partitions of the training data;
wherein, the input information having a particular feature is classified into a partition of the multiple partitions that is defined using the particular feature;
wherein, the recognition result is accepted or rejected based on comparison of the confidence threshold generated for the partition.

25. A system, comprising:
means for, defining multiple partitions for training data based on features associated with the training data;
means for, generating multiple threshold values each corresponding to each of the multiple partitions of the training data;
means for, generating a recognition result from input information;
means for, classifying the input information having a particular feature into a partition of the multiple partitions that is defined using the particular feature;
wherein, the recognition result is accepted or rejected based on comparison of the confidence threshold generated for the partition of the multiple partitions.

26. A system, comprising:
means for, receiving an input utterance, to provide recognition results corresponding to the input utterance, and to provide a confidence score corresponding to a confidence level in the recognition results;
means for, determining, based on the input utterance, a threshold value corresponding to the input utterance;
wherein, the threshold value is determined by classification of the input utterance into one of a set of partitions defined from training data;
wherein, the classification of the received input utterance is performed based on a feature associated with the input utterance;
means for, accepting the recognition result based on a comparison of the confidence score to the threshold value;
wherein the threshold component accepts the recognition result when the confidence score is above the threshold value and rejects the recognition result when the confidence score is below the threshold value.

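The training-side steps recited in claims 19, 24, and 25 (define partitions from a feature of the training data, then determine one confidence threshold per partition) can be sketched as below. The claims do not state how each threshold is chosen, so the selection rule used here is purely an assumption for illustration: pick the candidate threshold that yields the most correct accept/reject decisions on that partition's training samples. The function name `fit_thresholds`, the sample layout, and the `partition_of` callback are likewise hypothetical.

```python
from collections import defaultdict


def fit_thresholds(samples, partition_of):
    """Return one confidence threshold per partition.

    samples: iterable of (feature, confidence, is_correct) tuples (assumed layout).
    partition_of: function mapping a feature value to a partition name.
    """
    # Step 1: define partitions by grouping training samples on the feature.
    by_partition = defaultdict(list)
    for feature, confidence, is_correct in samples:
        by_partition[partition_of(feature)].append((confidence, is_correct))

    # Step 2: choose a threshold for each partition (assumed rule, see lead-in).
    thresholds = {}
    for name, rows in by_partition.items():
        # Candidate thresholds: 0.0 plus every observed confidence score.
        candidates = sorted({c for c, _ in rows} | {0.0})

        def n_right(t):
            # A decision is right when "accepted" (score above t) matches correctness.
            return sum((c > t) == ok for c, ok in rows)

        # On ties, max() keeps the first (lowest) candidate.
        thresholds[name] = max(candidates, key=n_right)
    return thresholds
```

A possible call, with duration in seconds as the assumed feature, is `fit_thresholds(samples, lambda d: "short" if d < 1.5 else "long")`. The resulting dictionary could then serve as the per-partition threshold table consulted at run time.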
Specification