Confidence estimation based on frequency
First Claim
1. A computer-implemented method of determining a confidence score for a potential speech recognition result, the method comprising:
determining a first group of words, wherein each word of the first group of words was predicted a first number of times during automatic speech recognition (ASR) testing;
determining a first total number of predictions for the first group, the first total number of predictions corresponding to the first number multiplied by a number of words in the first group of words;
determining a first total number of correct results for the first group, the first total number of correct results corresponding to a cumulative number of times each word of the first group was correctly predicted during the ASR testing;
determining a first group prior probability for the first group by dividing the first total number of correct results by the first total number of predictions;
determining a first prior probability for a first word using the first group prior probability;
receiving audio data corresponding to a first utterance; and
performing first ASR processing on the audio data to generate a hypothesis, the hypothesis including the first word, wherein performing first ASR processing further comprises determining a confidence score for the hypothesis using the first prior probability for the first word.
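The grouping arithmetic recited in claim 1 can be illustrated with a minimal Python sketch. The function name `group_priors`, the dictionary layout, and the sample words and counts are assumptions made for illustration only; none of them come from the patent. Words are grouped by how many times they were predicted during testing, and each group's prior is its total correct results divided by its total predictions.

```python
from collections import defaultdict


def group_priors(predicted, correct):
    """Return {n: group prior} for every prediction count n seen in testing.

    predicted: word -> number of times the word was predicted during testing
    correct:   word -> number of those predictions that were correct
    (Both dictionaries are hypothetical inputs chosen for this sketch.)
    """
    # Group words that were predicted the same number of times.
    groups = defaultdict(list)
    for word, n in predicted.items():
        if n > 0:
            groups[n].append(word)

    priors = {}
    for n, words in groups.items():
        # Total predictions = prediction count times number of words in the group.
        total_predictions = n * len(words)
        # Total correct = cumulative correct predictions over the group's words.
        total_correct = sum(correct.get(w, 0) for w in words)
        priors[n] = total_correct / total_predictions
    return priors


# Made-up test counts: three words predicted twice, one word predicted once.
predicted = {"play": 2, "weather": 2, "timer": 2, "tulsa": 1}
correct = {"play": 2, "weather": 1, "timer": 0, "tulsa": 1}
priors = group_priors(predicted, correct)
```

With these made-up counts, the group of words predicted twice has 2 × 3 = 6 total predictions and 3 correct results, so its group prior is 0.5.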
Abstract
Devices, systems and methods are disclosed for estimating a prior probability for speech recognition by taking into account a number of observations of a particular word and a prior probability for a group of words having a similar number of observations. For example, a prior probability may be determined by combining a number of correct results and a number of observations for a group of words and calculating a prior probability of the entire group. Further, a prior probability may be determined for a word that was not previously observed by determining a prior probability for a group of words that have been observed once. The prior probability for a particular word may be determined differently as the number of observations increases and may transition from the group prior probability to an individual prior probability when the number of observations exceeds a threshold.
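One way to read the selection logic described in the abstract is sketched below in Python. The function name `word_prior`, the threshold value of 10, the hard switch at the threshold, and all sample numbers are assumptions for illustration; the abstract only states that the prior may transition from the group value to an individual value once the number of observations exceeds a threshold.

```python
def word_prior(word, predicted, correct, group_prior_by_count, threshold=10):
    """Pick a prior for `word` based on how often it was observed in testing.

    group_prior_by_count maps an observation count n to the prior of the
    group of words observed n times (hypothetical, precomputed input).
    """
    n = predicted.get(word, 0)
    if n > threshold:
        # Enough observations: use the word's own correct/predicted ratio.
        return correct.get(word, 0) / n
    if n == 0:
        # Unobserved word: borrow the prior of the words observed once.
        return group_prior_by_count[1]
    # Few observations: fall back to the prior of the word's group.
    return group_prior_by_count[n]


# Made-up inputs for illustration.
group_prior_by_count = {1: 0.4, 2: 0.5}
predicted = {"play": 50, "tulsa": 2}
correct = {"play": 45, "tulsa": 2}

frequent = word_prior("play", predicted, correct, group_prior_by_count)
rare = word_prior("tulsa", predicted, correct, group_prior_by_count)
unseen = word_prior("zyzzyva", predicted, correct, group_prior_by_count)
```

In this sketch the frequent word keeps its individual ratio (45/50), the rarely observed word takes the prior of its group, and the unobserved word takes the prior of the once-observed group.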
103 Citations
24 Claims
1. A computer-implemented method of determining a confidence score for a potential speech recognition result (recited in full under First Claim above).
Dependent claims: 2, 3, 4
5. A computer-implemented method, the method comprising:
receiving audio data corresponding to a first utterance;
performing first automatic speech recognition (ASR) processing on the audio data using a language model to generate a hypothesis, the hypothesis including a first word, wherein:
the language model was trained using a first prior probability for the first word, the first prior probability based on a group prior probability for a plurality of words, the group prior probability based on a total number of predictions of the plurality of words during ASR testing and a total number of correct predictions of the plurality of words during the ASR testing,
at least one of the plurality of words is not included in the hypothesis, and
performing the first ASR processing on the audio data comprises:
identifying the first word, and
determining a confidence score for the hypothesis using the language model; and
generating output data using the hypothesis.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 21, 22, 23
13. A system, comprising:
at least one processor;
memory including instructions operable to be executed by the at least one processor to cause the system to:
receive audio data corresponding to a first utterance;
perform first automatic speech recognition (ASR) processing on the audio data using a language model to generate a hypothesis, the hypothesis including a first word, wherein:
the language model was trained using a first prior probability for the first word, the first prior probability based on a group prior probability for a plurality of words, the group prior probability based on a total number of predictions of the plurality of words during ASR testing and a total number of correct predictions of the plurality of words during the ASR testing,
at least one of the plurality of words is not included in the hypothesis, and
performing the first ASR processing on the audio data comprises:
identifying the first word, and
determining a confidence score for the hypothesis using the language model; and
generate output data using the hypothesis.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
24. A computer-implemented method, the method comprising:
receiving audio data corresponding to a first utterance;
performing first automatic speech recognition (ASR) processing on the audio data using a language model to generate a hypothesis, the hypothesis including a first word, wherein:
the language model was trained using a first prior probability for the first word, the first prior probability determined by:
determining that a first number of predictions of the first word during ASR testing is equal to zero, indicating that the first word was not predicted during the ASR testing,
determining a plurality of words, wherein each of the plurality of words was predicted during the ASR testing more than once,
determining a total number of predictions of the plurality of words during ASR testing,
determining a total number of correct predictions of the plurality of words during the ASR testing,
determining a group prior probability for the plurality of words using the total number of predictions and the total number of correct predictions, and
setting the group prior probability as the first prior probability,
at least one of the plurality of words is not included in the hypothesis, and
performing first ASR processing on the audio data comprises:
identifying the first word, and
determining a confidence score for the hypothesis using the language model; and
generating output data using the hypothesis.
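The prior determination recited in claim 24 for a word with zero predictions can be sketched as follows. The function name `unseen_word_prior`, the input dictionaries, and the sample counts are hypothetical. The claim says only that the group prior is determined "using" the two totals; this sketch assumes it is their ratio, as in claim 1.

```python
def unseen_word_prior(word, predicted, correct):
    """Prior for a word never predicted in testing, pooled over other words.

    predicted: word -> number of predictions during testing
    correct:   word -> number of correct predictions during testing
    (Both dictionaries are hypothetical inputs chosen for this sketch.)
    """
    if predicted.get(word, 0) != 0:
        raise ValueError("word was predicted during testing; not an unseen word")

    # Plurality of words that were each predicted more than once.
    group = [w for w, n in predicted.items() if n > 1]
    total_predictions = sum(predicted[w] for w in group)
    if total_predictions == 0:
        raise ValueError("no words were predicted more than once")
    total_correct = sum(correct.get(w, 0) for w in group)

    # Assumption: the group prior is the ratio of the two totals.
    return total_correct / total_predictions


# Made-up test counts: "tulsa" is excluded because it was predicted only once.
predicted = {"play": 4, "timer": 2, "tulsa": 1}
correct = {"play": 3, "timer": 0, "tulsa": 1}
prior = unseen_word_prior("zyzzyva", predicted, correct)
```

Here the pooled group has 4 + 2 = 6 predictions and 3 correct results, so the unseen word is assigned a prior of 0.5.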
Specification