Automatically determining words for updating in a pronunciation dictionary in a speech recognition system
Abstract
An approach for automatically determining the accuracy of a pronunciation dictionary in a speech recognition system involves comparing an expected pronunciation representation for a particular word from a pronunciation dictionary to one or more actual pronunciations of the particular word. An accuracy score for each of the phonemes that constitute the pronunciation of the particular word is determined from the comparison of the expected and actual pronunciations for the particular word. The accuracy score is evaluated against specified accuracy criteria to determine whether the expected pronunciation for the particular word satisfies the specified accuracy criteria. If the expected pronunciation does not satisfy the specified accuracy criteria for the particular word, then the expected pronunciation for the particular word in the pronunciation dictionary is identified as requiring updating. Manual or automated update mechanisms may then be employed to update the identified expected pronunciation representations to reflect the actual pronunciations.
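The flow described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the names `DictionaryEntry`, `meets_criteria`, and `find_entries_to_update` are hypothetical, per-phoneme scores are assumed to arrive already computed by a recognizer as floats in [0, 1] (the source does not say how scores are produced or scaled), and the "every phoneme's mean score must reach 0.5" rule is only a placeholder for the specified accuracy criteria.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Sequence

# One utterance's scores: one float per phoneme of the expected pronunciation.
UtteranceScores = Sequence[float]


@dataclass
class DictionaryEntry:
    word: str            # word identifier
    phonemes: List[str]  # expected pronunciation as a phoneme string


def meets_criteria(per_phoneme: Sequence[float], threshold: float = 0.5) -> bool:
    """Placeholder criterion (assumption): every phoneme score reaches threshold."""
    return all(score >= threshold for score in per_phoneme)


def find_entries_to_update(
    entries: Sequence[DictionaryEntry],
    scores_by_word: Dict[str, List[UtteranceScores]],
    criterion: Callable[[Sequence[float]], bool] = meets_criteria,
) -> List[str]:
    """Return word identifiers whose expected pronunciation fails the criterion."""
    flagged = []
    for entry in entries:
        utterances = scores_by_word.get(entry.word, [])
        if not utterances:
            continue  # no actual pronunciations observed; nothing to evaluate
        # Combine the utterances into one accuracy score per phoneme position.
        per_phoneme = [mean(column) for column in zip(*utterances)]
        if not criterion(per_phoneme):
            flagged.append(entry.word)
    return flagged


# Hypothetical example data: two entries, two scored utterances each.
entries = [
    DictionaryEntry("tomato", ["t", "ah", "m", "ey", "t", "ow"]),
    DictionaryEntry("data", ["d", "ey", "t", "ah"]),
]
scores_by_word = {
    "tomato": [[0.9, 0.8, 0.9, 0.2, 0.9, 0.8], [0.9, 0.9, 0.8, 0.3, 0.9, 0.9]],
    "data": [[0.9, 0.8, 0.9, 0.9], [0.8, 0.9, 0.9, 0.8]],
}
flagged = find_entries_to_update(entries, scores_by_word)
print(flagged)  # ['tomato']
```

In this made-up data the fourth phoneme of "tomato" scores poorly in both utterances, so only that entry is identified as requiring an update. The update itself (manual or automated) is outside the scope of the sketch.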
12 Claims
1. A method of determining the accuracy of a pronunciation dictionary so that the dictionary may be updated to improve its accuracy, comprising:
providing a pronunciation dictionary having a plurality of entries, wherein each entry includes a word identifier and at least one phoneme string of an expected pronunciation of a word, each phoneme string having a plurality of phonemes;
receiving a plurality of actual utterances of a specific word from a plurality of users;
comparing each of the utterances to a phoneme string in the dictionary to generate a corresponding phoneme string score, wherein each phoneme string score indicates on a phoneme-by-phoneme basis the accuracy of the received utterance relative to the compared phoneme string;
evaluating the phoneme string scores to predetermined accuracy criteria to identify entries in the dictionary that should be updated.
Dependent claims (2, 3, 4):
computing, for each phoneme in the phoneme string, an average phoneme score from the corresponding phoneme scores of each of the actual utterances; determining if any of the average phoneme scores is below a threshold value;
if so, identifying the corresponding entry in the dictionary that has the phoneme string as needing updating.
4. The method of claim 2 wherein the method further comprises comparing the phoneme scores to a minimum score threshold and identifying the corresponding entry in the dictionary that has the phoneme string as needing updating if at least one of the phonemes in the string has a specified number of instances in which the phoneme score is below the minimum score threshold.
5. A computer readable medium carrying one or more sequences of instructions for determining the accuracy of a pronunciation dictionary so that the dictionary may be updated to improve its accuracy, the one or more sequences of instructions including instructions which, when executed by one or more processors, perform the steps of:
providing a pronunciation dictionary having a plurality of entries, wherein each entry includes a word identifier and at least one phoneme string of an expected pronunciation of a word, each phoneme string having a plurality of phonemes;
receiving a plurality of actual utterances of a specific word from a plurality of users;
comparing each of the utterances to a phoneme string in the dictionary to generate a corresponding phoneme string score, wherein each phoneme string score indicates on a phoneme-by-phoneme basis the accuracy of the received utterance relative to the compared phoneme string;
evaluating the phoneme string scores to predetermined accuracy criteria to identify entries in the dictionary that should be updated.
Dependent claims (6, 7, 8):
computing, for each phoneme in the phoneme string, an average phoneme score from the corresponding phoneme scores of each of the actual utterances; determining if any of the average phoneme scores is below a threshold value;
if so, identifying the corresponding entry in the dictionary that has the phoneme string as needing updating.
8. The computer readable medium of claim 6 wherein the instructions further perform the steps of
comparing the phoneme scores to a minimum score threshold and identifying the corresponding entry in the dictionary that has the phoneme string as needing updating if at least one of the phonemes in the string has a specified number of instances in which the phoneme score is below the minimum score threshold.
9. A speech recognition diagnostic tool to determine the accuracy of a pronunciation dictionary so that the dictionary may be updated to improve its accuracy, comprising:
a pronunciation dictionary having a plurality of entries, wherein each entry includes a word identifier and at least one phoneme string of an expected pronunciation of a word, each phoneme string having a plurality of phonemes;
logic to receive a plurality of actual utterances of a specific word from a plurality of users;
logic to compare each of the utterances to a phoneme string in the dictionary to generate a corresponding phoneme string score, wherein each phoneme string score indicates on a phoneme-by-phoneme basis the accuracy of the received utterance relative to the compared phoneme string;
logic to evaluate the phoneme string scores to predetermined accuracy criteria to identify entries in the dictionary that should be updated.
Dependent claims (10, 11, 12):
logic to compute, for each phoneme in the phoneme string, an average phoneme score from the corresponding phoneme scores of each of the actual utterances; logic to determine if any of the average phoneme scores is below a threshold value and, if so, to identify the corresponding entry in the dictionary that has the phoneme string as needing updating.
12. The speech recognition diagnostic tool of claim 10 further comprising
logic to compare the phoneme scores to a minimum score threshold and to identify the corresponding entry in the dictionary that has the phoneme string as needing updating if at least one of the phonemes in the string has a specified number of instances in which the phoneme score is below the minimum score threshold.
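The two evaluation criteria recited in the dependent claims (a per-phoneme average that falls below a threshold, and a phoneme that scores below a minimum threshold in a specified number of instances) could be sketched as below. This is a hedged illustration only: the function names, the [0, 1] score scale, and the example threshold values are assumptions, and the claims do not prescribe how the phoneme scores themselves are computed.

```python
from typing import Sequence

# Each inner sequence holds one utterance's scores, one per phoneme position.
ScoreMatrix = Sequence[Sequence[float]]


def average_below_threshold(utterance_scores: ScoreMatrix, threshold: float) -> bool:
    """True if any phoneme's average score across utterances is below threshold."""
    for column in zip(*utterance_scores):
        if sum(column) / len(column) < threshold:
            return True
    return False


def has_repeated_low_scores(
    utterance_scores: ScoreMatrix, min_score: float, instances: int
) -> bool:
    """True if some phoneme scores below min_score in at least `instances` utterances."""
    for column in zip(*utterance_scores):
        low_count = sum(1 for score in column if score < min_score)
        if low_count >= instances:
            return True
    return False


# Hypothetical scores: four utterances of a three-phoneme word.
scores = [
    [0.9, 0.9, 0.8],
    [0.9, 0.1, 0.9],
    [0.8, 0.9, 0.9],
    [0.9, 0.1, 0.8],
]

# The second phoneme averages 0.5, so a 0.4 average threshold does not flag it,
# but it falls below a 0.3 minimum score in two of the four utterances.
flag_by_average = average_below_threshold(scores, threshold=0.4)
flag_by_count = has_repeated_low_scores(scores, min_score=0.3, instances=2)
print(flag_by_average, flag_by_count)  # False True
```

The example data is chosen to show why the count-based criterion can be useful alongside the average: a pronunciation that matches most users well but fails badly for a sizable minority can keep an acceptable average while still being identified as needing an update.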
Specification