Method for adding phonetic descriptions to a speech recognition lexicon
Abstract
A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, two possible phonetic descriptions are generated. One phonetic description is formed from the text of the word. The other phonetic description is formed by decoding a speech signal representing the user's pronunciation of the word. Both phonetic descriptions are scored based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.
21 Claims
1. A method for adding an acoustic description of a word to a speech recognition lexicon, the method comprising:
converting the text of the word into at least one orthographically derived acoustic description of the word;
generating a score for an orthographically derived acoustic description based in part on a comparison between the orthographically derived acoustic description and a speech signal representing a user's pronunciation of the word;
decoding the speech signal representing the user's pronunciation of the word to produce a decoded acoustic description of the word and a score for the decoded acoustic description; and
selecting one of the orthographically derived acoustic description and the decoded acoustic description as the acoustic description of the word based on the score for the orthographically derived acoustic description and the score for the decoded acoustic description.
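The selection step of claim 1 can be illustrated with a minimal sketch. It assumes that both kinds of description have already been scored against the same speech signal and that a larger score means a closer match. The `ScoredDescription` type, the `select_description` function, the phone symbols, and the tie-breaking rule (ties keep the text-based description) are hypothetical choices made for illustration and are not taken from the claims. The scoring and decoding steps themselves are not implemented here.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ScoredDescription:
    """Hypothetical container: a phone sequence plus its score against the speech signal."""
    phones: Tuple[str, ...]
    score: float   # assumption: larger means a closer match to the user's pronunciation
    source: str    # "text" (orthographically derived) or "speech" (decoded)


def select_description(
    text_candidates: Sequence[ScoredDescription],
    decoded: ScoredDescription,
) -> ScoredDescription:
    """Pick the lexicon entry by comparing the best text-based score with the decoded score."""
    if not text_candidates:
        return decoded
    best_text = max(text_candidates, key=lambda c: c.score)
    # Assumption: on a tie, keep the text-based description.
    return decoded if decoded.score > best_text.score else best_text


if __name__ == "__main__":
    # Made-up scores in a log-likelihood style, for illustration only.
    candidates = [
        ScoredDescription(("k", "ae", "t"), -12.0, "text"),
        ScoredDescription(("k", "aa", "t"), -15.5, "text"),
    ]
    decoded = ScoredDescription(("k", "eh", "t"), -10.0, "speech")
    print(select_description(candidates, decoded).source)
```

In this sketch the decoded description wins only when it scores strictly higher than every orthographically derived candidate. A real system might weight or normalize the two scores differently before comparing them.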
12. A computer-readable medium having computer-executable instructions for performing steps comprising:
receiving text of a word for which a phonetic description is to be added to a speech recognition lexicon;
receiving a representation of a speech signal produced by a person pronouncing the word;
converting the text of the word into a text-based phonetic description of the word;
generating a speech-based phonetic description of the word from the representation of the speech signal; and
selecting a phonetic description of the word to add to the speech recognition lexicon by selecting between the text-based phonetic description and the speech-based phonetic description based in part on the correspondence between each phonetic description and the representation of the speech signal.
19. A speech recognition system having a language model generated through a process comprising:
breaking each word in a dictionary into syllable-like units;
for each word, grouping the syllable-like units of the word into n-grams;
counting the total number of n-gram occurrences in the dictionary; and
for each n-gram, counting the number of occurrences of the n-gram in the dictionary and dividing this count by the total number of n-gram occurrences to form a language model probability for the n-gram.
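The counting process recited in claim 19 can be sketched as follows. The sketch assumes each dictionary word has already been broken into syllable-like units, since the claim does not say how that split is made. The function names `ngrams` and `build_language_model`, the toy dictionary, and the choice of n = 2 are hypothetical. No word-boundary padding is applied, so a word with fewer than n units contributes no n-grams; that too is an assumption of this sketch.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

NGram = Tuple[str, ...]


def ngrams(units: Sequence[str], n: int) -> List[NGram]:
    """Group a word's syllable-like units into consecutive n-grams."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def build_language_model(
    dictionary: Dict[str, Sequence[str]], n: int = 2
) -> Dict[NGram, float]:
    """Relative-frequency model: count of each n-gram divided by the total n-gram count."""
    counts: Counter = Counter()
    for units in dictionary.values():
        counts.update(ngrams(units, n))
    total = sum(counts.values())  # total number of n-gram occurrences in the dictionary
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


if __name__ == "__main__":
    # Toy dictionary with hand-made splits, for illustration only.
    toy_dictionary = {
        "banana": ["ba", "na", "na"],
        "nana": ["na", "na"],
        "bandana": ["ban", "da", "na"],
    }
    model = build_language_model(toy_dictionary, n=2)
    for gram, prob in sorted(model.items()):
        print(gram, prob)
```

With this toy dictionary there are five bigram occurrences in total, and the bigram ("na", "na") occurs twice, so its probability is 2/5. The probabilities sum to one because each is a share of the same total, as the claim describes.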
Specification