New-word pronunciation learning using a pronunciation graph
First Claim
1. A computer-readable medium including instructions readable by a computer which, when implemented perform steps comprising:
generating a speech-based phonetic description of a word without reference to the text of the word;
generating a text-based phonetic description of the word based on the text of the word;
aligning the speech-based phonetic description and the text-based phonetic description on a phone-by-phone basis to form a single graph; and
selecting a phonetic description from the single graph.
Abstract
A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, a plurality of at least two possible phonetic descriptions are generated. One phonetic description is formed by decoding a speech signal representing a user's pronunciation of the word. At least one other phonetic description is generated from the text of the word. The plurality of possible sequences comprising speech-based and text-based phonetic descriptions are aligned and scored in a single graph based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.
127 Citations
27 Claims
1. A computer-readable medium including instructions readable by a computer which, when implemented perform steps comprising:
generating a speech-based phonetic description of a word without reference to the text of the word;
generating a text-based phonetic description of the word based on the text of the word;
aligning the speech-based phonetic description and the text-based phonetic description on a phone-by-phone basis to form a single graph; and
selecting a phonetic description from the single graph.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
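The phone-by-phone alignment step recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify an alignment algorithm, so the unit-cost edit-distance alignment below is an assumption chosen for illustration, as are the names `align_phones`, `to_graph`, the gap symbol `"-"`, and the example phone labels. Each aligned position becomes one slot of a simple graph that holds the alternatives offered by the two descriptions.

```python
GAP = "-"  # hypothetical symbol marking "no phone at this position"


def align_phones(text_phones, speech_phones):
    """Align two phone sequences with unit-cost edit distance (an assumed method)."""
    n, m = len(text_phones), len(speech_phones)
    # cost[i][j]: minimum edits to align the first i text phones
    # with the first j speech phones.
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if text_phones[i - 1] == speech_phones[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,  # match or substitute
                             cost[i - 1][j] + 1,        # text phone has no partner
                             cost[i][j - 1] + 1)        # speech phone has no partner
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if text_phones[i - 1] == speech_phones[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                pairs.append((text_phones[i - 1], speech_phones[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((text_phones[i - 1], GAP))
            i -= 1
        else:
            pairs.append((GAP, speech_phones[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def to_graph(pairs):
    """Collapse each aligned pair into one slot of distinct alternatives."""
    return [tuple(dict.fromkeys(pair)) for pair in pairs]


if __name__ == "__main__":
    pairs = align_phones(["k", "ae", "t"], ["k", "ah", "t", "s"])
    print(to_graph(pairs))
```

In this sketch a slot with two entries marks a position where the text-based and speech-based descriptions disagree, and a slot containing the gap symbol marks a phone present in only one of them.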
9. A computer-readable medium having computer-executable instructions for performing steps comprising:
receiving text of a word for which a phonetic pronunciation is to be added to a speech recognition lexicon;
receiving a representation of a speech signal produced by a person pronouncing the word;
converting the text of the word into at least one text-based phonetic sequence of phonetic units;
generating a speech-based phonetic sequence of phonetic units from the representation of the speech signal;
placing the phonetic units of the at least one text-based phonetic sequence and the speech-based phonetic sequence in a search structure that allows for transitions between phonetic units in the text-based phonetic sequence and phonetic units in the speech-based phonetic description; and
selecting a phonetic pronunciation from the search structure.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
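The search structure of claim 9, in which a path may cross between text-based and speech-based phonetic units, can be sketched as a small lattice searched with a Viterbi-style pass. Everything concrete here is an assumption for illustration: the slot representation (one dict per position mapping a phonetic unit to a log-score), the score values, the optional `transition_score` callback, the gap symbol `"-"`, and the name `best_path`. The claims in this section do not state how units are scored.

```python
GAP = "-"  # hypothetical symbol for "skip this position"


def best_path(slots, transition_score=None):
    """Return (score, units) for the highest-scoring path through the lattice.

    slots: list of dicts, one per aligned position, mapping each candidate
           phonetic unit (text-based or speech-based) to a log-score.
    transition_score: optional function (prev_unit, cur_unit) -> log-score
           added when moving between units; defaults to 0 for every move.
    """
    if not slots:
        return 0.0, []
    if transition_score is None:
        def transition_score(prev_unit, cur_unit):
            return 0.0
    # best[unit] = (score of the best partial path ending in unit, that path)
    best = {unit: (score, [unit]) for unit, score in slots[0].items()}
    for slot in slots[1:]:
        nxt = {}
        for cur, cur_score in slot.items():
            # Any unit in the previous slot may precede `cur`, so a path can
            # switch freely between the two source descriptions.
            prev_score, prev_path = max(
                ((s + transition_score(p[-1], cur), p) for s, p in best.values()),
                key=lambda item: item[0],
            )
            nxt[cur] = (prev_score + cur_score, prev_path + [cur])
        best = nxt
    score, path = max(best.values(), key=lambda item: item[0])
    # Gap entries are placeholders, not phonetic units, so drop them.
    return score, [unit for unit in path if unit != GAP]


if __name__ == "__main__":
    lattice = [{"k": -1.0}, {"ae": -3.0, "ah": -1.5}, {"t": -0.5},
               {GAP: -0.2, "s": -2.0}]
    print(best_path(lattice))
```

With all transition scores at zero the search reduces to picking the best unit in each slot; supplying a non-trivial `transition_score` makes the choice at one position depend on its neighbor, which is where the dynamic-programming pass matters.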
19. A method for adding an acoustic description of a word to a speech recognition lexicon, the method comprising:
generating a text-based phonetic description based on the text of a word;
generating a speech-based phonetic description without reference to the text of the word;
aligning the text-based phonetic description and the speech based phonetic description in a structure, the structure comprising paths representing phonetic units, at least one path for a phonetic unit from the text-based phonetic description being connected to a path for a phonetic unit from the speech-based phonetic description;
selecting a sequence of paths through the structure; and
generating the acoustic description of the word based on the selected sequence of paths.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification