Method and system for learning linguistically valid word pronunciations from acoustic data
Abstract
A computerized pronunciation system is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The system includes a word list including at least one word; transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform; a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including: sets of initial pronunciations of the word, a scoring module configured to score pronunciations and to generate phone probabilities, and a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and a pronunciation dictionary configured to receive the highest-scoring set of initial pronunciations and the set of alternate pronunciations.
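The substitution step summarized in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `toy_score`, `toy_phone_probs`, `learn`, and `TOY_REFERENCE`, the phone inventory, and the example word are all hypothetical, and the toy scorer compares against a known answer where a real scoring module would score against waveforms and transcribed text.

```python
from typing import Callable, List, Sequence, Tuple

Pron = Tuple[str, ...]  # a pronunciation spelled as a sequence of phones

# Hypothetical stand-in for the evidence in the acoustic data. A real scoring
# module would score against waveforms, not against a known answer.
TOY_REFERENCE: Pron = ("T", "AH", "M", "AA", "T", "OW")


def toy_score(pron: Pron) -> float:
    """Toy pronunciation score: count of phones that agree with the reference."""
    return float(sum(a == b for a, b in zip(pron, TOY_REFERENCE)))


def toy_phone_probs(pron: Pron) -> List[float]:
    """Toy per-phone probabilities: high where a phone agrees, low otherwise."""
    return [0.9 if a == b else 0.1 for a, b in zip(pron, TOY_REFERENCE)]


def learn(
    initials: Sequence[Pron],
    inventory: Sequence[str],
    score: Callable[[Pron], float],
    phone_probs: Callable[[Pron], List[float]],
) -> Tuple[Pron, Pron]:
    """Return (best initial pronunciation, best single-phone alternate)."""
    # 1. Pick the highest-scoring initial pronunciation.
    best = max(initials, key=score)
    # 2. Locate its lowest-probability phone.
    probs = phone_probs(best)
    weakest = min(range(len(best)), key=lambda i: probs[i])
    # 3. Try every other phone in that position and keep the top scorer.
    candidates = [
        best[:weakest] + (phone,) + best[weakest + 1:]
        for phone in inventory
        if phone != best[weakest]
    ]
    alternate = max(candidates, key=score)
    return best, alternate


initial_prons = [
    ("T", "AH", "M", "EY", "T", "OW"),
    ("T", "OW", "M", "EY", "T", "OW"),
]
best, alternate = learn(
    initial_prons, ["AA", "AE", "EY", "AH", "OW"], toy_score, toy_phone_probs
)
# Both pronunciations would then be handed to the pronunciation dictionary.
pronunciation_dictionary = {"tomato": [best, alternate]}
```

In this toy run the weakest phone of the best initial pronunciation is the fourth one, so the alternate differs from it only in that position.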
31 Claims
1. A computerized pronunciation system configured to generate pronunciations for words that are represented by waveforms and text, such that the pronunciations are spelled by phones in a phonetic alphabet for storage in a pronunciation dictionary, the system comprising:
- a word list including at least one word;
- transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform;
- a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including:
  - sets of initial pronunciations of the word,
  - a scoring module configured to score pronunciations and to generate phone probabilities, and
  - a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and
- a pronunciation dictionary configured to receive the highest-scoring set of initial pronunciations and the set of alternate pronunciations.
Dependent claims: 2-16.
17. A computerized pronunciation system configured to generate pronunciations for words that are represented by waveforms and text, such that the pronunciations are spelled by phones in a phonetic alphabet for storage in a pronunciation dictionary, the system comprising:
- a word list including at least one word;
- transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform;
- a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including:
  - sets of initial pronunciations of the word,
  - an automatic speech recognition (ASR) system configured to score pronunciations,
  - a scoring module configured to generate phone probabilities, and
  - a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and
- a pronunciation dictionary configured to receive the highest-scoring initial pronunciation and a highest-scoring set of alternate pronunciations.
Dependent claims: 18-25.
26. A computerized pronunciation system configured to generate pronunciations for words that are represented by waveforms and text, such that the pronunciations are spelled by phones in a phonetic alphabet for storage in a pronunciation dictionary, the system comprising:
- a word list including a plurality of words;
- transcribed acoustic data including a set of waveforms for each of the words and a set of transcribed text corresponding to the waveforms;
- a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including:
  - sets of initial pronunciations of the plurality of words,
  - sets of alternate pronunciations of the plurality of words, wherein each set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a unique substitute phone substituted for a lowest-probability phone of the highest-scoring set of initial pronunciations;
  - a scoring module configured to score the sets of initial and alternate pronunciations and to generate phone probabilities; and
- a pronunciation dictionary configured to receive the highest-scoring initial pronunciation and a highest-scoring set of alternate pronunciations.
Dependent claims: 27-31.
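The multi-word arrangement of claim 26 can be sketched in Python under stated assumptions. This is illustrative only, not the claimed system: `REFERENCE`, `INITIALS`, `INVENTORY`, `score`, `phone_probs`, and `build_dictionary` are hypothetical names, the toy scorer stands in for scoring against each word's set of waveforms, and reading "a highest-scoring set of alternate pronunciations" as the top `n_keep` alternates is an assumed interpretation.

```python
from typing import Dict, List, Sequence, Tuple

Pron = Tuple[str, ...]  # a pronunciation spelled as a sequence of phones

# Hypothetical toy data. REFERENCE stands in for the evidence a real scoring
# module would extract from each word's waveforms and transcribed text.
REFERENCE: Dict[str, Pron] = {
    "data": ("D", "EY", "T", "AH"),
    "route": ("R", "UW", "T"),
}
INITIALS: Dict[str, List[Pron]] = {
    "data": [("D", "AE", "T", "AH"), ("D", "AE", "D", "AH")],
    "route": [("R", "AW", "T"), ("R", "AW", "D")],
}
INVENTORY: List[str] = ["AE", "EY", "AW", "UW"]


def score(word: str, pron: Pron) -> int:
    """Toy score: number of phones that agree with the word's reference."""
    return sum(a == b for a, b in zip(pron, REFERENCE[word]))


def phone_probs(word: str, pron: Pron) -> List[float]:
    """Toy per-phone probabilities: high on agreement, low otherwise."""
    return [0.9 if a == b else 0.1 for a, b in zip(pron, REFERENCE[word])]


def build_dictionary(words: Sequence[str], n_keep: int = 2) -> Dict[str, List[Pron]]:
    """For each word, store the best initial pronunciation and top alternates."""
    dictionary: Dict[str, List[Pron]] = {}
    for word in words:
        best = max(INITIALS[word], key=lambda p: score(word, p))
        probs = phone_probs(word, best)
        weakest = probs.index(min(probs))  # position of lowest-probability phone
        # One alternate per unique substitute phone at the weakest position.
        alternates = [
            best[:weakest] + (phone,) + best[weakest + 1:]
            for phone in INVENTORY
            if phone != best[weakest]
        ]
        # Score the alternates and keep the highest-scoring ones.
        alternates.sort(key=lambda p: score(word, p), reverse=True)
        dictionary[word] = [best] + alternates[:n_keep]
    return dictionary


DICTIONARY = build_dictionary(["data", "route"])
```

Each dictionary entry lists the best initial pronunciation first, followed by the retained alternates in descending score order.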
Specification