System and method for learning alternate pronunciations for speech recognition
First Claim
1. A method for learning pronunciation in a given language comprising the steps of:
a. training an acoustic model on a large speech corpus to distinguish phonemes;
b. constructing a phoneme confusion matrix;
c. constructing a phoneme replacement candidate list for each phoneme in a set of speech data containing pronunciations for recognition;
d. learning alternative pronunciations of a word that has been mispronounced;
e. combining said learned alternative pronunciations with a linguistic dictionary to create a pooled dictionary; and
f. pruning said pooled dictionary to limit the number of learned alternative pronunciations in order to create an improved dictionary.
Abstract
A system and method for learning alternate pronunciations for speech recognition is disclosed. Alternative name pronunciations may be covered, through pronunciation learning, that have not been previously covered in a general pronunciation dictionary. In an embodiment, the detection of phone-level and syllable-level mispronunciations in words and sentences may be based on acoustic models trained by Hidden Markov Models. Mispronunciations may be detected by comparing the likelihood of the potential state of the targeting pronunciation unit with a pre-determined threshold through a series of tests. It is also within the scope of an embodiment to detect accents.
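The threshold comparison described in the abstract can be illustrated with a minimal sketch. It assumes each pronunciation unit (phone or syllable) already carries a log-likelihood score from an acoustic model, and that a score below the threshold marks a likely mispronunciation. The function name, the scores, and the threshold value are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

def flag_mispronounced_units(
    scored_units: List[Tuple[str, float]], threshold: float
) -> List[int]:
    """Return the positions of units whose score falls below the threshold.

    scored_units holds (unit label, log-likelihood) pairs in utterance order.
    Both the scores and the threshold are assumed inputs in this sketch.
    """
    return [
        index
        for index, (_unit, log_likelihood) in enumerate(scored_units)
        if log_likelihood < threshold
    ]

# Hypothetical scores for the phones of one word.
scored = [("k", -2.1), ("ae", -9.7), ("t", -1.8)]
flagged = flag_mispronounced_units(scored, threshold=-5.0)
print(flagged)
```

Only a single test is shown here; the abstract refers to a series of tests, which this sketch does not attempt to reproduce.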
37 Claims
1. A method for learning pronunciation in a given language comprising the steps of:
a. training an acoustic model on a large speech corpus to distinguish phonemes;
b. constructing a phoneme confusion matrix;
c. constructing a phoneme replacement candidate list for each phoneme in a set of speech data containing pronunciations for recognition;
d. learning alternative pronunciations of a word that has been mispronounced;
e. combining said learned alternative pronunciations with a linguistic dictionary to create a pooled dictionary; and
f. pruning said pooled dictionary to limit the number of learned alternative pronunciations in order to create an improved dictionary.
Dependent claims: 2-30.
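Steps b, c, e, and f of claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the confusion matrix is built from hypothetical (reference phoneme, recognized phoneme) pairs, candidates are ranked by raw confusion count, and pruning keeps the highest-scoring learned pronunciations per word. The claim does not specify a ranking or pruning criterion, so both are assumptions. Steps a and d (acoustic model training and the learning itself) are out of scope; their outputs are supplied as inputs.

```python
from collections import Counter, defaultdict

def build_confusion_matrix(aligned_pairs):
    """Step b: count how often each reference phoneme was recognized as each phoneme."""
    matrix = defaultdict(Counter)
    for reference, recognized in aligned_pairs:
        matrix[reference][recognized] += 1
    return matrix

def replacement_candidates(matrix, top_n=2):
    """Step c: for each phoneme, list the phonemes it is most often confused with."""
    candidates = {}
    for reference, row in matrix.items():
        confusions = [(p, n) for p, n in row.items() if p != reference]
        # Most frequent first; ties broken alphabetically for determinism.
        confusions.sort(key=lambda item: (-item[1], item[0]))
        candidates[reference] = [p for p, _ in confusions[:top_n]]
    return candidates

def pool_and_prune(linguistic, learned, max_learned=1):
    """Steps e and f: merge learned pronunciations into the linguistic dictionary,
    keeping at most max_learned learned entries per word (assumed score-based)."""
    improved = {}
    for word in sorted(set(linguistic) | set(learned)):
        base = list(linguistic.get(word, []))
        ranked = sorted(learned.get(word, []), key=lambda item: -item[1])
        kept = []
        for pronunciation, _score in ranked:
            if pronunciation not in base and pronunciation not in kept:
                kept.append(pronunciation)
        improved[word] = base + kept[:max_learned]
    return improved

# Hypothetical alignment data and dictionaries.
pairs = [("ae", "ae"), ("ae", "eh"), ("ae", "eh"), ("ae", "ah"), ("t", "t"), ("t", "d")]
candidates = replacement_candidates(build_confusion_matrix(pairs))
linguistic = {"cat": [("k", "ae", "t")]}
learned = {"cat": [(("k", "eh", "t"), 0.9), (("k", "ah", "t"), 0.4)]}
improved = pool_and_prune(linguistic, learned, max_learned=1)
```

Here `candidates["ae"]` lists "eh" before "ah" because "eh" was the more frequent confusion, and the improved dictionary keeps only the higher-scoring learned variant alongside the original entry.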
31. A method for learning alternative pronunciations for speech in a given language comprising the steps of:
a. selecting a word instance for learning alternative pronunciations;
b. performing a first test on the word instance to determine a baseline recognition result;
c. performing hierarchical pronunciation learning on the word instance and selecting a pronunciation that is similar to the word instance; and
d. performing an other test to assess if the selected pronunciation is recognized as the word instance wherein if the word is recognized, adding the selected pronunciation to a dictionary, otherwise, discarding the selected pronunciation.
Dependent claims: 32-35.
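The test, learn, and verify loop of claim 31 can be sketched as below. The recognizer and the selection function are hypothetical callables supplied by the caller. In particular, `select` is a simple stand-in for the hierarchical pronunciation learning of step c, which this section does not describe in detail. The toy recognizer treats an audio instance as its phoneme sequence, which is an assumption made only so the example runs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pronunciation = Tuple[str, ...]
Lexicon = Dict[str, List[Pronunciation]]

def learn_for_instance(
    word: str,
    audio: Pronunciation,
    candidates: Sequence[Pronunciation],
    lexicon: Lexicon,
    recognize: Callable[[Pronunciation, Lexicon], str],
    select: Callable[[Pronunciation, Sequence[Pronunciation]], Pronunciation],
) -> Tuple[str, bool]:
    """Return (baseline result, whether a pronunciation was added)."""
    baseline = recognize(audio, lexicon)           # step b: first test
    selected = select(audio, candidates)           # step c: stand-in for learning
    trial = {w: list(p) for w, p in lexicon.items()}
    trial.setdefault(word, []).append(selected)
    if recognize(audio, trial) == word:            # step d: second test
        lexicon.setdefault(word, []).append(selected)
        return baseline, True
    return baseline, False                         # otherwise discard

def toy_recognize(audio: Pronunciation, lexicon: Lexicon) -> str:
    """Hypothetical recognizer: exact match of the phoneme sequence."""
    for entry, pronunciations in lexicon.items():
        if audio in pronunciations:
            return entry
    return "<unk>"

def toy_select(audio: Pronunciation, candidates: Sequence[Pronunciation]) -> Pronunciation:
    """Hypothetical selector: candidate with the fewest differing phonemes."""
    def distance(p: Pronunciation) -> int:
        return sum(a != b for a, b in zip(audio, p)) + abs(len(audio) - len(p))
    return min(candidates, key=distance)

lexicon = {"cat": [("k", "ae", "t")]}
baseline, added = learn_for_instance(
    "cat", ("k", "eh", "t"), [("k", "ah", "t"), ("k", "eh", "t")],
    lexicon, toy_recognize, toy_select,
)
```

The trial lexicon is a copy, so a rejected pronunciation never reaches the real dictionary.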
36. A system for language learning of mispronunciation detection comprising:
a. a lexicon builder which is capable of integrating one or more of: pronunciation dictionaries, spelling-to-pronunciation interpretations, and text normalizations, to create a list of acceptable phoneme sequences;
b. a speech corpus;
c. an acoustic model;
d. a word lexicon;
e. a word grammar;
f. a grammar-based recognizer which provides a hypothesized name based on the speech corpus, acoustic model, word lexicon, and the word grammar to a means for scoring; and
g. a means for scoring which indicates accuracy of the hypothesized name.
Dependent claims: 37.
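One simple reading of the scoring means in element g is the fraction of hypothesized names that match a reference transcript. The sketch below assumes exact string matching and hypothetical name lists; the claim itself does not fix a particular accuracy measure.

```python
from typing import Sequence

def name_accuracy(hypothesized: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of hypothesized names that exactly match the reference names.

    Exact matching is an assumption of this sketch, not something the claim states.
    """
    if len(hypothesized) != len(reference):
        raise ValueError("hypothesized and reference must have the same length")
    if not reference:
        return 0.0
    correct = sum(h == r for h, r in zip(hypothesized, reference))
    return correct / len(reference)

# Hypothetical recognizer output against reference names.
accuracy = name_accuracy(["ann", "bob", "cara"], ["ann", "rob", "cara"])
```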
Specification