Method and apparatus for constructing and using syllable-like unit language models
Abstract
A method and computer-readable medium use syllable-like units (SLUs) to decode a pronunciation into a phonetic description. The syllable-like units are generally larger than a single phoneme but smaller than a word. The present invention provides a means for defining these syllable-like units and for generating a language model based on these syllable-like units that can be used in the decoding process. Because SLUs are longer than phonemes, they contain more acoustic contextual clues and better lexical constraints for speech recognition. Thus, the phoneme accuracy produced by SLU recognition is much better than that of all-phone sequence recognition.
9 Claims
1. A speech recognition system having a language model generated through a process comprising:
breaking each word in a dictionary into syllable-like units;
for each word, grouping the syllable-like units of the word into n-grams;
counting the total number of n-gram occurrences in the dictionary; and
for each n-gram, counting the number of occurrences of the n-gram in the dictionary and dividing this count by the total number of n-gram occurrences to form a language model probability for the n-gram.
Dependent claims: 2, 3, 4, 5
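The counting process recited in claim 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the claim does not say how words are broken into syllable-like units, so the sketch takes a dictionary that is already segmented, and the function name `slu_ngram_probabilities`, the SLU labels, and the choice of n = 2 without start or end padding are all hypothetical.

```python
from collections import Counter


def slu_ngram_probabilities(dictionary, n=2):
    """Relative-frequency n-gram probabilities over syllable-like units.

    `dictionary` maps each word to its list of SLUs. The segmentation is
    assumed to be given; the claim does not specify how it is produced.
    """
    counts = Counter()
    for slus in dictionary.values():
        # Group the SLUs of each word into n-grams (no padding: an assumption).
        for i in range(len(slus) - n + 1):
            counts[tuple(slus[i:i + n])] += 1

    # Total number of n-gram occurrences in the dictionary.
    total = sum(counts.values())
    if total == 0:
        return {}

    # Probability of each n-gram = its count / total n-gram occurrences.
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical three-word dictionary with made-up SLU labels.
toy_dictionary = {
    "banana": ["b ah", "n ae", "n ah"],
    "nana": ["n ae", "n ah"],
    "bandana": ["b ae n", "d ae", "n ah"],
}
probs = slu_ngram_probabilities(toy_dictionary, n=2)
for gram, p in sorted(probs.items()):
    print(gram, round(p, 3))
```

Note that the probability here is a plain relative frequency over the dictionary, as the claim describes, rather than a conditional probability of a unit given its history.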
6. A method of forming a language model based on syllable-like units, the method comprising:
for each word in a dictionary of words, dividing the word into syllable-like units to produce a set of syllable-like units;
applying a constraint to the syllable-like units to identify at least one syllable-like unit that should not form part of any word in the dictionary;
for each word in the dictionary of words, dividing the word into syllable-like units to form a second set of syllable-like units that does not include the identified syllable-like unit; and
forming the language model based on the second set of syllable-like units.
Dependent claims: 7, 8, 9
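The two-pass structure of claim 6 (divide, identify disallowed units, divide again, build the model) might look roughly like the following. Everything concrete here is an assumption made for illustration: the claim text does not specify the division rule, the constraint, or the form of the language model, so `build_slu_model`, `toy_divide`, the "no single-letter unit" constraint, and the unigram model are hypothetical stand-ins operating on spelled words rather than phoneme strings.

```python
from collections import Counter


def build_slu_model(words, divide, violates):
    """Two-pass SLU model. `divide(word, banned)` and `violates(unit, counts)`
    are caller-supplied; both are placeholders for unspecified steps."""
    # Pass 1: divide every word to produce the first set of units.
    first_counts = Counter(u for w in words for u in divide(w, frozenset()))

    # Apply the constraint to identify units that should not be used.
    banned = frozenset(u for u in first_counts if violates(u, first_counts))

    # Pass 2: divide again so that no word contains a banned unit.
    second = {w: divide(w, banned) for w in words}
    for w, units in second.items():
        if banned.intersection(units):
            raise ValueError(f"could not avoid banned units in {w!r}")

    # Form a model from the second set (unigram frequencies: an assumption).
    counts = Counter(u for units in second.values() for u in units)
    total = sum(counts.values())
    return banned, {u: c / total for u, c in counts.items()}


def toy_divide(word, banned):
    """Toy rule: two-letter chunks; a banned chunk is merged into a neighbor."""
    units, carry = [], ""
    for i in range(0, len(word), 2):
        chunk = carry + word[i:i + 2]
        carry = ""
        if chunk in banned:
            if units:
                units[-1] += chunk   # attach to the previous unit
            else:
                carry = chunk        # no previous unit: prepend to the next
        else:
            units.append(chunk)
    if carry:
        units.append(carry)          # nothing to merge with; caught by the check
    return units


# Hypothetical constraint: a unit of fewer than two letters is not allowed.
banned, model = build_slu_model(
    ["banana", "bandana", "nab"],
    divide=toy_divide,
    violates=lambda unit, counts: len(unit) < 2,
)
print(sorted(banned))
print({u: round(p, 3) for u, p in sorted(model.items())})
```

Passing the division rule and the constraint in as functions keeps the sketch neutral about details the claim leaves open; only the order of the four steps is taken from the claim itself.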
Specification