Method and apparatus for generating models of spoken words based on a small number of utterances
First Claim
1. A method of modeling a word uttered at least two times, each utterance having at least one acoustic feature having a value, said method comprising the steps of:
measuring the value of the acoustic feature of each utterance;
storing a selection set of one or more probabilistic word model signals, each probabilistic word model signal in the selection set representing a probabilistic model of the word;
calculating, for the selection set, a match score representing the closeness of a match between the probabilistic word models in the selection set and the value of the acoustic feature of each utterance;
storing a candidate set of one or more probabilistic word model signals, each probabilistic word model signal in the candidate set representing a probabilistic model of the word, each probabilistic word model in the candidate set being different from each probabilistic word model in the selection set;
storing an expansion set comprising the probabilistic word model signals in the selection set and one probabilistic word model signal from the candidate set;
calculating, for the expansion set, a match score representing the closeness of a match between the probabilistic word models in the expansion set and the value of the acoustic feature of each utterance; and
modeling the word with the word models in the expansion set if the expansion set match score surpasses the selection set match score by a nonzero threshold value.
Abstract
A method and apparatus for modeling words based on match scores representing (a) the closeness of a match between probabilistic word models and the acoustic features of at least two utterances, and (b) the closeness of a match between word models and the spelling of the word. A match score is calculated for a selection set of one or more probabilistic word models. A match score is also calculated for an expansion set comprising the probabilistic word models in the selection set and one probabilistic word model from a candidate set. If the expansion set match score improves on the selection set match score by a selected nonzero threshold value, the word is modeled with the word models in the expansion set. If the expansion set match score does not improve on the selection set match score by the selected nonzero threshold value, the word is modeled with the word models in the selection set.
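The selection-versus-expansion decision described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the function name `choose_model_set`, the string model identifiers, and the caller-supplied `set_score` function are all hypothetical. It also assumes that a higher match score means a closer match, and it reads "improves by a threshold" as a gain of at least the threshold.

```python
from typing import Callable, FrozenSet

# Hypothetical representation: a word model is identified by a string label.
ModelSet = FrozenSet[str]


def choose_model_set(
    selection: ModelSet,
    candidate: str,
    set_score: Callable[[ModelSet], float],
    threshold: float,
) -> ModelSet:
    """Return the expansion set if it improves on the selection set enough.

    Assumptions (not stated in the source): higher scores are better, and
    "improves by the threshold" means gain >= threshold.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive (the source requires it to be nonzero)")
    if candidate in selection:
        raise ValueError("candidate model must differ from every model in the selection set")

    # Expansion set = selection set plus exactly one candidate model.
    expansion = selection | {candidate}
    gain = set_score(expansion) - set_score(selection)
    return expansion if gain >= threshold else selection


# Toy scores standing in for real acoustic match scores (made-up numbers).
toy_scores = {
    frozenset({"m1"}): -12.0,
    frozenset({"m1", "m2"}): -9.5,
}
chosen = choose_model_set(frozenset({"m1"}), "m2", toy_scores.__getitem__, threshold=1.0)
print(sorted(chosen))
```

With the toy numbers above the gain is 2.5, so a threshold of 1.0 accepts the expansion set, while a threshold of 5.0 would keep the selection set. Keeping the scoring function as a parameter leaves open how a set of models is scored against the utterances.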
31 Citations
16 Claims
1. A method of modeling a word uttered at least two times, each utterance having at least one acoustic feature having a value, said method comprising the steps of:
measuring the value of the acoustic feature of each utterance;
storing a selection set of one or more probabilistic word model signals, each probabilistic word model signal in the selection set representing a probabilistic model of the word;
calculating, for the selection set, a match score representing the closeness of a match between the probabilistic word models in the selection set and the value of the acoustic feature of each utterance;
storing a candidate set of one or more probabilistic word model signals, each probabilistic word model signal in the candidate set representing a probabilistic model of the word, each probabilistic word model in the candidate set being different from each probabilistic word model in the selection set;
storing an expansion set comprising the probabilistic word model signals in the selection set and one probabilistic word model signal from the candidate set;
calculating, for the expansion set, a match score representing the closeness of a match between the probabilistic word models in the expansion set and the value of the acoustic feature of each utterance; and
modeling the word with the word models in the expansion set if the expansion set match score surpasses the selection set match score by a nonzero threshold value.
Dependent claims: 2, 3, 4, 5, 6
7. A method of modeling words, said method comprising the steps of:
measuring the value of at least one feature of a first utterance of a word during each of a series of successive time intervals to produce a first series of feature vector signals representing the feature values of the first utterance;
measuring the value of at least one feature of a second utterance of the same word during each of a series of successive time intervals to produce a second series of feature vector signals representing the feature values of the second utterance;
storing two or more probabilistic word model signals, each probabilistic word model signal representing a probabilistic model of the word;
calculating, for each probabilistic word model and for each utterance, a match score representing the closeness of a match between the probabilistic word model and the series of feature vector signals produced by the utterance;
calculating, for each probabilistic word model, an average-model match score representing the average match score for the word model and all utterances;
selecting a first probabilistic word model having the best average-model match score;
selecting a second probabilistic word model;
identifying, for each utterance, a best-of-set match score representing the best match score between the utterance and the first and second probabilistic word models;
calculating a set-average match score representing the average best-of-set match score for the first and second probabilistic word models and all utterances; and
modeling the word with both the first and second probabilistic word models if the set-average match score surpasses the best average-model match score by a nonzero threshold value.
Dependent claims: 8, 9
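The scoring steps recited in claim 7 (average-model score, best-of-set score, set-average score, threshold test) can be illustrated with a small Python sketch. It is a hedged illustration under stated assumptions, not the claimed implementation: the function name `select_models` and the score table layout are hypothetical, higher scores are assumed to be better, and "surpasses by a threshold" is read as a gain of at least the threshold. The claim says only "selecting a second probabilistic word model"; trying every remaining model and keeping the one with the best set-average is an assumption made here.

```python
from typing import Dict, List


def select_models(scores: Dict[str, List[float]], threshold: float) -> List[str]:
    """scores[model][u] is the match score of `model` against utterance u.

    Assumption: higher is better (for example, a log-likelihood).
    Returns one model, or two if the pair improves enough on the best single model.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive (the claim requires it to be nonzero)")

    # Average-model match score: mean score of each model over all utterances.
    averages = {m: sum(s) / len(s) for m, s in scores.items()}

    # First model: the one with the best average-model match score.
    first = max(averages, key=averages.get)
    best_average = averages[first]

    # Second model (assumption): the remaining model giving the best set-average.
    second, best_set_average = None, None
    for model, model_scores in scores.items():
        if model == first:
            continue
        # Best-of-set score: for each utterance, the better of the two models.
        best_of_set = [max(a, b) for a, b in zip(scores[first], model_scores)]
        set_average = sum(best_of_set) / len(best_of_set)
        if best_set_average is None or set_average > best_set_average:
            second, best_set_average = model, set_average

    # Keep both models only if the pair beats the single best model by the threshold.
    if second is not None and best_set_average - best_average >= threshold:
        return [first, second]
    return [first]


# Made-up scores for three hypothetical models and two utterances.
example = {"A": [-10.0, -20.0], "B": [-22.0, -11.0], "C": [-16.0, -16.0]}
print(select_models(example, threshold=2.0))
```

In this toy table model A has the best average (-15.0). Pairing it with B gives best-of-set scores of -10.0 and -11.0, a set-average of -10.5 and a gain of 4.5, so a threshold of 2.0 keeps both A and B, while a threshold of 5.0 would keep A alone. The example shows why a second model can help: A fits the first utterance well and B fits the second.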
10. An apparatus for modeling words, said apparatus comprising:
means for measuring the value of at least one acoustic feature of each of at least two utterances of a word;
means for storing a selection set of one or more probabilistic word model signals, each probabilistic word model signal in the selection set representing a probabilistic model of the word;
means for calculating, for the selection set, a match score representing the closeness of a match between the probabilistic word models in the selection set and the value of the acoustic feature of each utterance;
means for storing a candidate set of one or more probabilistic word model signals, each probabilistic word model signal in the candidate set representing a probabilistic model of the word, each probabilistic word model in the candidate set being different from each probabilistic word model in the selection set;
means for storing an expansion set comprising the probabilistic word model signals in the selection set and one probabilistic word model signal from the candidate set;
means for calculating, for the expansion set, a match score representing the closeness of a match between the probabilistic word models in the expansion set and the value of the acoustic feature of each utterance; and
means for modeling the word with the word models in the expansion set if the expansion set match score surpasses the selection set match score by a nonzero threshold value.
Dependent claims: 11, 12, 13, 14, 15, 16
Specification