Speech recognition using log-linear model
First Claim
1. A method performed by one or more computers, the method comprising:
obtaining n-gram parameter values derived from an n-gram language model, the n-gram parameter values including n-gram parameter values for n-grams that include multiple words;
determining n-gram features for a log-linear language model based on the n-grams corresponding to the obtained n-gram parameter values;
determining a weight for each of the determined n-gram features, wherein for at least some of the n-gram features, the weight is determined based on (i) an n-gram parameter value that is derived from the n-gram language model and that corresponds to a particular n-gram of multiple words, and (ii) an n-gram parameter value that is derived from the n-gram language model and that corresponds to an n-gram that is a sub-sequence within the particular n-gram;
generating a log-linear language model having the determined n-gram features, the determined n-gram features in the log-linear language model having weights that are initialized based on the determined weights;
after generating the log-linear language model, training the log-linear language model to adjust the initialized weights;
after training the log-linear language model, using the log-linear language model to determine a transcription for an utterance; and
providing the transcription for the utterance.
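The weight-determination step in the claim can be illustrated with a minimal sketch. The claim only says each weight is "based on" (i) the parameter value of a multi-word n-gram and (ii) the parameter value of a sub-sequence within it; it gives no formula. The sketch below assumes one possible reading: the parameter values are log-probabilities, the sub-sequence is the n-gram with its first word dropped, and the weight is the difference between the two values. The names `NGRAM_LOGPROB`, `init_weights`, and `score`, and all numbers, are hypothetical.

```python
import math

# Toy log-probabilities standing in for "n-gram parameter values"
# (hypothetical numbers, not taken from any real language model).
NGRAM_LOGPROB = {
    ("york",): math.log(0.01),
    ("new", "york"): math.log(0.40),
    ("in", "new", "york"): math.log(0.60),
}


def init_weights(ngram_logprob):
    """Assumed rule: weight = value(n-gram) - value(n-gram without its first word)."""
    weights = {}
    for ngram, value in ngram_logprob.items():
        suffix = ngram[1:]  # a sub-sequence within the n-gram
        if suffix and suffix in ngram_logprob:
            weights[ngram] = value - ngram_logprob[suffix]
        else:
            # Unigrams, or n-grams whose suffix has no stored value,
            # keep their own parameter value as the weight.
            weights[ngram] = value
    return weights


def score(weights, context, word):
    """Unnormalized score: sum the weights of all n-gram features that
    end in `word` and match the tail of `context`."""
    seq = tuple(context) + (word,)
    total = 0.0
    for start in range(len(seq)):
        total += weights.get(seq[start:], 0.0)
    return total


WEIGHTS = init_weights(NGRAM_LOGPROB)
```

Under these assumptions the active feature weights telescope, so `score(WEIGHTS, ("in", "new"), "york")` equals the trigram's original log-probability before any training. The later training step recited in the claim would then adjust these initial values; normalization, training, and decoding are not shown here.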
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to generating log-linear models. In some implementations, n-gram parameter values derived from an n-gram language model are obtained. N-gram features for a log-linear language model are determined based on the n-grams corresponding to the obtained n-gram parameter values. A weight for each of the determined n-gram features is determined, where the weight is determined based on (i) an n-gram parameter value that is derived from the n-gram language model and that corresponds to a particular n-gram, and (ii) an n-gram parameter value that is derived from the n-gram language model and that corresponds to an n-gram that is a sub-sequence within the particular n-gram. A log-linear language model having the determined n-gram features is generated, where the determined n-gram features in the log-linear language model have weights that are initialized based on the determined weights.
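For orientation, a log-linear language model is conventionally written in the first form below, with feature functions f_i and weights λ_i. The second and third lines are not stated in the abstract. They show one assumed way a weight could be "based on" an n-gram's parameter value θ and that of a sub-sequence, taking θ to be a log-probability and the sub-sequence to be the n-gram without its first word, with a value available for every suffix.

```latex
\[
P(w \mid h) = \frac{\exp\left(\sum_i \lambda_i f_i(h, w)\right)}
                   {\sum_{w'} \exp\left(\sum_i \lambda_i f_i(h, w')\right)}
\]
\[
\lambda_{w_1 \dots w_n} = \theta(w_1 \dots w_n) - \theta(w_2 \dots w_n),
\qquad \lambda_{w_n} = \theta(w_n)
\]
\[
\sum_{k=1}^{n} \lambda_{w_k \dots w_n} = \theta(w_1 \dots w_n)
\]
```

With this choice the weights of all features active for an n-gram sum to that n-gram's original parameter value, so the initialized model starts from the n-gram model's scores.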
20 Claims
1. A method performed by one or more computers, the method comprising:
obtaining n-gram parameter values derived from an n-gram language model, the n-gram parameter values including n-gram parameter values for n-grams that include multiple words;
determining n-gram features for a log-linear language model based on the n-grams corresponding to the obtained n-gram parameter values;
determining a weight for each of the determined n-gram features, wherein for at least some of the n-gram features, the weight is determined based on (i) an n-gram parameter value that is derived from the n-gram language model and that corresponds to a particular n-gram of multiple words, and (ii) an n-gram parameter value that is derived from the n-gram language model and that corresponds to an n-gram that is a sub-sequence within the particular n-gram;
generating a log-linear language model having the determined n-gram features, the determined n-gram features in the log-linear language model having weights that are initialized based on the determined weights;
after generating the log-linear language model, training the log-linear language model to adjust the initialized weights;
after training the log-linear language model, using the log-linear language model to determine a transcription for an utterance; and
providing the transcription for the utterance.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 19, 20
12. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
obtaining n-gram parameter values derived from an n-gram language model, the n-gram parameter values including n-gram parameter values for n-grams that include multiple words;
determining n-gram features for a log-linear language model based on the n-grams corresponding to the obtained n-gram parameter values;
determining a weight for each of the determined n-gram features, wherein for at least some of the n-gram features, the weight is determined based on (i) an n-gram parameter value that is derived from the n-gram language model and that corresponds to a particular n-gram of multiple words, and (ii) an n-gram parameter value that is derived from the n-gram language model and that corresponds to an n-gram that is a sub-sequence within the particular n-gram;
generating a log-linear language model having the determined n-gram features, the determined n-gram features in the log-linear language model having weights that are initialized based on the determined weights;
after generating the log-linear language model, training the log-linear language model to adjust the initialized weights;
after training the log-linear language model, using the log-linear language model to determine a transcription for an utterance; and
providing the transcription for the utterance.
Dependent claims: 13, 14, 15
16. A non-transitory computer-readable medium storing a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
obtaining n-gram parameter values derived from an n-gram language model, the n-gram parameter values including n-gram parameter values for n-grams that include multiple words;
determining n-gram features for a log-linear language model based on the n-grams corresponding to the obtained n-gram parameter values;
determining a weight for each of the determined n-gram features, wherein for at least some of the n-gram features, the weight is determined based on (i) an n-gram parameter value that is derived from the n-gram language model and that corresponds to a particular n-gram of multiple words, and (ii) an n-gram parameter value that is derived from the n-gram language model and that corresponds to an n-gram that is a sub-sequence within the particular n-gram;
generating a log-linear language model having the determined n-gram features, the determined n-gram features in the log-linear language model having weights that are initialized based on the determined weights;
using the log-linear language model to determine a transcription for an utterance; and
providing the transcription for the utterance.
Dependent claims: 17, 18
Specification