Enhanced maximum entropy models
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to enhanced maximum entropy models. In some implementations, data indicating a candidate transcription for an utterance and a particular context for the utterance are received. A maximum entropy language model is obtained. Feature values are determined for n-gram features and backoff features of the maximum entropy language model. The feature values are input to the maximum entropy language model, and an output is received from the maximum entropy language model. A transcription for the utterance is selected from among a plurality of candidate transcriptions based on the output from the maximum entropy language model. The selected transcription is provided to a client device.
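The flow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the weights, the vocabulary, the choice of binary feature values, the use of one backoff feature per n-gram order, and the treatment of the "particular context" as the preceding words are all hypothetical choices made for the example, as are the function names.

```python
import math

# Hypothetical toy model: every weight below is invented for illustration.
NGRAM_WEIGHTS = {
    ("the",): 0.5,
    ("cat",): 0.2,
    ("hat",): 0.1,
    ("sat",): 0.3,
    ("the", "cat"): 1.2,
    ("cat", "sat"): 1.0,
}
# Assumption: one backoff feature per n-gram order. It fires for any n-gram
# of that order that has no feature of its own in NGRAM_WEIGHTS.
BACKOFF_WEIGHTS = {1: -2.0, 2: -0.7}
VOCAB = ["the", "cat", "hat", "sat"]
MAX_ORDER = 2


def feature_values(history, word):
    """Return {feature: value} for predicting `word` after `history`."""
    values = {}
    for order in range(1, MAX_ORDER + 1):
        if order - 1 > len(history):
            break  # not enough preceding words for this order
        ngram = tuple(history[len(history) - (order - 1):]) + (word,)
        if ngram in NGRAM_WEIGHTS:
            values[("ngram", ngram)] = 1.0
        else:
            values[("backoff", order)] = 1.0
    return values


def raw_score(history, word):
    """Weighted sum of the active feature values (unnormalized)."""
    total = 0.0
    for (kind, key), value in feature_values(history, word).items():
        weights = NGRAM_WEIGHTS if kind == "ngram" else BACKOFF_WEIGHTS
        total += weights[key] * value
    return total


def log_prob(history, word):
    """Log-probability of `word`, normalized over the toy vocabulary."""
    log_z = math.log(sum(math.exp(raw_score(history, w)) for w in VOCAB))
    return raw_score(history, word) - log_z


def candidate_log_likelihood(context, candidate):
    """Sum of per-word log-probabilities for one candidate transcription."""
    history = list(context)
    total = 0.0
    for word in candidate:
        total += log_prob(history, word)
        history.append(word)
    return total


def select_transcription(context, candidates):
    """Pick the candidate with the highest model output."""
    return max(candidates, key=lambda c: candidate_log_likelihood(context, c))
```

For example, `select_transcription(["the"], [["cat", "sat"], ["hat", "sat"]])` compares the two candidates. The bigrams "the hat" and "hat sat" have no features of their own in this toy model, so the order-2 backoff feature fires for them instead. The sketch assumes every candidate word is in `VOCAB`; a real system would need to handle out-of-vocabulary words.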
20 Claims
1. A method performed by one or more computers, the method comprising:
receiving, by the one or more computers, data indicating a candidate transcription for an utterance and a particular context for the utterance;
obtaining, by the one or more computers, a maximum entropy language model that includes (i) scores for one or more n-gram features that each correspond to a respective n-gram and (ii) scores for one or more backoff features that each correspond to a set of n-grams for which there are no corresponding n-gram features in the maximum entropy language model;
determining, by the one or more computers, based on the candidate transcription and the particular context, a feature value for (i) each of the one or more n-gram features of the maximum entropy language model and (ii) each of the one or more backoff features of the maximum entropy language model;
inputting, by the one or more computers, the feature values for the n-gram features and the feature values for the backoff features to the maximum entropy language model; and
receiving, by the one or more computers, from the maximum entropy language model, an output indicative of a likelihood of occurrence of the candidate transcription;
selecting, by the one or more computers, based on the output of the maximum entropy language model, a transcription for the utterance from among a plurality of candidate transcriptions; and
providing, by the one or more computers, the selected transcription to a client device.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
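The claim does not state a formula for the model output. As a point of reference only, the standard conditional maximum entropy (log-linear) form, extended with separate terms for backoff features, could be written as below. The notation is an assumption introduced here: $h$ is the word history, $w$ the predicted word, $V$ the vocabulary, $G$ the set of n-gram features with scores $\lambda_g$, and $B$ the set of backoff features with scores $\mu_b$.

```latex
P(w \mid h) =
  \frac{\exp\Big(\sum_{g \in G} \lambda_g f_g(h, w) + \sum_{b \in B} \mu_b f_b(h, w)\Big)}
       {\sum_{w' \in V} \exp\Big(\sum_{g \in G} \lambda_g f_g(h, w') + \sum_{b \in B} \mu_b f_b(h, w')\Big)}
```

Under this reading, the feature values $f_g(h, w)$ and $f_b(h, w)$ are what the determining step computes, and a backoff feature $f_b$ takes a nonzero value when the n-gram formed by $h$ and $w$ has no n-gram feature of its own in $G$.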
15. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving, by the one or more computers, data indicating a candidate transcription for an utterance and a particular context for the utterance;
obtaining, by the one or more computers, a maximum entropy language model that includes (i) scores for one or more n-gram features that each correspond to a respective n-gram and (ii) scores for one or more backoff features that each correspond to a set of n-grams for which there are no corresponding n-gram features in the maximum entropy language model;
determining, by the one or more computers, based on the candidate transcription and the particular context, a feature value for (i) each of the one or more n-gram features of the maximum entropy language model and (ii) each of the one or more backoff features of the maximum entropy language model;
inputting, by the one or more computers, the feature values for the n-gram features and the feature values for the backoff features to the maximum entropy language model; and
receiving, by the one or more computers, from the maximum entropy language model, an output indicative of a likelihood of occurrence of the candidate transcription;
selecting, by the one or more computers, based on the output of the maximum entropy language model, a transcription for the utterance from among a plurality of candidate transcriptions; and
providing, by the one or more computers, the selected transcription to a client device.
Dependent claims: 16, 17
18. A non-transitory computer-readable data storage device storing a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
receiving, by the one or more computers, data indicating a candidate transcription for an utterance and a particular context for the utterance;
obtaining, by the one or more computers, a maximum entropy language model that includes (i) scores for one or more n-gram features that each correspond to a respective n-gram and (ii) scores for one or more backoff features that each correspond to a set of n-grams for which there are no corresponding n-gram features in the maximum entropy language model;
determining, by the one or more computers, based on the candidate transcription and the particular context, a feature value for (i) each of the one or more n-gram features of the maximum entropy language model and (ii) each of the one or more backoff features of the maximum entropy language model;
inputting, by the one or more computers, the feature values for the n-gram features and the feature values for the backoff features to the maximum entropy language model; and
receiving, by the one or more computers, from the maximum entropy language model, an output indicative of a likelihood of occurrence of the candidate transcription;
selecting, by the one or more computers, based on the output of the maximum entropy language model, a transcription for the utterance from among a plurality of candidate transcriptions; and
providing, by the one or more computers, the selected transcription to a client device.
Dependent claims: 19, 20
Specification