Language model adaptation using semantic supervision
Abstract
A method and apparatus are provided for adapting a language model. The method and apparatus provide supervised class-based adaptation of the language model utilizing in-domain semantic information.
13 Claims
1. A method of adapting an n-gram language model for a new domain, the method comprising:
receiving background data indicative of general text phrases not directed to the new domain;
receiving a set of semantic entities used in the new domain and organized in classes;
generating background n-gram class count data based on the background data and the semantic entities and classes thereof;
receiving adaptation data indicative of text phrases used in the new domain;
generating adaptation n-gram class count data based on the adaptation data and the semantic entities and classes thereof;
training a language model based on the background n-gram class count data and the adaptation n-gram class count data; and
embodying the language model in tangible form.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
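The counting steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the entity classes, the sample sentences, the single-word entity matching, and the helper names (`to_class_tokens`, `class_ngram_counts`, `interpolated_prob`) are all hypothetical. The claim does not say how the two sets of counts are combined, so linear interpolation with a fixed weight is assumed here purely for illustration.

```python
from collections import Counter

# Hypothetical semantic entities for a flight-booking domain, grouped by class.
ENTITY_CLASSES = {
    "CITY": {"boston", "seattle", "denver"},
    "AIRLINE": {"delta", "united"},
}
WORD_TO_CLASS = {w: c for c, ws in ENTITY_CLASSES.items() for w in ws}


def to_class_tokens(sentence):
    """Replace each known (single-word) semantic entity with its class label."""
    return [WORD_TO_CLASS.get(tok, tok) for tok in sentence.lower().split()]


def class_ngram_counts(sentences, n=2):
    """Count n-grams over class-tagged tokens, with sentence boundary markers."""
    counts = Counter()
    for sentence in sentences:
        toks = ["<s>"] * (n - 1) + to_class_tokens(sentence) + ["</s>"]
        for i in range(len(toks) - n + 1):
            counts[tuple(toks[i:i + n])] += 1
    return counts


def interpolated_prob(ngram, background, adaptation, weight=0.7):
    """Assumed combination: linear interpolation of relative frequencies."""
    def rel_freq(counts):
        history_total = sum(c for g, c in counts.items() if g[:-1] == ngram[:-1])
        return counts[ngram] / history_total if history_total else 0.0
    return weight * rel_freq(adaptation) + (1 - weight) * rel_freq(background)


# Background data: general text. Adaptation data: in-domain text.
background = class_ngram_counts(["I moved to Boston", "she went to work"])
adaptation = class_ngram_counts(["fly to Seattle", "fly to Denver on Delta"])
```

Because "Seattle" and "Denver" both map to `CITY`, the adaptation data contributes two observations of the class bigram `("to", "CITY")` even though neither city appears twice. This pooling of sparse in-domain evidence is the point of counting over classes rather than words.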
12. A computer-readable storage medium having computer-executable instructions for performing steps to generate a language model, the steps comprising:
receiving a set of semantic entities used in a selected domain and organized in classes;
receiving background n-gram class count data correlated to classes of the set of semantic entities and based on background data indicative of general text;
receiving adaptation n-gram class count data correlated to classes of the set of semantic entities and based on adaptation data indicative of a selected domain to be modeled;
training a language model based on the background n-gram class count data, the adaptation n-gram class count data and the set of semantic entities; and
wherein training the language model comprises computing background word count data based on the background n-gram class count data and the set of semantic entities, computing adaptation word count data based on the adaptation n-gram class count data and the set of semantic entities, and smoothing the n-gram relative frequencies.
Dependent claims: 13
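The training step of claim 12 turns class counts into word counts and then smooths the relative frequencies. The sketch below shows one way this could look under stated assumptions: each class count is split uniformly across the members of the class, and add-k (Laplace) smoothing stands in for whatever smoothing the specification actually uses. The class inventory, the counts, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical class inventory; in the claim this is the received set of semantic entities.
ENTITY_CLASSES = {"CITY": ["boston", "seattle"], "AIRLINE": ["delta", "united"]}


def expand_to_word_counts(class_counts, entity_classes):
    """Expand class n-gram counts into (fractional) word n-gram counts.

    Assumption: each class count is split uniformly over the class members.
    """
    word_counts = Counter()
    for ngram, count in class_counts.items():
        # Each position is either a plain word or a class label with several members.
        options = [entity_classes.get(tok, [tok]) for tok in ngram]
        expansions = list(product(*options))
        for words in expansions:
            word_counts[words] += count / len(expansions)
    return word_counts


def add_k_prob(ngram, counts, vocab_size, k=1.0):
    """Add-k smoothed relative frequency of the last word given its history."""
    history_total = sum(c for g, c in counts.items() if g[:-1] == ngram[:-1])
    return (counts[ngram] + k) / (history_total + k * vocab_size)


# Hypothetical adaptation class counts; background counts would be expanded the same way.
adaptation_class_counts = Counter({("to", "CITY"): 4, ("on", "AIRLINE"): 2})
word_counts = expand_to_word_counts(adaptation_class_counts, ENTITY_CLASSES)
vocab = {tok for ngram in word_counts for tok in ngram}
```

With these counts, a city that never followed "to" in the adaptation text still receives a share of the `("to", "CITY")` mass, and smoothing keeps unseen pairs such as `("to", "delta")` at a small non-zero probability.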
Specification