SYSTEMS AND METHODS FOR PROVIDING METADATA-DEPENDENT LANGUAGE MODELS
First Claim
1. A method for generating a plurality of language models, the method comprising acts of:
(A) obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data;
(B) identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters;
(C) clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and
(D) generating a language model for each of the plurality of clusters.
Abstract
Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.
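The four acts recited above (obtain, identify, cluster, generate) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent text: the toy data, the attribute names `app` and `locale`, the add-one-smoothed unigram models, and especially the selection criterion for act (B), which here picks the attribute subset with the best held-out log-likelihood. The claims only require that the attribute set be identified by processing the language data.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

# Act (A): hypothetical language data, each instance pairing training
# text with its metadata attribute values.
TRAIN = [
    ("send the report to the team", {"app": "email", "locale": "en-US"}),
    ("please send the invoice today", {"app": "email", "locale": "en-GB"}),
    ("see you at the pub tonight", {"app": "sms", "locale": "en-GB"}),
    ("running late see you soon", {"app": "sms", "locale": "en-US"}),
]
HELDOUT = [
    ("send the invoice to the team", {"app": "email", "locale": "en-US"}),
    ("see you soon", {"app": "sms", "locale": "en-GB"}),
]


def cluster(instances, attrs):
    """Act (C): group training texts by their values for the chosen attributes."""
    clusters = defaultdict(list)
    for text, meta in instances:
        clusters[tuple(meta[a] for a in attrs)].append(text)
    return dict(clusters)


def train_unigram(texts, vocab, alpha=1.0):
    """Act (D): an add-alpha smoothed unigram model (assumed model type)."""
    counts = Counter(word for text in texts for word in text.split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {word: (counts[word] + alpha) / total for word in vocab}


def heldout_log_likelihood(train, heldout, attrs, vocab):
    """Score an attribute subset by how well its per-cluster models fit held-out text."""
    models = {key: train_unigram(texts, vocab)
              for key, texts in cluster(train, attrs).items()}
    # Fall back to a single global model for unseen attribute combinations.
    fallback = train_unigram([text for text, _ in train], vocab)
    score = 0.0
    for text, meta in heldout:
        model = models.get(tuple(meta[a] for a in attrs), fallback)
        score += sum(math.log(model[word]) for word in text.split())
    return score


def build_models(train, heldout, all_attrs):
    """Acts (B) through (D): pick attributes, cluster, and train one model per cluster."""
    vocab = {word for text, _ in train + heldout for word in text.split()}
    candidates = [subset
                  for size in range(1, len(all_attrs) + 1)
                  for subset in combinations(all_attrs, size)]
    # Act (B), under the assumed held-out likelihood criterion.
    best = max(candidates,
               key=lambda subset: heldout_log_likelihood(train, heldout, subset, vocab))
    models = {key: train_unigram(texts, vocab)
              for key, texts in cluster(train, best).items()}
    return best, models


attrs, models = build_models(TRAIN, HELDOUT, ["app", "locale"])
print("attributes used for clustering:", attrs)
print("clusters with a language model:", sorted(models))
```

Exhaustive search over attribute subsets is only workable for a handful of attributes; a greedy or tree-based selection would be a natural substitute for larger metadata schemas.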
20 Claims
1. A method for generating a plurality of language models, the method comprising acts of:
(A) obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data;
(B) identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters;
(C) clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and
(D) generating a language model for each of the plurality of clusters.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
at least one processor configured to perform acts of:
(A) obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data;
(B) identifying, by processing the language data, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters;
(C) clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and
(D) generating a language model for each of the plurality of clusters.
Dependent claims: 9, 10, 11, 12, 13, 14
15. At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one processor, cause the at least one processor to perform a method comprising acts of:
(A) obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data;
(B) identifying, by processing the language data, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters;
(C) clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and
(D) generating a language model for each of the plurality of clusters.
Dependent claims: 16, 17, 18, 19, 20
Specification