Systems and methods for providing metadata-dependent language models
First Claim
1. A method comprising:
- training, using at least one computer hardware processor to perform an automated two-stage training procedure having a first training stage and a second training stage different from the first training stage, an automatic speech recognition (ASR) engine at least in part by generating one or more language models for use as part of the ASR engine, the training comprising:
obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data;
identifying, by processing the language data, a set of the one or more metadata attributes to use for clustering the instances of training data, the set of metadata attributes comprising first and second sets of metadata attributes;
performing the first training stage, comprising:
clustering the training data instances based on their respective values for the first set of metadata attributes to obtain a first plurality of clusters, the clustering comprising dividing the training data instances into the first plurality of clusters based on their respective values for the first set of metadata attributes; and
generating a respective language model for multiple clusters of the first plurality of clusters to obtain a plurality of language models, the generating comprising using training data in each of one or more of the multiple clusters to generate a respective language model in the plurality of language models;
performing the second training stage, comprising:
clustering the training data instances based on their respective values for the second set of metadata attributes to obtain a second plurality of clusters, the clustering comprising subdividing the training data instances in the first plurality of clusters based on their respective values for the second set of metadata attributes to obtain the second plurality of clusters; and
generating a first language model for a first cluster in the second plurality of clusters as a first weighted mixture of language models in the plurality of language models by estimating weights of the language models in the first weighted mixture using training data instances in the first cluster; and
storing the plurality of language models and the first language model for use as part of the ASR engine.
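The second training stage recited above builds a cluster's language model as a weighted mixture of the stage-one models, with the weights estimated from that cluster's own training data. The claim does not name an estimation procedure. The sketch below assumes expectation-maximization (EM) over unigram models, a common choice for linear interpolation weights. The function names, the toy models, and the sample text are all hypothetical.

```python
def estimate_mixture_weights(models, texts, iterations=20):
    """Estimate linear-mixture weights for fixed component models with EM.

    models: dict mapping a component name to a {word: probability} unigram model.
    texts:  training data instances (strings) from one second-stage cluster.
    """
    names = list(models)
    weights = {n: 1.0 / len(names) for n in names}  # start from uniform weights
    tokens = [w for t in texts for w in t.split()]
    for _ in range(iterations):
        expected = {n: 0.0 for n in names}
        for w in tokens:
            # E-step: posterior responsibility of each component for this token.
            joint = {n: weights[n] * models[n].get(w, 0.0) for n in names}
            norm = sum(joint.values())
            if norm == 0.0:
                continue  # token has zero probability under every component
            for n in names:
                expected[n] += joint[n] / norm
        total = sum(expected.values())
        if total == 0.0:
            break  # nothing to learn from; keep the current weights
        # M-step: new weights are the normalized expected counts.
        weights = {n: expected[n] / total for n in names}
    return weights


def mixture_prob(models, weights, word):
    """Probability of a word under the weighted mixture."""
    return sum(weights[n] * models[n].get(word, 0.0) for n in models)


# Hypothetical stage-one models, one per first-stage cluster.
stage_one_models = {
    "medical": {"doctor": 0.9, "music": 0.1},
    "media": {"doctor": 0.1, "music": 0.9},
}
# Hypothetical training data for one second-stage cluster.
cluster_texts = ["doctor doctor doctor music"]

weights = estimate_mixture_weights(stage_one_models, cluster_texts)
p_doctor = mixture_prob(stage_one_models, weights, "doctor")
```

Because only the weights are estimated, a second-stage cluster with little data can still get a usable model: the component models are trained on the larger first-stage clusters and are held fixed here.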
Abstract
Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.
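As a rough illustration of the clustering and per-cluster model generation summarized in the abstract, the sketch below groups training instances by their values for a chosen set of metadata attributes and fits one model per group. It assumes add-alpha smoothed unigram models purely for brevity; the abstract does not specify a model type. The attribute names, sample data, and helper names are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_by_metadata(instances, attributes):
    """Group training texts by their tuple of values for the given attributes."""
    clusters = defaultdict(list)
    for text, metadata in instances:
        key = tuple(metadata.get(a) for a in attributes)
        clusters[key].append(text)
    return dict(clusters)


def train_unigram(texts, vocab, alpha=1.0):
    """Fit an add-alpha smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for t in texts for w in t.split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


# Each instance of language data pairs training data with metadata values.
instances = [
    ("call my doctor", {"app": "dictation", "domain": "medical"}),
    ("schedule the surgery", {"app": "dictation", "domain": "medical"}),
    ("play some music", {"app": "assistant", "domain": "media"}),
]
vocab = sorted({w for text, _ in instances for w in text.split()})

# Cluster on the identified attribute set, then build one model per cluster.
clusters = cluster_by_metadata(instances, ["domain"])
models = {key: train_unigram(texts, vocab) for key, texts in clusters.items()}
```

Here the attribute set is passed in by hand. How the set is identified from the language data is a separate step that this sketch does not attempt.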
20 Claims
8. A system comprising:
at least one processor configured to perform acts of:
training, using an automated two-stage training procedure having a first training stage and a second training stage different from the first training stage, an automatic speech recognition (ASR) engine at least in part by generating one or more language models for use as part of the ASR engine, the training comprising:
obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data;
identifying, by processing the language data, a set of the one or more metadata attributes to use for clustering the instances of training data, the set of metadata attributes comprising first and second sets of metadata attributes;
performing the first training stage, comprising:
clustering the training data instances based on their respective values for the first set of metadata attributes to obtain a first plurality of clusters, the clustering comprising dividing the training data instances into the first plurality of clusters based on their respective values for the first set of metadata attributes; and
generating a respective language model for multiple of the first plurality of clusters to obtain a plurality of language models, the generating comprising using training data in each of one or more of the multiple clusters to generate a respective language model in the plurality of language models;
performing the second training stage, comprising:
clustering the training data instances based on their respective values for the second set of metadata attributes to obtain a second plurality of clusters, the clustering comprising subdividing the training data instances in the first plurality of clusters based on their respective values for the second set of metadata attributes to obtain the second plurality of clusters; and
generating a first language model for a first cluster in the second plurality of clusters as a first weighted mixture of language models in the plurality of language models by estimating weights of the language models in the first weighted mixture using training data instances in the first cluster; and
storing the plurality of language models and the first language model for use as part of the ASR engine.
Dependent claims: 9, 10, 11, 12, 13, 14
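The second-stage clustering subdivides the first-stage clusters rather than re-partitioning the data from scratch, so every second-stage cluster sits inside exactly one first-stage cluster. A minimal sketch of that nesting follows, assuming clusters are keyed by tuples of attribute values. The `subdivide` helper, the attribute names, and the sample data are hypothetical.

```python
from collections import defaultdict


def subdivide(first_clusters, second_attributes):
    """Split each first-stage cluster by values of the second attribute set.

    first_clusters: dict mapping a first-stage key to a list of
                    (text, metadata) training instances.
    Returns a dict keyed by (first_key, second_key) pairs.
    """
    second_clusters = defaultdict(list)
    for first_key, members in first_clusters.items():
        for text, metadata in members:
            second_key = tuple(metadata.get(a) for a in second_attributes)
            second_clusters[(first_key, second_key)].append((text, metadata))
    return dict(second_clusters)


# Hypothetical first-stage clusters, formed on the "domain" attribute.
first_clusters = {
    ("medical",): [
        ("call my doctor", {"domain": "medical", "user": "u1"}),
        ("schedule the surgery", {"domain": "medical", "user": "u2"}),
    ],
    ("media",): [
        ("play some music", {"domain": "media", "user": "u1"}),
    ],
}

# The second stage subdivides on a finer-grained attribute.
second_clusters = subdivide(first_clusters, ["user"])
```

Keeping the first-stage key inside each second-stage key records which coarse cluster a fine cluster came from, which is convenient when deciding how to initialize or interpret its mixture weights.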
15. At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one processor, cause the at least one processor to perform a method comprising acts of:
training, using an automated two-stage training procedure having a first training stage and a second training stage different from the first training stage, an automatic speech recognition (ASR) engine at least in part by generating one or more language models for use as part of the ASR engine, the training comprising:
obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data;
identifying, by processing the language data, a set of the one or more metadata attributes to use for clustering the instances of training data, the set of metadata attributes comprising first and second sets of metadata attributes;
performing the first stage, comprising:
clustering the training data instances based on their respective values for the first set of metadata attributes to obtain a first plurality of clusters, the clustering comprising dividing the training data instances into the first plurality of clusters based on their respective values for the first set of metadata attributes; and
generating a respective language model for multiple of the first plurality of clusters to obtain a plurality of language models, the generating comprising using the training data in each of the one or more of the multiple clusters to generate a respective language model in the plurality of language models;
performing the second stage, comprising:
clustering the training data instances based on their respective values for the second set of metadata attributes to obtain a second plurality of clusters, the clustering comprising subdividing the training data instances in the first plurality of clusters based on their respective values for the second set of metadata attributes to obtain the second plurality of clusters; and
generating a first language model for a first cluster in the second plurality of clusters as a first weighted mixture of language models in the plurality of language models by estimating weights of the language models in the first weighted mixture using training data instances in the first cluster; and
storing the plurality of language models and the first language model for use as part of the ASR engine.
Dependent claims: 16, 17, 18, 19, 20
Specification