Methods and apparatus for generating dialog state conditioned language models
Abstract
Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural language based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.
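As an illustration of the two ways of quantifying dialog state named in the abstract, the following is a minimal sketch, not the patented implementation. The function names, the key format, the keyword list, and the fallback to a base model when no state-specific model exists are all assumptions made for this example.

```python
import re
from typing import Dict, Iterable

# Hypothetical representation: a language model is a word -> probability table.
LanguageModel = Dict[str, float]


def key_from_nlu_state(nlu_state: str) -> str:
    """Option (i): key the model on the NLU component's internal state label."""
    return "nlu:" + nlu_state


def key_from_prompt(prompt: str, keywords: Iterable[str]) -> str:
    """Option (ii): key the model on which (assumed) keywords occur in the prompt."""
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    return "prompt:" + "+".join(sorted(k for k in keywords if k in words))


def select_model(models: Dict[str, LanguageModel], key: str,
                 base: LanguageModel) -> LanguageModel:
    """Return the state-conditioned model, or the base model if none is stored."""
    return models.get(key, base)


# Toy usage with made-up probabilities.
models = {"prompt:date": {"monday": 0.6, "boston": 0.4}}
base = {"monday": 0.5, "boston": 0.5}
key = key_from_prompt("What date would you like to travel?", ["date", "city"])
active = select_model(models, key, base)
```

Here the recognizer would be handed `active` for the user's next utterance; how a real recognizer consumes the selected model is outside the scope of this sketch.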
24 Citations
13 Claims
1. A method for use in accordance with a dialog system, the method comprising:
generating at least one language model, the at least one language model being conditioned on a state of dialog associated with the dialog system; and
storing the at least one language model for subsequent use in accordance with a speech recognizer associated with the dialog system;
wherein generating the at least one language model conditioned on a state of dialog associated with the dialog system further comprises:
dividing training data which is labeled by state into different state sets depending on the state to which the training data belongs;
clustering the state sets into clustered state sets, the clustering comprising combining training data belonging to states that are close to each other based on a distance measure;
building a separate language model for each of the clustered state sets to create a plurality of separate language models; and
building at least one interpolated model by interpolating one or more of the plurality of separate language models with a base model obtained from training data that includes at least some training data used to train all of the plurality of separate language models.
9. Apparatus for use in accordance with a dialog system, the apparatus comprising:
at least one processor operative to generate at least one language model, the at least one language model being conditioned on a state of dialog associated with the dialog system; and
memory, coupled to the at least one processor, for storing the at least one language model for subsequent use in accordance with a speech recognizer associated with the dialog system;
wherein the operation of generating the at least one language model conditioned on a state of dialog associated with the dialog system further comprises:
(i) dividing training data which is labeled by state into different state sets depending on the state to which the training data belongs;
(ii) clustering the state sets into clustered state sets, the clustering comprising combining training data belonging to states that are close to each other based on a distance measure;
(iii) building a separate language model for each of the clustered state sets to create a plurality of separate language models; and
(iv) building at least one interpolated model by interpolating one or more of the plurality of separate language models with a base model obtained from training data that includes at least some training data used to train all of the plurality of separate language models.
13. At least one memory device storing instructions that, when executed by at least one processor, perform a method for use in accordance with a dialog system, the method comprising:
generating at least one language model, the at least one language model being conditioned on a state of dialog associated with the dialog system; and
storing the at least one language model for subsequent use in accordance with a speech recognizer associated with the dialog system;
wherein generating the at least one language model conditioned on a state of dialog associated with the dialog system further comprises:
dividing training data which is labeled by state into different state sets depending on the state to which the training data belongs;
clustering the state sets into clustered state sets, the clustering comprising combining training data belonging to states that are close to each other based on a distance measure;
building a separate language model for each of the clustered state sets to create a plurality of separate language models; and
building at least one interpolated model by interpolating one or more of the plurality of separate language models with a base model obtained from training data that includes at least some training data used to train all of the plurality of separate language models.
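The four generating steps recited in the independent claims (divide by state, cluster, build separate models, interpolate with a base model) can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the claimed implementation: the claims do not specify the model type, the distance measure, or the clustering procedure, so the sketch assumes add-one smoothed unigram models, symmetric KL divergence as the distance, greedy pairwise merging below a threshold, a base model trained on all of the data, and a fixed interpolation weight `lam`. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def divide_by_state(labeled):
    """Step 1: group (state, utterance) pairs into per-state word counts."""
    sets = defaultdict(Counter)
    for state, utterance in labeled:
        sets[state].update(utterance.lower().split())
    return dict(sets)


def unigram(counts, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (assumed model type)."""
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def distance(p, q):
    """Assumed distance measure: symmetric KL divergence, KL(p||q) + KL(q||p)."""
    return sum((p[w] - q[w]) * math.log(p[w] / q[w]) for w in p)


def cluster_states(state_sets, vocab, threshold):
    """Step 2: repeatedly merge the closest pair of state sets while under threshold."""
    clusters = [({s}, Counter(c)) for s, c in sorted(state_sets.items())]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(unigram(clusters[i][1], vocab),
                             unigram(clusters[j][1], vocab))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        merged = (clusters[i][0] | clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


def interpolate(model, base, lam):
    """Step 4: linear interpolation of a cluster model with the base model."""
    return {w: lam * model[w] + (1.0 - lam) * base[w] for w in base}


def build_models(labeled, threshold=0.3, lam=0.7):
    state_sets = divide_by_state(labeled)
    all_counts = sum(state_sets.values(), Counter())
    vocab = sorted(all_counts)
    base = unigram(all_counts, vocab)  # assumption: base model uses all training data
    models = {}
    for states, counts in cluster_states(state_sets, vocab, threshold):
        separate = unigram(counts, vocab)  # Step 3: one model per clustered state set
        models[frozenset(states)] = interpolate(separate, base, lam)
    return models


# Toy state-labeled training data (made up for illustration).
labeled = [
    ("ask_date", "monday please"),
    ("ask_date", "next monday"),
    ("confirm_date", "monday please"),
    ("ask_city", "boston"),
    ("ask_city", "to boston"),
]
models = build_models(labeled)
```

With this toy data the two date-related states have similar word distributions and end up in one cluster, while the city state stays separate. A real system would presumably use n-gram models and tune the threshold and interpolation weights on held-out data.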
Specification