Natural language generation through character-based recurrent neural networks with finite-state prior knowledge
First Claim
1. A method comprising:
building a target background model using words occurring in training data, the target background model being adaptable to accept subsequences of an input semantic representation, the training data including training pairs, each training pair including a semantic representation and a corresponding reference sequence in a natural language;
receiving human-generated utterances in the form of speech or text;
predicting a current dialog state of a natural language dialog between a virtual agent and a user, based on the utterances;
generating a semantic representation of a next utterance, based on the current dialog state, the semantic representation including a sequence of characters; and
generating a target sequence in a natural language from the semantic representation, comprising:
after generating the semantic representation, adapting the target background model to form an adapted background model, which accepts all subsequences of the semantic representation;
representing the semantic representation as a sequence of character embeddings;
with an encoder, encoding the character embeddings to generate a set of character representations; and
with a decoder, generating a target sequence of characters, based on the set of character representations, wherein at a plurality of time steps, a next character in the target sequence is a function of a previously generated character of the target sequence and the adapted background model; and
outputting the target sequence.
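The adaptation step recited above can be illustrated with a minimal sketch: the adapted background model accepts the vocabulary words from the training data plus every contiguous subsequence (substring) of the input semantic representation, and membership in that set can then constrain which next characters the decoder may emit. The function names below are hypothetical illustrations, not the patented implementation.

```python
def all_subsequences(s):
    # every contiguous subsequence (substring) of the semantic representation
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}

def adapt_background(vocab, semantic_repr):
    # the adapted background model accepts vocabulary words from the
    # training data plus all subsequences of the input representation
    return set(vocab) | all_subsequences(semantic_repr)

def allowed_next_chars(adapted, prefix):
    # characters that extend `prefix` toward an accepted string; this is
    # how the adapted model can bias the decoder's next character
    return {s[len(prefix)] for s in adapted
            if s.startswith(prefix) and len(s) > len(prefix)}
```

For example, `adapt_background({"hello"}, "ab")` accepts "hello" as well as "a", "b", and "ab", so the decoder is biased towards both known words and copied input subsequences.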
Abstract
A method and a system for generating a target character sequence from a semantic representation including a sequence of characters are provided. The method includes adapting a target background model, built from a vocabulary of words, to form an adapted background model. The adapted background model accepts subsequences of an input semantic representation as well as words from the vocabulary. The input semantic representation is represented as a sequence of character embeddings, which are input to an encoder. The encoder encodes each of the character embeddings to generate a respective character representation. A decoder then generates a target sequence of characters, based on the set of character representations. At a plurality of time steps, a next character in the target sequence is selected as a function of a previously generated character(s) of the target sequence and the adapted background model.
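The character-by-character generation loop summarized above can be pictured as a greedy decoder that repeatedly queries a step function for a next-character distribution. Here `step_fn` is a hypothetical stand-in for the decoder combined with the adapted background model, not the claimed model itself.

```python
def greedy_decode(step_fn, max_len=50, eos="\n"):
    # step_fn maps the characters generated so far to a {char: prob}
    # distribution for the next character (standing in for the decoder
    # together with the adapted background model)
    out = ""
    for _ in range(max_len):
        dist = step_fn(out)
        ch = max(dist, key=dist.get)  # pick the most probable character
        if ch == eos:
            break
        out += ch
    return out
```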
20 Claims
1. A method comprising:
building a target background model using words occurring in training data, the target background model being adaptable to accept subsequences of an input semantic representation, the training data including training pairs, each training pair including a semantic representation and a corresponding reference sequence in a natural language;
receiving human-generated utterances in the form of speech or text;
predicting a current dialog state of a natural language dialog between a virtual agent and a user, based on the utterances;
generating a semantic representation of a next utterance, based on the current dialog state, the semantic representation including a sequence of characters; and
generating a target sequence in a natural language from the semantic representation, comprising:
after generating the semantic representation, adapting the target background model to form an adapted background model, which accepts all subsequences of the semantic representation;
representing the semantic representation as a sequence of character embeddings;
with an encoder, encoding the character embeddings to generate a set of character representations; and
with a decoder, generating a target sequence of characters, based on the set of character representations, wherein at a plurality of time steps, a next character in the target sequence is a function of a previously generated character of the target sequence and the adapted background model; and
outputting the target sequence.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
16. A method comprising:
training a hierarchical model with training data, the training data including training pairs, each training pair including a semantic representation and a corresponding reference sequence in a natural language, to minimize, over the training data, an error between a realization output by the model and the reference sequence assigned to the training sample, the hierarchical model including an encoder, a decoder, and a target background model built from a vocabulary of words;
receiving a human-generated utterance in a natural language;
predicting a current dialog state based on the utterance;
generating a semantic representation of a next utterance, based on the current dialog state, the semantic representation including a sequence of characters; and
generating a target sequence in a natural language from the semantic representation, comprising:
adapting the target background model to form an adapted background model, which accepts subsequences of the sequence of characters in the semantic representation;
representing the semantic representation as a sequence of character embeddings;
with the encoder, encoding the character embeddings to generate a set of character representations; and
with the decoder, generating a target sequence of characters, based on the set of character representations, wherein at a plurality of time steps, a next character in the target sequence is a function of a previously generated character of the target sequence and the adapted background model; and
outputting the target sequence,
wherein the decoder includes an adaptor part and a background part, the adaptor part defining a first conditional probability distribution for sampling a next character in the target sequence, given the already generated characters, the background part defining a second conditional probability distribution for sampling the next character in the target sequence, given the already generated characters, which is based on the adapted background model, an overall conditional probability distribution for sampling the next character being a function of the first and second conditional probability distributions.
- View Dependent Claims (17)
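Claim 16 only requires the overall conditional distribution to be "a function of" the adaptor and background distributions; a convex combination is one simple instance of such a function. The sketch below assumes the two distributions are given as plain dictionaries, and the mixing `weight` is illustrative.

```python
def mix_distributions(p_adaptor, p_background, weight=0.5):
    # convex combination of the adaptor and background distributions,
    # one simple instance of "a function of the first and second
    # conditional probability distributions"
    chars = set(p_adaptor) | set(p_background)
    mixed = {c: weight * p_adaptor.get(c, 0.0)
                + (1 - weight) * p_background.get(c, 0.0)
             for c in chars}
    total = sum(mixed.values())
    return {c: p / total for c, p in mixed.items()}  # renormalize
```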
18. A dialog system comprising:
a dialog state tracker which predicts a current dialog state based on received human-generated utterances in the form of speech or text;
a dialog manager which generates a semantic representation of a next utterance, based on the current dialog state, the semantic representation including a sequence of characters; and
a system for generation of a target sequence from the semantic representation comprising:
a hierarchical character sequence-to-character sequence model;
a character embedding component which represents the semantic representation as a sequence of character embeddings;
a natural language generation component which inputs the sequence of character embeddings into the hierarchical model;
an output component which outputs the target sequence;
a learning component for training the hierarchical model with training data, the training data including training pairs, each training pair including a semantic representation and a corresponding reference sequence in a natural language, to minimize, over the training data, an error between a realization output by the model and the reference sequence assigned to the training sample; and
a processor which implements the learning component, dialog state tracker, dialog manager, character embedding component, natural language generation component, and output component;
the hierarchical model comprising:
a target background model, built from a vocabulary of words, which is adapted by copying subsequences of the generated semantic representation, to form an adapted background model, which accepts subsequences of the input semantic representation;
an encoder, which encodes the character embeddings to generate a set of character representations; and
a decoder, which generates a target sequence of characters, based on the set of character representations, wherein at a plurality of time steps, a next character in the target sequence is a function of a previously generated character of the target sequence, and the adapted background model.
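The encoder recited in claim 18 maps one character embedding to one character representation per time step. A minimal recurrent cell with fixed, purely illustrative weights shows the shape of that computation; it is a sketch of the recurrence, not the patented encoder.

```python
import math

def rnn_encode(embeddings, dim=4):
    # minimal recurrent encoder: one tanh cell with fixed illustrative
    # weights (0.5 recurrent, 0.5 input); emits one representation per
    # input character embedding
    h = [0.0] * dim
    reps = []
    for e in embeddings:
        x = sum(e) / len(e)  # crude scalar summary of the embedding
        h = [math.tanh(0.5 * h[k] + 0.5 * x) for k in range(dim)]
        reps.append(h)
    return reps
```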
19. A method for generating a system for generation of a target sequence of characters from an input semantic representation, the method comprising:
providing training data which includes training pairs, each training pair including a semantic representation and a corresponding reference sequence in a natural language;
building a target background model using words occurring in the training data, the target background model being adaptable to accept subsequences of an input semantic representation;
incorporating the target background model into a hierarchical model which includes an encoder and a decoder, the encoder and decoder each operating at the character level, such that the background model, when adapted to accept subsequences of the input semantic representation, biases the decoder towards outputting a target character sequence comprising at least one of: words occurring in the training data, and subsequences of the input semantic representation; and
training the hierarchical model on the training pairs to output a target sequence from an input semantic representation,
wherein the decoder includes an adaptor part and a background part, the adaptor part defining a first conditional probability distribution for sampling a next character in the target sequence, given the already generated characters, the background part defining a second conditional probability distribution for sampling the next character in the target sequence, given the already generated characters, which is based on the adapted background model, an overall conditional probability distribution for sampling the next character being a function of the first and second conditional probability distributions, and
wherein at least one of the building of the target background model, incorporating the target background model, and training the hierarchical model is performed with a processor.
- View Dependent Claims (20)
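Claim 19 trains the hierarchical model to minimize an error between the model's realization and the reference sequence. Character-level edit (Levenshtein) distance is one concrete choice of such an error; `model` below is any hypothetical callable from a semantic representation to a realization, standing in for the trained generator.

```python
def edit_distance(a, b):
    # Levenshtein distance: one concrete choice of "error between a
    # realization output by the model and the reference sequence"
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def training_error(model, pairs):
    # average error over (semantic representation, reference) pairs
    return sum(edit_distance(model(x), y) for x, y in pairs) / len(pairs)
```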
Specification