
Text classification and transformation based on author

  • US 10,083,157 B2
  • Filed: 08/05/2016
  • Issued: 09/25/2018
  • Est. Priority Date: 08/07/2015
  • Status: Active Grant
First Claim

1. A method performed by a system comprising one or more computers to generate an output text in a style of a requested author from an input text, wherein the output text and the input text are written in a same natural language, the system comprising an encoder language model and a decoder language model, wherein:

  • the encoder and decoder language model have been trained with text from multiple authors, the text from multiple authors comprising a plurality of training texts;

    as a result of training, the encoder language model stores data representing words occurring in the plurality of training texts from the multiple authors as respective vectors, wherein each vector represents a respective distribution of contexts in the plurality of training texts of a respective word from the plurality of training texts;

    as a result of training, the decoder language model (i) stores the distributions of contexts of words used by particular respective authors in the plurality of training texts and (ii) is configured to perform a transformation of a stream of vectors from the encoder language model to generate text in the natural language according to distributions of contexts of words used by a decoder author, the decoder author being one of the multiple authors;

    the encoder and decoder language model have been trained by performing the following operations for each of multiple training input texts each having a respective author:

    presenting each training input text to the encoder language model;

    receiving from the encoder language model a training vector stream representing the training input text, wherein the training vector stream includes vectors that are each (i) associated with a word from the input text and (ii) based on the distribution of contexts of the word in the plurality of training texts;

    presenting the training vector stream, an author of the training input text, and the training input text to the decoder language model;

    receiving a respective decoder output training text from the decoder language model based on the author, the training input text, and the training vector stream;

    comparing the decoder output of the decoder language model with an expected output for the author and the training input text, wherein the expected output is the training input text;

    if the comparing indicates a difference for a particular author, indicating an error; and

    in the case of an error, updating the decoder language model, including updating the decoder language model's representation of vectors in the training vector stream, and back-propagating the error to the encoder language model, which updates a representation of the encoder language model;

    the method using the encoder language model and the decoder language model after training, the method comprising:

    receiving an input text including one or more words and a name of a requested author, wherein the requested author is one of the multiple authors;

    generating a vector stream of vectors by the encoder language model, each vector in the vector stream representing the distribution of contexts in which a respective word of the input text appears in training input texts; and

    producing an output text from the vector stream by the decoder language model according to the distributions of contexts of words used by the requested author, whereby the output text is a transformation of the input text to a style of the requested author.
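
The training and use steps recited in the claim can be illustrated with a toy sketch. It is not the patented implementation, and every name in it (train, transform, decode_probs, author_a, author_b) is hypothetical. Assumptions: the encoder is reduced to one trainable vector per word, the decoder to one softmax weight table per author, each word is decoded independently of its neighbours, and the error is a cross-entropy gradient applied on every step rather than only when a mismatch is detected. What it does keep is the data flow: the decoder is asked to reproduce the training input text for that text's author, the decoder is updated from the error, and the same error is propagated back into the encoder vectors.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a dict of word -> score."""
    top = max(scores.values())
    exps = {w: math.exp(s - top) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def decode_probs(author_rows, vec):
    """Decoder step: score every vocabulary word against one encoder vector."""
    scores = {w: sum(r * v for r, v in zip(row, vec)) for w, row in author_rows.items()}
    return softmax(scores)


def train(texts, dim=8, epochs=200, lr=0.5, seed=0):
    """texts is a list of (author, words) pairs. Returns (enc, dec).

    enc maps word -> vector (toy stand-in for the encoder language model).
    dec maps author -> {word -> weight row} (toy stand-in for the decoder).
    """
    rng = random.Random(seed)
    vocab = sorted({w for _, words in texts for w in words})
    authors = sorted({a for a, _ in texts})
    enc = {w: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for w in vocab}
    dec = {a: {w: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for w in vocab}
           for a in authors}
    for _ in range(epochs):
        for author, words in texts:
            for target in words:  # expected output is the input word itself
                vec = enc[target]
                probs = decode_probs(dec[author], vec)
                grad_vec = [0.0] * dim
                for w, row in dec[author].items():
                    err = probs[w] - (1.0 if w == target else 0.0)
                    for i in range(dim):
                        grad_vec[i] += err * row[i]   # error flowing back to the encoder
                        row[i] -= lr * err * vec[i]   # decoder update for this author
                for i in range(dim):
                    vec[i] -= lr * grad_vec[i]        # encoder update (back-propagated)
    return enc, dec


def transform(input_words, requested_author, enc, dec):
    """Encode each input word, then decode it with the requested author's table."""
    rows = dec[requested_author]
    output = []
    for word in input_words:
        probs = decode_probs(rows, enc[word])
        output.append(max(probs, key=probs.get))
    return output


if __name__ == "__main__":
    corpus = [
        ("author_a", "it is a truth".split()),
        ("author_b", "the sea is old".split()),
    ]
    enc, dec = train(corpus)
    print(transform("the truth is old".split(), "author_a", enc, dec))
```

Because each author's table is only ever rewarded for words that author actually used, decoding with a different author's table tends to pull the output toward that author's vocabulary. A realistic system would condition on surrounding words and on far larger training texts, which this sketch deliberately omits.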
