
Multi-domain machine translation system with training data clustering and dynamic domain adaptation

  • US 10,437,933 B1
  • Filed: 08/16/2016
  • Issued: 10/08/2019
  • Est. Priority Date: 08/16/2016
  • Status: Active Grant
First Claim

1. An apparatus, comprising:

  • one or more processors; and

    one or more non-transitory computer-readable storage media having instructions stored thereupon which are executable by the one or more processors and which, when executed, cause the apparatus to:

    identify a plurality of domains in general training data comprising client training data and background training data, the client training data and the background training data being expressed in a source language and a target language;

    assign segments in the general training data to one of the plurality of domains to create domain-specific training data for the plurality of domains;

    generate a domain-specific translation model for the plurality of domains using the domain-specific training data;

    generate a domain-specific language model for the plurality of domains using the domain-specific training data;

    extract domain-specific tuning data from the domain-specific training data;

    generate candidate translations using the domain-specific tuning data, the domain-specific language models, and the domain-specific translation models;

    determine feature scores corresponding to individual ones of the domain-specific language models and the domain-specific translation models based at least in part on the candidate translations;

    learn, based at least in part on the domain-specific tuning data and the feature scores, domain-specific model weights associated with the feature scores;

    generate a translation package comprising the domain-specific language models, the domain-specific translation models, and the domain-specific model weights;

    receive a request to translate an input segment in the source language into the target language;

    identify individual domain-specific model weights to be utilized to translate the input segment;

    identify one or more phrases associated with the input segment; and

    translate the input segment into the target language by selecting, based at least in part on the individual domain-specific model weights, one or more candidate translations of the candidate translations corresponding to the one or more phrases of the input segment.
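
The final steps of the claim (identifying domain-specific model weights and using them to select among candidate translations) can be illustrated with a minimal sketch. It assumes a simple weighted-sum (log-linear style) scoring of feature scores, which the claim does not specify. The names `Candidate`, `score`, `select_translation`, and `weights_by_domain`, along with all feature names and numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate translation with per-model feature scores (hypothetical values)."""
    text: str
    features: dict  # feature name -> score, e.g. a log-probability


def score(candidate, weights):
    """Weighted sum of feature scores; features without a weight contribute 0."""
    return sum(weights.get(name, 0.0) * value
               for name, value in candidate.features.items())


def select_translation(candidates, weights):
    """Pick the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, weights))


# Illustrative feature scores from two domain-specific translation models (tm_*)
# and two domain-specific language models (lm_*). All numbers are made up.
candidates = [
    Candidate("bank (financial)", {"tm_finance": -0.2, "lm_finance": -0.5,
                                   "tm_outdoor": -2.0, "lm_outdoor": -2.5}),
    Candidate("bank (river)", {"tm_finance": -2.0, "lm_finance": -2.5,
                               "tm_outdoor": -0.2, "lm_outdoor": -0.5}),
]

# Assumed stand-in for the learned domain-specific model weights.
weights_by_domain = {
    "finance": {"tm_finance": 1.0, "lm_finance": 1.0,
                "tm_outdoor": 0.1, "lm_outdoor": 0.1},
    "outdoor": {"tm_finance": 0.1, "lm_finance": 0.1,
                "tm_outdoor": 1.0, "lm_outdoor": 1.0},
}

for domain, weights in weights_by_domain.items():
    best = select_translation(candidates, weights)
    print(domain, "->", best.text)
```

In this sketch the same candidate set yields a different selection depending on which domain's weights are applied, which is the behavior the last two claim steps describe. How the weights are learned and how the domain of an input segment is identified are outside the scope of the sketch.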
