System and method for incrementally updating a reordering model for a statistical machine translation system

  • US 9,442,922 B2
  • Filed: 11/18/2014
  • Issued: 09/13/2016
  • Est. Priority Date: 11/18/2014
  • Status: Expired due to Fees
First Claim

1. A method for updating a reordering model of a statistical machine translation system comprising:

  • at a first time, receiving new training data for retraining an existing statistical machine translation system, the new training data comprising at least one sentence pair, each of the at least one sentence pair comprising a source sentence in a source language and a target sentence in a target language;

    extracting phrase pairs from the new training data, each phrase pair including a source language phrase and a target language phrase;

    generating a new reordering file from the extracted phrase pairs, the new reordering file including a set of the phrase pairs extracted from the new training data;

    updating a reordering model of the existing statistical machine translation system based on the new reordering file, the reordering model including a reordering table, the reordering table comprising phrase pairs and a set of features, the set of features comprising, for each of a set of orientation types, at least one feature which is a function of a count of the orientation type for the respective phrase pair, each phrase pair in the reordering table occurring only once, and wherein the updating of the reordering model includes merging an existing reordering table with the new reordering file or merging the existing reordering table with a new reordering table generated from the new reordering file, the merging including updating feature scores for each of the orientation types for at least some of the phrase pairs based on the counts stored in the existing reordering table;

    at a second time after the first time, receiving new training data for training the existing statistical machine translation system, the new training data comprising at least one sentence pair, the sentence pair comprising a source sentence in the source language and a target sentence in the target language; and

    reiterating the extracting of phrase pairs, generating of the new reordering file and the updating the reordering model based on the new training data received at the second time, wherein at least one of the extracting phrase pairs, generating the new reordering file, and updating the reordering model is performed with a computer processor.
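
The merging step recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the reordering table keeps raw orientation counts per phrase pair, that the orientation types are monotone, swap, and discontinuous (the claim only says "a set of orientation types"), and that each feature is a smoothed relative frequency of the count (the claim only requires "a function of a count"). The names `merge_counts`, `orientation_features`, and the `smoothing` parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

# Assumed orientation types; the claim does not name them.
ORIENTATIONS = ("monotone", "swap", "discontinuous")

PhrasePair = Tuple[str, str]            # (source phrase, target phrase)
CountTable = Dict[PhrasePair, Counter]  # orientation counts per phrase pair


def merge_counts(existing: CountTable, new: CountTable) -> CountTable:
    """Merge new orientation counts into a copy of the existing table.

    Each phrase pair appears once in the result; counts for pairs present
    in both tables are summed, and unseen pairs are added.
    """
    merged = {pair: Counter(counts) for pair, counts in existing.items()}
    for pair, counts in new.items():
        merged.setdefault(pair, Counter()).update(counts)
    return merged


def orientation_features(counts: Counter, smoothing: float = 0.5) -> Dict[str, float]:
    """Turn counts into one feature per orientation type.

    Uses additive smoothing as an assumed scoring function; any other
    function of the counts would fit the claim language equally well.
    """
    total = sum(counts[o] for o in ORIENTATIONS) + smoothing * len(ORIENTATIONS)
    return {o: (counts[o] + smoothing) / total for o in ORIENTATIONS}
```

Because the merged table stores counts rather than only final scores, the same two functions can be applied again when further training data arrives at a later time, without re-extracting phrase pairs from the earlier data.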
