SYSTEM AND METHOD FOR INCREMENTALLY UPDATING A REORDERING MODEL FOR A STATISTICAL MACHINE TRANSLATION SYSTEM
Abstract
A method for updating a reordering model of a statistical machine translation system includes, at a first time, receiving new training data for retraining an existing statistical machine translation system. The new training data includes at least one sentence pair, each pair including a source sentence in a source language and a target sentence in a target language. Phrase pairs are extracted from the new training data and used to generate a new reordering file. A reordering model of the existing statistical machine translation system, which includes a reordering table, is updated based on the new reordering file. At a second time after the first time, new training data is received, and the extracting of phrase pairs, the generating of the new reordering file, and the updating of the reordering model are repeated based on the new training data received at the second time.
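The repeated receive, extract, generate, update cycle summarized above can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `ReorderingModel` class, the `retrain_step` function, and the toy `word_pairs` extractor, which simply pairs words by position and stands in for a real alignment-based phrase extractor. The reordering table is reduced to a count per phrase pair.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple

SentencePair = Tuple[str, str]   # (source sentence, target sentence)
PhrasePair = Tuple[str, str]     # (source phrase, target phrase)
Extractor = Callable[[SentencePair], Iterable[PhrasePair]]


class ReorderingModel:
    """Toy stand-in for a reordering model holding a reordering table."""

    def __init__(self) -> None:
        self.table: Counter = Counter()  # phrase pair -> count seen so far
        self.updates = 0                 # number of incremental updates applied

    def update(self, reordering_file: List[PhrasePair]) -> None:
        # Fold only the new file's entries into the existing table.
        self.table.update(reordering_file)
        self.updates += 1


def retrain_step(model: ReorderingModel,
                 new_data: Iterable[SentencePair],
                 extract: Extractor) -> List[PhrasePair]:
    """One iteration: extract phrase pairs, build the new file, update."""
    # The "reordering file" is built from the new training data only.
    reordering_file = [pp for pair in new_data for pp in extract(pair)]
    model.update(reordering_file)
    return reordering_file


def word_pairs(pair: SentencePair) -> List[PhrasePair]:
    # Hypothetical toy extractor: pairs words position by position.
    src, tgt = pair
    return list(zip(src.split(), tgt.split()))


model = ReorderingModel()
# First time: one new sentence pair arrives.
retrain_step(model, [("la maison", "the house")], word_pairs)
# Second time: more data arrives; earlier data is not reprocessed.
retrain_step(model, [("la porte", "the door")], word_pairs)
print(model.table[("la", "the")])  # 2
```

The design point of the sketch is that each step touches only the newly received sentence pairs, while the table keeps the history of earlier steps.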
21 Claims
1. A method for updating a reordering model of a statistical machine translation system comprising:
at a first time, receiving new training data for retraining an existing statistical machine translation system, the new training data comprising at least one sentence pair, each of the at least one sentence pair comprising a source sentence in a source language and a target sentence in a target language;
extracting phrase pairs from the new training data, each phrase pair including a source language phrase and a target language phrase;
generating a new reordering file from the extracted phrase pairs;
updating a reordering model of the existing statistical machine translation system based on the new reordering file, the reordering model including a reordering table;
at a second time after the first time, receiving new training data for training the existing statistical machine translation system, the new training data comprising at least one sentence pair, the sentence pair comprising a source sentence in the source language and a target sentence in the target language; and
reiterating the extracting of phrase pairs, generating of the new reordering file and the updating the reordering model based on the new training data received at the second time,
wherein at least one of the extracting phrase pairs, generating the new reordering file, and updating the reordering model is performed with a computer processor.
20. A system for updating a reordering model of a statistical machine translation system comprising:
a phrase pair extraction component which, at each of a plurality of times, extracts phrase pairs from new training data, the new training data comprising at least one sentence pair, each of the at least one sentence pair comprising a source sentence in a source language and a target sentence in a target language, each phrase pair including a source language phrase and a target language phrase;
a reordering file generation component which, at each of the plurality of times, generates a new reordering file, the new reordering file including only phrase pairs extracted from the new training data;
an update component which, at each of the plurality of times, updates a reordering model of an existing statistical machine translation system based on the new reordering file, the reordering model including a reordering table; and
a processor which implements the phrase pair extraction component, reordering file generation component, and update component.
21. A method for updating a reordering model of a statistical machine translation system comprising:
at a first time, receiving new training data, the new training data comprising sentence pairs, each of the sentence pairs comprising a source sentence in a source language and a target sentence in a target language;
extracting phrase pairs from the new training data, each phrase pair including a source language phrase and a target language phrase;
generating a new reordering file from the extracted phrase pairs, the new reordering file includes only phrase pairs extracted from the new training data and their associated orientation types;
updating a reordering model of the existing statistical machine translation system based on the new reordering file and an existing reordering table of the reordering model, the updating including accumulating counts of the extracted phrase pairs and phrase pairs in the existing reordering table; and
repeating the receiving new training data, extracting of phrase pairs, generating of the new reordering file, and the updating the reordering model at least once at a subsequent time,
wherein at least one of the extracting phrase pairs, generating the new reordering file, and updating the reordering model is performed with a processor.
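The count accumulation recited in claim 21 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the orientation labels `mono`, `swap`, and `other`, the functions `accumulate` and `orientation_probs`, the in-memory table layout, and the additive smoothing constant are all hypothetical choices made for the example.

```python
from typing import Dict, Iterable, Tuple

# Assumed orientation label set; the claim only says "orientation types".
ORIENTATIONS = ("mono", "swap", "other")

Key = Tuple[str, str]                    # (source phrase, target phrase)
Counts = Dict[Key, Dict[str, int]]       # phrase pair -> orientation counts
FileEntry = Tuple[str, str, str]         # (source, target, orientation)


def accumulate(table: Counts, reordering_file: Iterable[FileEntry]) -> Counts:
    """Add counts from the new reordering file to the existing table."""
    # Copy so the existing table is left untouched.
    merged = {key: dict(counts) for key, counts in table.items()}
    for src, tgt, orientation in reordering_file:
        entry = merged.setdefault((src, tgt), {o: 0 for o in ORIENTATIONS})
        entry[orientation] = entry.get(orientation, 0) + 1
    return merged


def orientation_probs(counts: Dict[str, int],
                      smoothing: float = 0.5) -> Dict[str, float]:
    """Turn accumulated counts into smoothed orientation probabilities."""
    total = sum(counts.get(o, 0) for o in ORIENTATIONS)
    total += smoothing * len(ORIENTATIONS)
    return {o: (counts.get(o, 0) + smoothing) / total for o in ORIENTATIONS}


existing = {("maison", "house"): {"mono": 3, "swap": 1, "other": 0}}
new_file = [("maison", "house", "swap"), ("porte", "door", "mono")]

merged = accumulate(existing, new_file)
print(merged[("maison", "house")])  # {'mono': 3, 'swap': 2, 'other': 0}
```

Because counts rather than probabilities are stored, a phrase pair already in the table and a phrase pair seen only in the new file are handled by the same addition, and probabilities can be recomputed from the merged counts whenever they are needed.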
Specification