Chunk-based statistical machine translation system
First Claim
1. A translation method, comprising the steps of:
- receiving an input sentence;
- chunking the input sentence into one or more chunks;
- translating the chunks; and
- decoding the translated chunks to generate an output sentence.
Abstract
Traditional statistical machine translation systems learn all of their information from sentence-aligned parallel text and are known to have problems translating between structurally diverse languages. To overcome this limitation, the present invention introduces two-level training, which incorporates syntactic chunking into statistical translation. A chunk-alignment step is inserted between sentence-level and word-level training, so that the two sources of information are trained separately: lexical properties are learned from the aligned chunks, and structural properties are learned from chunk sequences. The system consists of a linguistic processing step, two-level training, and a decoding step that combines chunk translations from multiple sources with multiple language models.
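The chunk, translate, decode flow described in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the part-of-speech lookup `POS`, the rule that a chunk closes at a noun or verb, the tables `CHUNK_TABLE` and `WORD_TABLE`, and the English-to-Spanish example. Decoding here simply keeps the source chunk order; no training or language model is involved.

```python
# Toy illustration only: all tables and the chunk rule below are hypothetical.

# Hypothetical part-of-speech lookup used by the chunk rule.
POS = {"the": "DET", "cat": "NOUN", "sleeps": "VERB", "on": "PREP", "mat": "NOUN"}

# Hypothetical direct chunk translation table (English -> Spanish).
CHUNK_TABLE = {
    ("the", "cat"): ["el", "gato"],
    ("on", "the", "mat"): ["en", "la", "alfombra"],
}

# Hypothetical word-level fallback table.
WORD_TABLE = {"sleeps": "duerme"}


def chunk(words):
    """Split words into chunks; a chunk closes after a noun or a verb."""
    chunks, current = [], []
    for w in words:
        current.append(w)
        if POS.get(w) in {"NOUN", "VERB"}:
            chunks.append(tuple(current))
            current = []
    if current:  # trailing words that never reached a noun or verb
        chunks.append(tuple(current))
    return chunks


def translate_chunk(c):
    """Use the chunk table when possible, else translate word by word."""
    if c in CHUNK_TABLE:
        return list(CHUNK_TABLE[c])
    return [WORD_TABLE.get(w, w) for w in c]


def translate(sentence):
    """Receive a sentence, chunk it, translate each chunk, and join the result."""
    words = sentence.lower().split()
    translated = [translate_chunk(c) for c in chunk(words)]
    return " ".join(w for c in translated for w in c)


print(translate("The cat sleeps on the mat"))
```

The sketch keeps the two information sources apart in the same spirit as the abstract: whole-chunk lookups carry lexical knowledge, while the order of the chunks is handled separately.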
22 Claims
1. A translation method, comprising the steps of:
- receiving an input sentence;
- chunking the input sentence into one or more chunks;
- translating the chunks; and
- decoding the translated chunks to generate an output sentence.
Dependent claims: 2-21
22. A translation method, comprising the steps of:
- receiving an input sentence;
- chunking the input sentence into one or more chunks using chunk rules;
- translating the chunks using a direct chunk translation table and statistical translation model;
- reordering the chunks using multiple language models, one or more search methods, and a chunk head language model; and
- decoding the reordered chunks to generate an output sentence, using multiple language models and a chunk head language model.
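As a rough illustration of the reordering step in claim 22, the sketch below scores every ordering of already-translated chunks with a bigram model over chunk heads and keeps the best one. It is a simplification under stated assumptions, not the claimed method: it uses a single head language model instead of multiple language models, exhaustive permutation search as the only search method, and an invented log-probability table `HEAD_BIGRAM_LOGPROB` with an invented penalty `UNSEEN`.

```python
from itertools import permutations

# Hypothetical bigram log-probabilities over chunk heads (invented values).
HEAD_BIGRAM_LOGPROB = {
    ("<s>", "I"): -0.1,
    ("I", "ate"): -0.2,
    ("ate", "apple"): -0.3,
    ("apple", "</s>"): -0.1,
}
UNSEEN = -5.0  # hypothetical penalty for a head bigram missing from the table


def head_lm_score(heads):
    """Sum bigram log-probabilities over the head sequence with boundary markers."""
    seq = ["<s>", *heads, "</s>"]
    return sum(HEAD_BIGRAM_LOGPROB.get(pair, UNSEEN) for pair in zip(seq, seq[1:]))


def reorder(chunks):
    """Exhaustively search chunk orderings; chunks are (words, head) pairs."""
    return max(
        permutations(chunks),
        key=lambda order: head_lm_score([head for _, head in order]),
    )


def decode(chunks):
    """Reorder the chunks, then join their words into the output sentence."""
    return " ".join(" ".join(words) for words, _ in reorder(chunks))


# Translated chunks still in a subject-object-verb source order.
chunks = [(["I"], "I"), (["an", "apple"], "apple"), (["ate"], "ate")]
print(decode(chunks))
```

Exhaustive search grows factorially with the number of chunks, so it is only workable for a handful of chunks; a practical decoder would need a pruned search.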
Specification