STATISTICAL MACHINE TRANSLATION APPARATUS AND METHOD
Abstract
A statistical machine translation apparatus and method that reflect linguistic information are provided. When a translation model is generated from statistical information on source language sentences and target language sentences during word alignment, the model is built from word alignment results that have been amended using a bilingual dictionary. Further, rather than using the source language sentence and the target language sentence (i.e., the bilingual corpus) directly as material for the translation model, it is first determined whether each morpheme in those sentences is a meaningful content word, and the source and target language sentences are pre-processed based on that determination.
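The dictionary-based amendment of word alignment results described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented procedure: the function name `amend_alignment`, the representation of an alignment as index pairs, and the rule "a dictionary match replaces the statistical links for that source word" are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set, Tuple

# An alignment is a set of (source index, target index) links.
Alignment = Set[Tuple[int, int]]


def amend_alignment(
    source: List[str],
    target: List[str],
    alignment: Alignment,
    dictionary: Dict[str, Set[str]],
) -> Alignment:
    """Hypothetical amendment rule: prefer dictionary pairs over statistical links."""
    amended = set(alignment)
    for i, s_word in enumerate(source):
        translations = dictionary.get(s_word, set())
        # Target positions that the dictionary lists as translations of s_word.
        matches = [j for j, t_word in enumerate(target) if t_word in translations]
        if not matches:
            continue  # no dictionary evidence: keep the statistical links unchanged
        # Replace the statistical links of this source word with the dictionary links.
        amended = {(si, tj) for (si, tj) in amended if si != i}
        amended.update((i, j) for j in matches)
    return amended
```

With a toy sentence pair in which the statistical aligner has crossed two links, for example `source = ["ich", "sehe", "haus"]`, `target = ["i", "see", "house"]`, and `alignment = {(0, 0), (1, 2), (2, 1)}`, a dictionary containing `sehe -> see` and `haus -> house` moves the two crossed links to the dictionary-supported positions and leaves the link for `ich` untouched.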
20 Claims
1. A statistical machine translation apparatus, comprising:
a source language pre-processor configured to analyze morphemes of an input source language sentence and to generate a resulting source language sentence, in which tags representing characteristics per morpheme are attached to the morphemes;
a target language pre-processor configured to analyze morphemes of an input target language sentence and to generate a resulting target language sentence, in which tags representing characteristics per morpheme are attached to the morphemes;
a bilingual dictionary configured to store pairs of source and target language words having the same meaning; and
a translation model generator configured to generate a translation model for the source and target language sentences, using the bilingual dictionary.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A machine translation method, comprising:
pre-processing, by a source language pre-processor, the source language sentence by analyzing morphemes of an input source language sentence, and generating a resulting source language sentence, in which tags representing characteristics per morpheme are attached to the morphemes;
pre-processing, by a target language pre-processor, the target language sentence by analyzing morphemes of an input target language sentence, and generating a resulting target language sentence, in which tags representing characteristics per morpheme are attached to the morphemes; and
generating, by a translation model generator, a translation model of the source and target language sentences, using a bilingual dictionary storing pairs of source and target language words having the same meaning.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
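The pre-processing recited in claims 1 and 11 (analyzing morphemes and attaching a tag to each one) can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the toy lexicon, the tag names, the `morpheme/TAG` output format, and the set of tags treated as content words are hypothetical and are not taken from the patent. The excerpt above does not say how the content-word determination changes the pre-processing, so the sketch only exposes the determination as a helper.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: surface word -> list of (morpheme, tag) pairs.
TOY_LEXICON: Dict[str, List[Tuple[str, str]]] = {
    "cats": [("cat", "NOUN"), ("s", "PLURAL")],
    "walked": [("walk", "VERB"), ("ed", "PAST")],
    "the": [("the", "DET")],
}

# Hypothetical set of tags that mark meaningful content words.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def analyze(sentence: str) -> List[Tuple[str, str]]:
    """Split a sentence into (morpheme, tag) pairs using the toy lexicon."""
    morphemes: List[Tuple[str, str]] = []
    for word in sentence.lower().split():
        # Words missing from the lexicon are kept whole with an UNK tag.
        morphemes.extend(TOY_LEXICON.get(word, [(word, "UNK")]))
    return morphemes


def attach_tags(morphemes: List[Tuple[str, str]]) -> str:
    """Render the resulting sentence with a tag attached to every morpheme."""
    return " ".join(f"{morpheme}/{tag}" for morpheme, tag in morphemes)


def is_content_word(tag: str) -> bool:
    """Report whether a tag marks a content word under the assumed tag set."""
    return tag in CONTENT_TAGS
```

Calling `attach_tags(analyze("The cats walked"))` on this toy lexicon yields a tagged sentence in which each morpheme carries its tag, and `is_content_word` flags `cat` and `walk` as content words. The same functions would be applied to both the source and the target side of a sentence pair, each with its own lexicon.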
Specification