Computerized statistical machine translation with phrasal decoder
Abstract
A computerized system for performing statistical machine translation with a phrasal decoder is provided. The system may include a phrasal decoder trained prior to run-time on a monolingual parallel corpus, the monolingual parallel corpus including a machine translation output of source language documents of a bilingual parallel corpus and a corresponding target human translation output of the source language documents, to thereby learn mappings between the machine translation output and the target human translation output. The system may further include a statistical machine translation engine configured to receive a translation input and to produce a raw machine translation output, at run-time. The phrasal decoder may be configured to process the raw machine translation output, and to produce a corrected translation output based on the learned mappings for display on a display associated with the system.
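The run-time correction step described above can be sketched as follows. This is a minimal illustration, not the patented decoder: it assumes a phrase table held as a dictionary, a greedy longest-match scan in place of a full phrasal decoding search, and hypothetical names (`correct`, `PhraseTable`, `max_len`) and toy scores that do not appear in the source.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase table layout (an assumption for this sketch):
# raw-MT phrase -> (replacement phrase learned from human translations, score).
PhraseTable = Dict[Tuple[str, ...], Tuple[Tuple[str, ...], float]]


def correct(raw_tokens: List[str], table: PhraseTable,
            threshold: float, max_len: int = 3) -> List[str]:
    """Replace phrases in the raw MT output whose score exceeds the threshold."""
    out: List[str] = []
    i = 0
    while i < len(raw_tokens):
        replaced = False
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(raw_tokens) - i), 0, -1):
            entry = table.get(tuple(raw_tokens[i:i + n]))
            if entry is not None and entry[1] > threshold:
                out.extend(entry[0])
                i += n
                replaced = True
                break
        if not replaced:
            # No sufficiently confident mapping: keep the raw token unchanged.
            out.append(raw_tokens[i])
            i += 1
    return out


# Toy data, invented for illustration only.
table: PhraseTable = {
    ("click", "in"): (("click", "on"), 0.9),
    ("the", "buton"): (("the", "button"), 0.4),
}
raw = "please click in the buton".split()
print(" ".join(correct(raw, table, threshold=0.5)))
```

With a threshold of 0.5, only the higher-scoring mapping is applied; lowering the threshold below 0.4 would also apply the second one. A production decoder would typically weigh competing segmentations rather than commit greedily.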
18 Claims
1. A computerized system for performing statistical machine translation, the system comprising:
- a statistical machine translation engine executed on a user computing device, the statistical machine translation engine trained on a bilingual parallel corpus including source language documents and a corresponding target human translation of the source language documents, and configured to receive a translation input and to produce a raw machine translation output, at run-time;
- a phrasal decoder, separate and distinct from the statistical machine translation engine, executed on the user computing device, the phrasal decoder being trained prior to run-time on a monolingual parallel corpus, the monolingual parallel corpus including a machine translation output of the source language documents of the bilingual parallel corpus and the corresponding target human translation output of the source language documents of the bilingual parallel corpus, to thereby learn mappings and build a phrase table by establishing phrase pairs between the machine translation output and the target human translation output, wherein the machine translation output is unedited by human translators, and wherein the phrasal decoder is trained prior to run-time on a developer computing device on which the bilingual parallel corpus is stored, assigning to each phrase pair a statistical score representing a utility of each phrase pair; and
- wherein at run-time on the user computing device the phrasal decoder is configured to process the raw machine translation output, and to produce a corrected translation output based on the learned mappings and the phrase table, programmatically correcting the raw machine translation output if a statistical score for correspondence of the phrase pair is above a predetermined threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computerized method of statistical machine translation, the method comprising:
- training a statistical machine translation engine on a bilingual parallel corpus including source language documents and a corresponding target human translation of the source language documents;
- training a phrasal decoder, separate and distinct from the statistical machine translation engine, on a monolingual parallel corpus, the monolingual parallel corpus including a machine translation output of the source language documents of the bilingual parallel corpus and the corresponding target human translation output of the source language documents of the bilingual parallel corpus, to thereby learn mappings and build a phrase table by establishing phrase pairs between the machine translation output and the target human translation output, wherein the machine translation output is unedited by human translators, assigning to each phrase pair a statistical score representing a utility of each phrase pair;
- performing statistical machine translation via the statistical machine translation engine trained on the bilingual parallel corpus of a translation input to thereby produce a raw machine translation output; and
- processing the raw machine translation output to thereby produce a corrected translation output based on the learned mappings and the phrase table, programmatically correcting the raw machine translation output if a statistical score for correspondence of the phrase pair is above a predetermined threshold.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
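The training step of the method, in which phrase pairs are established between unedited machine translation output and the human translation and each pair is given a score, might look roughly like the sketch below. It rests on several assumptions not found in the claims: token-level diffing with Python's `difflib.SequenceMatcher` stands in for a real phrase alignment procedure, only substitutions are harvested, and the score is a simple relative frequency (how often a machine-translated phrase was rewritten a given way, divided by how often it appeared at all). The function names and sample sentences are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

Phrase = Tuple[str, ...]


def count_ngram(tokens: List[str], phrase: Phrase) -> int:
    """Count occurrences of `phrase` as a contiguous n-gram in `tokens`."""
    n = len(phrase)
    return sum(1 for i in range(len(tokens) - n + 1)
               if tuple(tokens[i:i + n]) == phrase)


def build_phrase_table(
        pairs: List[Tuple[str, str]]) -> Dict[Phrase, Tuple[Phrase, float]]:
    """Build a table from (unedited MT output, human translation) pairs."""
    pair_counts: Counter = Counter()
    mt_sentences: List[List[str]] = []
    for mt, human in pairs:
        mt_tok, hu_tok = mt.split(), human.split()
        mt_sentences.append(mt_tok)
        matcher = SequenceMatcher(a=mt_tok, b=hu_tok, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # A "replace" span is a place where the human translation
            # differs from the MT output: a candidate phrase pair.
            if tag == "replace":
                pair_counts[(tuple(mt_tok[i1:i2]), tuple(hu_tok[j1:j2]))] += 1

    table: Dict[Phrase, Tuple[Phrase, float]] = {}
    for (src, tgt), count in pair_counts.items():
        seen = sum(count_ngram(s, src) for s in mt_sentences)
        score = count / seen  # assumed scoring: relative frequency
        # Keep only the best-scoring replacement for each MT phrase.
        if src not in table or score > table[src][1]:
            table[src] = (tgt, score)
    return table


# Toy corpus, invented for illustration only.
pairs = [
    ("click in the button", "click on the button"),
    ("click in the link", "click on the link"),
    ("log in the system", "log in to the system"),
]
phrase_table = build_phrase_table(pairs)
print(phrase_table)
```

In this toy corpus the phrase "in" is rewritten to "on" in two of its three occurrences, so its score falls below 1; a threshold applied at run-time then decides whether that mapping is trusted enough to use.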
18. A computerized system for performing statistical machine translation, the system comprising:
- a user computing device configured to execute:
- a statistical machine translation engine trained on a bilingual parallel corpus including source language documents and a corresponding target human translation of the source language documents and configured to receive a translation input and to produce a raw machine translation output, at run-time; and
- a phrasal decoder, separate and distinct from the statistical machine translation engine, configured at run-time to process the raw machine translation output, and to produce a corrected translation output for display on a display associated with the system;
- wherein the phrasal decoder, separate and distinct from the statistical machine translation engine, is trained on a developer computing device prior to run-time based on a monolingual parallel corpus, the monolingual parallel corpus including a machine translation output of the bilingual parallel corpus and the corresponding target human translation output included within the bilingual parallel corpus, the machine translation output being unedited by human translators, to thereby learn mappings and build a phrase table by establishing phrase pairs between the machine translation output and the target human translation output, assigning to each phrase pair a statistical score representing a utility of each phrase pair, and
- wherein at run time on the user computing device the corrected translation output is produced based on a plurality of the learned mappings and the phrase table, programmatically correcting the raw machine translation output if a statistical score for correspondence of the phrase pair is above a predetermined threshold.
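The split between training on a developer computing device and correcting on a user computing device implies that the learned phrase table has to travel from one to the other. The claims do not say how; the sketch below assumes a plain JSON file as the transfer format, and the function names, field names, and sample entry are all hypothetical.

```python
import json
from typing import Dict, Tuple

Phrase = Tuple[str, ...]
PhraseTable = Dict[Phrase, Tuple[Phrase, float]]


def export_phrase_table(table: PhraseTable, path: str) -> None:
    """Developer side: write the trained phrase table to a JSON file."""
    rows = [
        {"mt": " ".join(src), "human": " ".join(tgt), "score": score}
        for src, (tgt, score) in table.items()
    ]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, ensure_ascii=False)


def load_phrase_table(path: str) -> PhraseTable:
    """User side: read the phrase table back before run-time correction."""
    with open(path, encoding="utf-8") as f:
        rows = json.load(f)
    return {
        tuple(r["mt"].split()): (tuple(r["human"].split()), r["score"])
        for r in rows
    }
```

Only the phrase table crosses over in this arrangement; the bilingual parallel corpus used for training stays on the developer device.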
Specification