Syntax-based statistical translation model
Abstract
A statistical translation model (TM) may receive a parse tree in a source language as an input and separately output a string in a target language. The TM may perform channel operations on the parse tree using model parameters stored in probability tables. The channel operations may include reordering child nodes, inserting extra words at each node (e.g., NULL words), translating leaf words, and reading off leaf words to generate the string in the target language. The TM may assign a translation probability to the string in the target language.
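The channel operations named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Node` class, the helper names (`reorder`, `insert`, `translate`, `read_off`), the `INS` label for inserted words, and the toy lexicon `T_TABLE`. The operations are applied deterministically here; a real model would choose among them according to its probability tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    label: str
    word: Optional[str] = None                  # set only on leaves
    children: List["Node"] = field(default_factory=list)


def reorder(node: Node, order: Tuple[int, ...]) -> Node:
    """Return a copy of `node` with its children permuted by `order`."""
    return Node(node.label, None, [node.children[i] for i in order])


def insert(node: Node, word: str, side: str = "left") -> Node:
    """Add an extra target-language word as a new left or right child."""
    extra = Node("INS", word)                   # "INS" is a hypothetical marker label
    kids = [extra] + node.children if side == "left" else node.children + [extra]
    return Node(node.label, None, kids)


def translate(node: Node, table: Dict[str, Optional[str]]) -> Node:
    """Translate each source leaf word; inserted words are left untouched."""
    if node.label == "INS":
        return node
    if not node.children:
        return Node(node.label, table.get(node.word, node.word))
    return Node(node.label, None, [translate(c, table) for c in node.children])


def read_off(node: Node) -> List[str]:
    """Collect leaf words left to right; a None word (NULL) is dropped."""
    if not node.children:
        return [] if node.word is None else [node.word]
    return [w for c in node.children for w in read_off(c)]


# Toy lexicon (assumed values, not from the patent).
T_TABLE = {"he": "kare", "likes": "suki", "music": "ongaku"}

np_node = Node("NP", children=[Node("PRP", "he")])
vp_node = Node("VP", children=[Node("VB", "likes"), Node("NN", "music")])

vp_node = reorder(vp_node, (1, 0))              # VB NN -> NN VB
np_node = insert(np_node, "ha", side="right")   # extra word after the NP's children
tree = translate(Node("S", children=[np_node, vp_node]), T_TABLE)
target = read_off(tree)
print(" ".join(target))
```

The functions return new nodes rather than mutating the input, which keeps the source parse tree available for trying alternative reorderings or insertions.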
18 Claims
1. A method for translating natural languages using a statistical translation system, the method comprising:
parsing a first string in a first language into a parse tree using a statistical parser included in the statistical machine translation system, the parse tree including a plurality of nodes, one or more of said nodes including one or more leafs, each leaf including a first word in the first language, the nodes including child nodes having labels;
determining a plurality of possible reorderings of one or more of said child nodes including one or more of the leafs using the statistical translation system, the reordering performed in response to a probability corresponding to a sequence of the child node labels;
determining a probability between 0.0000% and 100.0000%, non-inclusive, of the possible reorderings by the statistical translation system;
determining a plurality of possible insertions of one or more words at one or more of said nodes using the statistical translation system;
determining a probability between 0.0000% and 100.0000%, non-inclusive, of the possible insertions of one or more words at one or more of said nodes by the statistical translation system;
translating the first word at each leaf into a second word corresponding to a possible translation in a second language using the statistical translation system; and
determining a total probability between 0.0000% and 100.0000%, non-inclusive, based on the reordering, the inserting, and the translating by the statistical translation system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
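Claim 1 says only that the total probability is "based on" the reordering, the inserting, and the translating; it does not fix a formula. One common way to combine the three, shown here as an assumption rather than as the claimed method, is to treat the operations at each node as independent and multiply their probabilities. The symbols are introduced for illustration: $\theta = (\theta_1, \dots, \theta_n)$ is the sequence of operations applied to the $n$ nodes of the parse tree $\varepsilon$, with $\theta_i = (\nu_i, \rho_i, \tau_i)$ the insertion, reordering, and translation chosen at node $i$, and $N_i$, $R_i$, $T_i$ the node features each table conditions on.

```latex
P(\theta \mid \varepsilon)
  = \prod_{i=1}^{n} n(\nu_i \mid N_i)\, r(\rho_i \mid R_i)\, t(\tau_i \mid T_i)
```

Because every factor lies strictly between 0 and 1, the product does as well, which is consistent with the non-inclusive range recited in the claim.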
10. An apparatus for translating natural languages, the apparatus comprising:
a reordering module operative to determine a plurality of possible reorderings of nodes in a parse tree, said parse tree including the plurality of possible nodes, one or more of said nodes including a leaf having a first word in a first language, the parse tree including a plurality of parent nodes having labels, each parent node including one or more child nodes having a label, the reordering module including a reorder table having a reordering probability associated with reordering a first child node sequence into a second child node sequence;
an insertion module operative to determine a plurality of possible insertions of an additional word at one or more of said nodes and to determine a probability between 0.0000% and 100.0000%, non-inclusive, of the possible insertions of the additional word;
a translation module operative to translate the first word at each leaf into a second word corresponding to a possible translation in a second language;
a probability module to determine a plurality of possible reorderings of a probability between 0.0000% and 100.0000%, non-inclusive, of said plurality of possible reorderings of one or more of said nodes and to determine a total probability between 0.0000% and 100.0000%, non-inclusive, based on the reorder, the insertion, and the translation; and
a statistical translation system operative to execute the reordering module, the insertion module, the translation module, and the probability module to effectuate functionalities attributed respectively thereto.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
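The reorder table in claim 10 associates a probability with reordering a first child node sequence into a second. A minimal sketch of such a lookup follows. The table layout, the name `possible_reorderings`, the probability values, and the `floor` fallback for unseen reorderings are all assumptions made for illustration, not details from the patent.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple

Labels = Tuple[str, ...]

# Hypothetical reorder table with toy values:
# original child-label sequence -> {reordered sequence: probability}
REORDER_TABLE: Dict[Labels, Dict[Labels, float]] = {
    ("VB", "NN"): {("VB", "NN"): 0.3, ("NN", "VB"): 0.7},
}


def possible_reorderings(
    labels: Sequence[str],
    table: Dict[Labels, Dict[Labels, float]],
    floor: float = 1e-6,
) -> List[Tuple[Labels, float]]:
    """List every reordering of `labels` with its table probability.

    Unseen reorderings get a crude floor value (not renormalized), and
    every value is clamped so it stays strictly between 0 and 1.
    """
    row = table.get(tuple(labels), {})
    scored = []
    # set() collapses permutations that look identical when labels repeat.
    for perm in sorted(set(permutations(labels))):
        p = min(max(row.get(perm, floor), floor), 1.0 - floor)
        scored.append((perm, p))
    return sorted(scored, key=lambda item: -item[1])


result = possible_reorderings(["VB", "NN"], REORDER_TABLE)
for perm, p in result:
    print(perm, p)
```

Keying the table on the label sequence rather than on the words means one row covers every parent node whose children carry the same labels.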
18. An article comprising a non-transitory machine readable medium including machine-executable instructions, the instructions operative to cause a machine to:
parse a first string in a first language into a parse tree, the parse tree including a plurality of nodes, one or more of said nodes including one or more leafs, each leaf including a first word in the first language, the nodes including child nodes having labels;
determine a plurality of possible reorderings of one or more of said child nodes including one or more of the leafs, the reordering performed in response to a probability corresponding to a sequence of the child node labels;
determine a probability between 0.0000% and 100.0000%, non-inclusive, of the plurality of possible reorderings;
determine a plurality of possible insertions of one or more words at one or more of said nodes;
determine a probability between 0.0000% and 100.0000%, non-inclusive, of the plurality of possible insertions at one or more of said nodes;
translate the first word at each leaf into a second word corresponding to a possible translation in a second language; and
determine a total probability between 0.0000% and 100.0000%, non-inclusive, based on the reordering, the inserting, and the translating, wherein the nodes include child nodes having labels, and wherein said reordering comprises reordering one or more of said child nodes in response to a probability corresponding to a sequence of the child node labels.
Specification