Projecting dependencies to generate target language dependency structure
First Claim
1. A method for generating a target language dependency structure, the method comprising:
- accessing a training data corpus having a plurality of pairs of parallel text fragments, each pair of parallel text fragments comprising a source language text fragment and a corresponding target language text fragment;
- obtaining an aligned structure in which lexical items in a source language dependency structure, generated based on a source language text fragment, are aligned with lexical items in a corresponding target language text fragment; and
- projecting dependencies from lexical items in the source language dependency structure to lexical items in the target language text fragment to obtain the target language dependency structure, wherein the target language dependency structure comprises a target language dependency tree and wherein the dependency projection component is configured to re-adjust the target language dependency structure by identifying a node in the target language dependency tree that is out of order, and re-attaching the identified node at a lowest level in the target language dependency tree that yields a target language string with lexical items in the same order that they appear in the target language text fragment.
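The projection step in this claim can be illustrated with a minimal sketch. It assumes a one-to-one word alignment and represents each dependency structure as a mapping from a word position to the position of its head (-1 for the root). The function name `project_dependencies` and this representation are hypothetical choices for illustration, not taken from the patent.

```python
def project_dependencies(source_heads, alignment):
    """Copy head relations from source words onto their aligned target words.

    source_heads: list where source_heads[i] is the head position of source
                  word i, or -1 if word i is the root (assumed encoding).
    alignment:    dict mapping a source position to one target position
                  (one-to-one alignment is an assumption of this sketch).
    Returns a dict mapping each aligned target position to its head position.
    """
    target_heads = {}
    for src_dep, src_head in enumerate(source_heads):
        if src_dep not in alignment:
            continue  # unaligned source words project nothing
        tgt_dep = alignment[src_dep]
        if src_head == -1:
            target_heads[tgt_dep] = -1  # the root projects to the root
        elif src_head in alignment:
            target_heads[tgt_dep] = alignment[src_head]
    return target_heads


# Three source words, the middle one is the root; the last two words
# swap places in the target fragment.
example = project_dependencies([1, -1, 1], {0: 0, 1: 2, 2: 1})
```

In this example the target words at positions 0 and 1 both end up attached to the target word at position 2, which becomes the root.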
Abstract
In one embodiment of the present invention, a decoder receives a dependency tree as a source language input and accesses a set of statistical models that produce outputs combined in a log linear framework. The decoder also accesses a table of treelet translation pairs and returns a target dependency tree based on the source dependency tree, based on access to the table of treelet translation pairs, and based on the application of the statistical models.
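As a rough illustration of combining model outputs in a log-linear framework, the sketch below scores each candidate translation as a weighted sum of per-model log scores and keeps the best one. The model names (`channel`, `lm`), the weights, and the candidate layout are all invented for the example; the abstract does not specify them.

```python
def log_linear_score(log_scores, weights):
    """Weighted sum of per-model log scores (a log-linear combination)."""
    return sum(weights[name] * value for name, value in log_scores.items())


def best_candidate(candidates, weights):
    """Return the candidate whose combined log-linear score is highest."""
    return max(candidates, key=lambda c: log_linear_score(c["log_scores"], weights))


# Hypothetical weights and candidates, for illustration only.
weights = {"channel": 1.0, "lm": 0.5}
candidates = [
    {"name": "A", "log_scores": {"channel": -1.0, "lm": -4.0}},
    {"name": "B", "log_scores": {"channel": -2.0, "lm": -1.0}},
]
winner = best_candidate(candidates, weights)
```

Candidate A scores -3.0 and candidate B scores -2.5 under these weights, so B is selected.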
30 Claims
1. A method for generating a target language dependency structure, the method comprising:
- accessing a training data corpus having a plurality of pairs of parallel text fragments, each pair of parallel text fragments comprising a source language text fragment and a corresponding target language text fragment;
- obtaining an aligned structure in which lexical items in a source language dependency structure, generated based on a source language text fragment, are aligned with lexical items in a corresponding target language text fragment; and
- projecting dependencies from lexical items in the source language dependency structure to lexical items in the target language text fragment to obtain the target language dependency structure, wherein the target language dependency structure comprises a target language dependency tree and wherein the dependency projection component is configured to re-adjust the target language dependency structure by identifying a node in the target language dependency tree that is out of order, and re-attaching the identified node at a lowest level in the target language dependency tree that yields a target language string with lexical items in the same order that they appear in the target language text fragment.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system for generating a target language dependency structure, the system comprising:
- a corpus processing system configured to obtain an aligned structure in which lexical items in a source language dependency structure, indicative of a source language training data text fragment, are aligned with lexical items in a corresponding training data target language text fragment that is a translation of the source language text fragment; and
- a dependency projection component configured to project dependencies from lexical items in the source language dependency structure to the lexical items in the target language text fragment to obtain the target language dependency structure, wherein the target language dependency structure comprises a target language dependency tree and wherein the dependency projection component is configured to re-adjust the target language dependency structure by identifying a node in the target language dependency tree that is out of order, and re-attaching the identified node at a lowest level in the target language dependency tree that yields a target language string with lexical items in the same order that they appear in the target language text fragment.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
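One possible reading of the re-adjustment step is sketched below. It treats a node as "out of order" when some word lying between it and its head is not dominated by that head, and repairs the tree by lifting the node one level at a time toward the root until no such arc remains. This interpretation, the head-list encoding (-1 marks the root), and all function names are assumptions made for illustration; the claim itself does not spell out the procedure.

```python
def is_dominated_by(heads, node, ancestor):
    """True if `ancestor` lies on the path from `node` up to the root."""
    while node != -1:
        if node == ancestor:
            return True
        node = heads[node]
    return False


def find_out_of_order(heads):
    """Return a node whose arc spans a word not dominated by its head, else None."""
    for dep, head in enumerate(heads):
        if head == -1:
            continue
        low, high = sorted((dep, head))
        for between in range(low + 1, high):
            if not is_dominated_by(heads, between, head):
                return dep
    return None


def reattach_out_of_order(heads):
    """Lift out-of-order nodes to their grandparent until none remain.

    Assumes a single-rooted tree; arcs leaving the root always pass the
    check above, so a node is never lifted past the root.
    """
    heads = list(heads)
    dep = find_out_of_order(heads)
    while dep is not None:
        heads[dep] = heads[heads[dep]]  # re-attach one level higher
        dep = find_out_of_order(heads)
    return heads


# Word 1 hangs from word 3, but word 2 (the root) sits between them.
fixed = reattach_out_of_order([2, 3, -1, 2])
```

Because each node is lifted only as far as needed, it ends up at the deepest ancestor that restores the surface word order. In the example, word 1 is re-attached to word 2.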
21. A computer readable medium storing computer readable instructions which, when executed by a computer, cause a computer to perform a method for generating a target language dependency structure, the method comprising:
- accessing a training data corpus having a plurality of pairs of parallel text fragments, each pair of parallel text fragments comprising a source language text fragment and a corresponding target language text fragment;
- obtaining an aligned structure in which lexical items in a source language dependency structure, generated based on a source language text fragment, are aligned with lexical items in a corresponding target language text fragment; and
- projecting dependencies from lexical items in the source language dependency structure to the aligned lexical items in the target language text fragment to obtain the target language dependency structure, wherein a lexical item in the source language dependency structure is aligned with a plurality of lexical items in the target language text fragment, and wherein projecting comprises;
  - identifying a parent node for the plurality of lexical items in the target language dependency structure;
  - identifying a right-most one of the plurality of lexical items in the target language text fragment;
  - assigning the right-most lexical item as dependent from the parent node; and
  - assigning a remainder of the plurality of lexical items in the target language text fragment as dependent from the right-most lexical item;
- determining whether the target language dependency structure is properly indicative of the target language text fragment; and
- if not, adjusting the target language dependency structure so it is properly indicative of the target language text fragment.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
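The one-to-many case recited in this claim can be sketched in a few lines. The function below takes the target positions aligned to a single source word and the already-identified parent position, attaches the right-most target word to the parent, and attaches the remaining words to that right-most word. The function name and the dict-of-heads return value are hypothetical, chosen only for illustration.

```python
def attach_one_to_many(target_positions, parent):
    """Attach several target words that align to one source word.

    target_positions: positions in the target fragment aligned to the
                      same source word (assumed non-empty).
    parent:           position of the parent node in the target structure.
    Returns a dict mapping each of the target positions to its head.
    """
    rightmost = max(target_positions)
    heads = {rightmost: parent}  # right-most word depends on the parent
    for position in target_positions:
        if position != rightmost:
            heads[position] = rightmost  # the rest depend on the right-most word
    return heads


# Target words 1, 2 and 3 align to one source word whose projected
# parent is target word 0.
attachments = attach_one_to_many([3, 1, 2], parent=0)
```

Here word 3 is attached to word 0, and words 1 and 2 are attached to word 3.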
Specification