Training for a text-to-text application which uses string to tree conversion for training and decoding
Abstract
Training and translation using trees and/or subtrees as parts of the rules. A target language is word aligned with a source language, and at least one of the languages is parsed into trees. The trees are used for training, by aligning conversion steps, forming a manual set of information representing the conversion steps and then learning rules from that reduced set. The rules include subtrees as parts thereof, and are used for decoding, along with an n-gram language model and a syntax based language model.
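The abstract and claims refer to an alignment graph made of a source string, a target tree, and a word alignment between them. The following is a minimal sketch of one way such a graph could be analyzed, assuming a frontier-node test of the kind used in published syntax-based rule extraction work; the text here does not spell out this procedure. The names `Node`, `leaf_indices`, and `frontier_nodes`, and the French-English example, are hypothetical and chosen only for illustration. A tree node passes the test when the source words aligned to its subtree form a span that no word outside the subtree aligns into, so the node could serve as the root of a subtree-to-substring rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    leaf_index: Optional[int] = None  # target word position, set on leaves only


def leaf_indices(node: Node) -> Set[int]:
    """Return the target word positions covered by this subtree."""
    if not node.children:
        return {node.leaf_index}
    covered: Set[int] = set()
    for child in node.children:
        covered |= leaf_indices(child)
    return covered


def frontier_nodes(root: Node, alignment: Set[Tuple[int, int]]) -> List[Tuple[str, int, int]]:
    """List (label, first_source_pos, last_source_pos) for nodes whose
    source span is not crossed by alignments from outside the subtree.

    `alignment` holds (source_pos, target_leaf_index) pairs.
    Nodes with no aligned source words are skipped for simplicity.
    """
    found: List[Tuple[str, int, int]] = []

    def visit(node: Node) -> None:
        inside = leaf_indices(node)
        span = {s for s, t in alignment if t in inside}
        outside = {s for s, t in alignment if t not in inside}
        if span:
            lo, hi = min(span), max(span)
            if not any(lo <= s <= hi for s in outside):
                found.append((node.label, lo, hi))
        for child in node.children:
            visit(child)

    visit(root)
    return found


def word(text: str, position: int) -> Node:
    return Node(text, leaf_index=position)


# Hypothetical example: French source string, English target tree.
source = ["il", "ne", "va", "pas"]
tree = Node("S", [
    Node("NP", [Node("PRP", [word("he", 0)])]),
    Node("VP", [
        Node("AUX", [word("does", 1)]),
        Node("RB", [word("not", 2)]),
        Node("VB", [word("go", 3)]),
    ]),
])
# (source position, target leaf position); "does" is left unaligned.
alignment = {(0, 0), (1, 2), (2, 3), (3, 2)}

for label, lo, hi in frontier_nodes(tree, alignment):
    print(label, source[lo:hi + 1])
```

In this example the `RB` node over "not" fails the test: its aligned source words "ne" and "pas" surround "va", which aligns to "go" outside that subtree. A rule covering "ne ... pas" would therefore have to be rooted higher up, at `VP`.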
402 Citations
40 Claims
1. A computer implemented method, comprising:
- executing, by a processor, instructions stored in memory to use information that is based on corpora of string-based training information to create a plurality of rules that are based on the training information; and
- performing source language string to target language tree translation using an n-gram language model, a syntax-based language model, and the plurality of rules for an executable text to text application stored in memory, the plurality of rules including syntactic translation rules that are each associated with a probability for a translation, wherein a syntactic translation rule is determined by analyzing an alignment graph that includes a source string, a target tree, and an alignment of the source string and the target tree.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A computer implemented method, comprising:
- executing, by a processor, instructions stored in memory to align items of information in first and second different languages to form aligned information, wherein at least the information in the first language is in a tree form; and
- executing, by a processor, instructions stored in memory to extract rules from the aligned information, the rules utilizable in conjunction with an n-gram model and a syntax based language model, the rules configured for use with performing source language string to target language tree translation using an n-gram language model, a syntax-based language model, and the plurality of rules for an executable text to text application stored in memory, the plurality of rules including syntactic translation rules that are each associated with a probability for a translation, wherein a syntactic translation rule is determined by analyzing an alignment graph that includes a source string, a target tree, and an alignment of the source string and the target tree.
Dependent claims: 20, 21, 22
23. A computer implemented method, comprising:
- obtaining a string in a source language to be translated into a target language; and
- executing, by a processor, instructions stored in memory to translate the string into the target language using at least one rule set, an n-gram language model, and a syntax based language model, wherein the at least one rule set comprises both rules that include at least parts of subtrees and probabilities, a rule set including translation rules in a subtree to substring rule form for a machine translation, the translation rules being associated with probabilities for the rules, wherein a translation rule is determined by analyzing an alignment graph that includes a source string, a target tree, and an alignment of the source string and the target tree.
Dependent claims: 24, 25, 26
27. A system comprising:
- a training part executable by a processor and stored in memory, the training part receiving a corpora of string-based training information to create a plurality of rules that are based on the training information, the rules including parts of trees as components of the rules; and
- a text to text application portion that uses an n-gram language model, a syntax-based language model, and the rules for a text to text application performing source language string to target language tree translation, the rules including translation rules in a subtree to substring rule form for a machine translation, the translation rules being associated with probabilities for the rules, wherein a translation rule is determined by analyzing an alignment graph that includes a source string, a target tree, and an alignment of the source string and the target tree.
Dependent claims: 28, 29, 30, 31, 32
33. A system, comprising:
- a training module, executable by a processor and stored in a memory, that aligns items of information in first and second different languages to form aligned information and extracts rules from the aligned information, wherein at least the information in the first language is in a tree form, and the rules are utilizable in conjunction with an n-gram model and a syntax based language model, the tree form utilized in a source language string to target language tree translation, the rules including translation rules in a subtree to substring rule form for a machine translation, the translation rules being associated with probabilities for the rules, wherein a translation rule is determined by analyzing an alignment graph that includes a source string, a target tree, and an alignment of the source string and the target tree.
Dependent claims: 34, 35, 36
37. A system, comprising:
- a memory that stores at least one rule set that comprises both rules that include at least parts of subtrees and probabilities; and
- a decoding part that obtains a string in a source language to be translated into a target language, receives the at least one rule set, and uses the at least one rule set, an n-gram language model, and a syntax based language model to translate the string into the target language, the decoding part performing source language string to target language tree translation, a rule set including translation rules in a subtree to substring rule form for a machine translation, the translation rules being associated with probabilities for the rules, wherein a translation rule is determined by analyzing an alignment graph that includes a source string, a target tree, and an alignment of the source string and the target tree.
Dependent claims: 38, 39, 40
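The decoding claims combine three knowledge sources: translation rules with probabilities, an n-gram language model, and a syntax-based language model. The claims do not state how these are combined, so the sketch below assumes a weighted sum of log probabilities, a common choice in statistical machine translation. The function names, the weights, the toy bigram table, and the candidate values are all hypothetical.

```python
import math


def bigram_logprob(words, bigram_probs, floor=1e-6):
    """Log probability of a word sequence under a toy bigram model.

    Unseen bigrams receive a small floor probability (an assumption
    made here in place of real smoothing).
    """
    padded = ["<s>"] + list(words) + ["</s>"]
    return sum(
        math.log(bigram_probs.get((prev, cur), floor))
        for prev, cur in zip(padded, padded[1:])
    )


def derivation_score(rule_probs, ngram_logprob, syntax_logprob,
                     weights=(1.0, 1.0, 1.0)):
    """Weighted sum of rule, n-gram, and syntax-model log probabilities."""
    w_rule, w_ngram, w_syntax = weights
    rule_logprob = sum(math.log(p) for p in rule_probs)
    return (w_rule * rule_logprob
            + w_ngram * ngram_logprob
            + w_syntax * syntax_logprob)


# Hypothetical bigram table for the target language.
bigram_probs = {
    ("<s>", "he"): 0.5,
    ("he", "does"): 0.4,
    ("does", "not"): 0.5,
    ("not", "go"): 0.4,
    ("go", "</s>"): 0.5,
}

# Each candidate: target words, probabilities of the rules used to build
# it, and a log probability a syntax-based model might assign to its tree.
candidates = [
    (["he", "does", "not", "go"], [0.5, 0.4], math.log(0.2)),
    (["he", "not", "does", "go"], [0.6, 0.5], math.log(0.01)),
]

scored = [
    (derivation_score(rules, bigram_logprob(words, bigram_probs), syntax), words)
    for words, rules, syntax in candidates
]
best_score, best_words = max(scored)
print(" ".join(best_words), round(best_score, 3))
```

The second candidate uses slightly more probable rules, but both language models penalize its word order, so the first candidate scores higher. In a real decoder the weights would be tuned rather than fixed at 1.0.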
Specification