Training for a text-to-text application which uses string to tree conversion for training and decoding
Abstract
Training and translation using trees and/or subtrees as parts of the rules. A target language is word aligned with a source language, and at least one of the languages is parsed into trees. The trees are used for training, by aligning conversion steps, forming a manual set of information representing the conversion steps and then learning rules from that reduced set. The rules include subtrees as parts thereof, and are used for decoding, along with an n-gram language model and a syntax based language model.
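As a rough illustration of what a rule that includes subtrees as parts thereof might look like, the following Python sketch pairs a source-language string pattern with a target-language tree fragment. The `Rule` class, the `x0` variable convention, the tuple encoding of trees, the probability value, and the French-to-English example are all assumptions made for illustration; none of them is taken from the patent text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: a tree is either a word (str) or a (label, children) pair.
Tree = Union[str, Tuple[str, list]]


@dataclass
class Rule:
    source: Tuple[str, ...]  # source-side string; "x0" marks a variable
    target: Tree             # target-side tree fragment using the same variable
    prob: float              # rule probability (illustrative value only)


def substitute(fragment: Tree, bindings: Dict[str, Tree]) -> Tree:
    """Replace variables in a tree fragment with already-built subtrees."""
    if isinstance(fragment, str):
        return bindings.get(fragment, fragment)
    label, children = fragment
    return (label, [substitute(child, bindings) for child in children])


def tree_yield(tree: Tree) -> List[str]:
    """Read the words off the leaves of a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words: List[str] = []
    for child in tree[1]:
        words.extend(tree_yield(child))
    return words


# Assumed example rule: source string "ne x0 pas" maps to a VP subtree.
rule = Rule(
    source=("ne", "x0", "pas"),
    target=("VP", [("AUX", ["does"]), ("RB", ["not"]), "x0"]),
    prob=0.5,
)

# Bind the variable to a subtree that was built for the inner source word.
tree = substitute(rule.target, {"x0": ("VB", ["go"])})
print(tree_yield(tree))
```

The point of the sketch is only that the target side of a rule is a piece of tree rather than a flat string, so applying the rule builds syntactic structure while it produces target words.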
169 Citations
43 Claims
1. A method comprising:
using information that is based on corpora of string-based training information to create a plurality of rules that are based on the training information, and where the rules include parts of trees as parts of the rules; and
using said rules including said parts of trees for a text to text application.
Dependent claims: 2-20
21. A method, comprising:
aligning items of information in first and second different languages to form aligned information, where at least said information in said first language is in a tree form; and
extracting rules from said aligned information.
Dependent claims: 22-24
25. A method, comprising:
obtaining a string in a source language to be translated into a target language;
using at least one rule set which includes both rules that include at least parts of subtrees and also include probabilities, along with at least an ngram language model and a syntax based language model, to translate said string into said target language.
Dependent claims: 26-28
29. A system comprising:
a training part, receiving a corpora of string-based training information to create a plurality of rules that are based on the training information, and where the rules include parts of trees as components of the rules; and
a text to text application portion, using said rules including said parts of trees for a text to text application.
Dependent claims: 30-35
36. A system, comprising:
a training part, aligning items of information in first and second different languages to form aligned information, where at least said information in said first language is in a tree form, and extracting rules from said aligned information.
Dependent claims: 37-39
40. A system, comprising:
a memory, storing at least one rule set which includes both rules that include at least parts of subtrees and also include probabilities, and
a decoding part, obtaining a string in a source language to be translated into a target language, and receiving said at least one rule set and using said at least one rule set along with at least both of an ngram language model and a syntax based language model, to translate said string into said target language.
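Claims 25 and 40 recite translating with rule probabilities together with an n-gram language model and a syntax based language model, but the claim language does not say how the three are combined. The Python sketch below assumes one common choice, a weighted sum of log-probabilities, purely for illustration. The function names, the add-one smoothed bigram model, the production-probability table standing in for the syntax based model, the floor value, and all counts and probabilities are hypothetical.

```python
import math
from collections import Counter


def ngram_logprob(words, bigrams, unigrams, vocab_size):
    """Add-one smoothed bigram log-probability of a word sequence."""
    total, prev = 0.0, "<s>"
    for word in words:
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
        prev = word
    return total


def syntax_logprob(tree, production_probs, floor=1e-6):
    """Sum of log-probabilities of the productions used in a tree.

    Trees are (label, children) pairs; leaves are plain strings.
    Unseen productions get the (assumed) floor probability.
    """
    if isinstance(tree, str):
        return 0.0
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    total = math.log(production_probs.get((label, rhs), floor))
    return total + sum(syntax_logprob(c, production_probs, floor) for c in children)


def candidate_score(rule_probs, words, tree, lm, grammar, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of rule, n-gram, and syntax log-probabilities (an assumption)."""
    bigrams, unigrams, vocab_size = lm
    features = (
        sum(math.log(p) for p in rule_probs),
        ngram_logprob(words, bigrams, unigrams, vocab_size),
        syntax_logprob(tree, grammar),
    )
    return sum(w * f for w, f in zip(weights, features))


# Toy models with made-up counts and probabilities.
bigrams = Counter({("<s>", "does"): 1, ("does", "not"): 1, ("not", "go"): 1})
unigrams = Counter({"<s>": 1, "does": 1, "not": 1, "go": 1})
lm = (bigrams, unigrams, 3)
grammar = {
    ("VP", ("AUX", "RB", "VB")): 0.5,
    ("AUX", ("does",)): 1.0,
    ("RB", ("not",)): 1.0,
    ("VB", ("go",)): 1.0,
}

good_tree = ("VP", [("AUX", ["does"]), ("RB", ["not"]), ("VB", ["go"])])
bad_tree = ("VP", [("RB", ["not"]), ("AUX", ["does"]), ("VB", ["go"])])

good = candidate_score([0.5], ["does", "not", "go"], good_tree, lm, grammar)
bad = candidate_score([0.5], ["not", "does", "go"], bad_tree, lm, grammar)
print(good > bad)
```

With these toy numbers, the candidate whose word order and tree shape match the models scores higher, which is the role the two language models play alongside the rule probabilities.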
Specification