Statistical machine translation processing
Abstract
A method of statistical machine translation (SMT) is provided. The method comprises generating reordering knowledge based on the syntax of a source language (SL) and a number of alignment matrices that map sample SL sentences to sample target language (TL) sentences. The method further comprises receiving an SL word string and parsing the SL word string into a parse tree that represents the syntactic properties of the SL word string. The nodes on the parse tree are reordered based on the generated reordering knowledge in order to provide reordered word strings. The method further comprises translating a number of reordered word strings to create a number of TL word strings, and identifying a statistically preferred TL word string as a preferred translation of the SL word string.
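The flow described in the abstract (parse, reorder child nodes, score the reorderings, translate the preferred ones, pick the best translation) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Node` class, the layout of the `knowledge` table (keyed by parent label, child labels, and child order), the floor probability for unseen rules, and the `toy_translate` stand-in for a real decoder.

```python
from dataclasses import dataclass, field
from itertools import permutations, product
from typing import Optional


@dataclass
class Node:
    """Hypothetical parse-tree node: a leaf carries a word, an inner node carries children."""
    label: str
    word: Optional[str] = None
    children: list = field(default_factory=list)


def reorderings(node, knowledge, floor=1e-3):
    """Yield (words, probability) for every permutation of child nodes below `node`."""
    if not node.children:
        yield [node.word], 1.0
        return
    labels = tuple(child.label for child in node.children)
    for order in permutations(range(len(node.children))):
        # Probability of this child order; rules missing from the table get `floor`.
        p_order = knowledge.get((node.label, labels, order), floor)
        options = [list(reorderings(node.children[i], knowledge, floor)) for i in order]
        for combo in product(*options):
            words = [w for child_words, _ in combo for w in child_words]
            p = p_order
            for _, child_p in combo:
                p *= child_p
            yield words, p


def translate_best(tree, knowledge, translate, n_best=2):
    """Translate the n_best most probable reorderings and return the top-scoring translation."""
    ranked = sorted(reorderings(tree, knowledge), key=lambda c: c[1], reverse=True)
    scored = []
    for words, p_reorder in ranked[:n_best]:
        target, p_translation = translate(words)
        scored.append((p_reorder * p_translation, target))
    return max(scored, key=lambda s: s[0])[1]


def toy_translate(words):
    """Placeholder decoder (assumption): upper-cases each word and returns a score of 1.0."""
    return [w.upper() for w in words], 1.0


# Toy subject-verb-object sentence and a made-up rule table that prefers verb-final order.
tree = Node("S", children=[
    Node("NP", word="he"),
    Node("VP", children=[Node("V", word="eats"), Node("NP", word="apples")]),
])
knowledge = {
    ("S", ("NP", "VP"), (0, 1)): 0.9,
    ("S", ("NP", "VP"), (1, 0)): 0.1,
    ("VP", ("V", "NP"), (0, 1)): 0.2,
    ("VP", ("V", "NP"), (1, 0)): 0.8,
}

print(translate_best(tree, knowledge, toy_translate))
```

Enumerating every permutation grows factorially with the number of children, so this sketch is only practical for small trees; it is meant to show the data flow, not an efficient search.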
20 Claims
1. Instructions on a computer-usable medium wherein the instructions when executed cause a computer system to perform a method of statistical machine translation (SMT), said method comprising:
receiving a word string in a first natural language;
parsing said word string into a parse tree comprising a plurality of child nodes;
reordering said plurality of child nodes to provide a plurality of reordered word strings;
evaluating each of said plurality of reordered word strings using a reordering knowledge, wherein said reordering knowledge is based on a syntax of said first natural language; and
translating a plurality of preferred reordered word strings from said plurality of reordered word strings to a second natural language based on said evaluating; and
selecting a statistically preferred translation of said word string from among translations of said plurality of preferred reordered word strings.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A statistical machine translation (SMT) system comprising:
a parsing module configured to receive a word string in a first natural language and parse said word string into a parse tree comprising a plurality of child nodes;
a preprocessing module coupled with said parsing module, said preprocessing module configured to access said plurality of child nodes and reorder words from said word string based on a syntax of said first natural language to provide a plurality of reordered word strings; and
a decoding module coupled with said preprocessing module, said decoding module configured to access said plurality of reordered word strings, identify a statistically preferred reordered word string based on reordering probabilities associated with said plurality of reordered word strings, and generate a target word string based on a word sequence of said statistically preferred reordered word string.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A language reordering system for use in statistical machine translation (SMT), said language reordering system comprising:
a training database for storing training data comprising sentences in a first natural language paired with sentences in a second natural language;
an alignment model configured to match words and phrases in said first natural language to words and phrases in said second natural language, said alignment model utilizing said training data to generate training samples identifying syntactic differences between said first natural language and said second natural language; and
a preprocessing module coupled with said training database and said alignment model, said preprocessing module configured to create a body of reordering knowledge based on said training samples.
Dependent claims: 16, 17, 18, 19, 20
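One way to picture how a body of reordering knowledge might be derived from aligned training samples is the Python sketch below. It is a sketch under stated assumptions, not the method the claims define: the sample format (parent label, child labels, child source spans, alignment links), the use of mean aligned target position to order the children, and relative-frequency estimation are all choices made here for illustration, and the function names `child_order` and `estimate_knowledge` are hypothetical.

```python
from collections import Counter, defaultdict


def child_order(child_spans, alignment):
    """Return the child indices sorted by mean aligned target position.

    child_spans: list of (start, end) source-word index ranges, end exclusive.
    alignment:   set of (source_index, target_index) links.
    Returns None when some child has no alignment link at all.
    """
    means = []
    for start, end in child_spans:
        targets = [t for s, t in alignment if start <= s < end]
        if not targets:
            return None
        means.append(sum(targets) / len(targets))
    return tuple(sorted(range(len(means)), key=means.__getitem__))


def estimate_knowledge(samples):
    """Turn observed child orders into relative frequencies per (parent, child labels)."""
    counts = defaultdict(Counter)
    for parent, labels, spans, alignment in samples:
        order = child_order(spans, alignment)
        if order is not None:
            counts[(parent, tuple(labels))][order] += 1
    knowledge = {}
    for (parent, labels), order_counts in counts.items():
        total = sum(order_counts.values())
        for order, n in order_counts.items():
            knowledge[(parent, labels, order)] = n / total
    return knowledge


# Made-up samples for a VP node with children V (source word 1) and NP (source word 2).
inverted = {(0, 0), (1, 2), (2, 1)}   # verb aligns after its object in the target
straight = {(0, 0), (1, 1), (2, 2)}   # same order in both languages
samples = [("VP", ["V", "NP"], [(1, 2), (2, 3)], inverted)] * 3
samples += [("VP", ["V", "NP"], [(1, 2), (2, 3)], straight)]

knowledge = estimate_knowledge(samples)
print(knowledge)
```

With three inverted samples and one straight sample, the table assigns the inverted child order three quarters of the probability mass for this node type. A real system would also need smoothing and handling for unaligned or crossing spans, which this sketch omits.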
Specification