Method for translating documents from one language into another using a database of translations, a terminology dictionary, a translation dictionary, and a machine translation system
First Claim
1. A method for translating a document in an input language into an output language, the method comprising:
a) segmenting the document into fragments;
b) for each document fragment for which a translation is readily available, translating said document fragment based on said readily available translation; and
c) for each remaining untranslated document fragment for which a translation is not readily available, translating said untranslated document fragment based on a model-based machine translation technique by:
generating, using a computer processor, a graph of generalized constituents from a lexical-morphological structure of the untranslated document fragment by using one or more syntactical descriptions, semantic descriptions or lexical descriptions of the input language;
generating one or more syntactic trees from the graph of generalized constituents;
generating one or more rating scores for the one or more syntactic trees;
using linguistic descriptions of the input language or output language, and the one or more rating scores, building a language-independent semantic structure to represent the meaning of each untranslated document fragment; and
providing syntactically coherent output based on the respective language-independent semantic structure, wherein syntactically coherent output includes substitutions that agree both between themselves and with other words in the syntactically coherent output, morphologically and syntactically.
Abstract
In one embodiment, the invention provides a method for translating a document in an input language into an output language comprising: a) for each document fragment for which a translation is readily available, translating said document fragment based on said readily available translation; and b) for each remaining untranslated fragment for which a translation is not readily available, translating said untranslated fragment based on a model-based machine translation technique. A translation is readily available if a search reveals at least one matching translation for the document fragment in a translation database.
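The two-branch flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the sentence-splitting rule, the dictionary standing in for the translation database, and the names `segment`, `translate_document`, and `placeholder_mt` are all assumptions made for illustration, and the model-based machine translation step is represented only by a caller-supplied function.

```python
import re
from typing import Callable, Dict, List


def segment(document: str) -> List[str]:
    """Split a document into sentence-like fragments.

    Assumption: fragments end at '.', '!' or '?' followed by whitespace.
    The source does not specify a segmentation rule.
    """
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def translate_document(
    document: str,
    translation_db: Dict[str, List[str]],
    machine_translate: Callable[[str], str],
) -> List[str]:
    """Translate each fragment from the database when a match exists,
    otherwise fall back to the supplied machine translation function."""
    output = []
    for fragment in segment(document):
        # "Readily available": the search returns at least one match.
        matches = translation_db.get(fragment, [])
        if matches:
            output.append(matches[0])
        else:
            output.append(machine_translate(fragment))
    return output


# Hypothetical stand-in for the model-based machine translation step.
def placeholder_mt(fragment: str) -> str:
    return f"<MT:{fragment}>"


db = {"Hello.": ["Bonjour."]}
result = translate_document("Hello. How are you?", db, placeholder_mt)
print(result)
```

Here the dictionary lookup is an exact-match search, so only fragments stored verbatim count as readily available; every other fragment is routed to the fallback function.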
39 Claims
24. A system comprising:
a processor; and
a memory coupled to the processor, the memory comprising a translation application which when executed causes the system to perform a set of instructions for translating a document in an input language into an output language, the instructions comprising:
a) segmenting the document into document fragments;
b) for each document fragment for which a translation is readily available, translating said document fragment based on said readily available translation; and
c) for each remaining untranslated document fragment for which a translation is not readily available, translating said untranslated document fragment based on a model-based machine translation technique by:
generating a graph of generalized constituents from a lexical-morphological structure of the untranslated document fragment by using one or more syntactical descriptions, semantic descriptions or lexical descriptions of the input language;
generating one or more syntactic trees from the graph of generalized constituents;
generating one or more rating scores for the one or more syntactic trees;
using linguistic descriptions of the input language or output language, and the one or more rating scores, building a language-independent semantic structure to represent the meaning of each untranslated document fragment; and
providing syntactically coherent output based at least in part upon the respective language-independent semantic structure, wherein syntactically coherent output includes substitutions that agree both between themselves and with other words in the syntactically coherent output, morphologically and syntactically.
29. The system of claim 28, wherein the search is an exact search or each matching translation corresponds exactly to the document fragment.
37. A physical, non-transitory computer storage medium having stored thereon a program which, when executed by a processor, performs instructions for translating a document in an input language into an output language, the instructions comprising:
a) segmenting the document into document fragments;
b) for each document fragment for which a translation is readily available, translating said document fragment based on said readily available translation; and
c) for each remaining untranslated document fragment for which a translation is not readily available, translating said untranslated document fragment based on a model-based machine translation technique by:
generating a graph of generalized constituents from a lexical-morphological structure of the untranslated document fragment by using one or more syntactical descriptions, semantic descriptions or lexical descriptions of the input language;
generating one or more syntactic trees from the graph of generalized constituents;
generating one or more rating scores for the one or more syntactic trees;
using linguistic descriptions of the input language or output language, and the one or more rating scores, building a language-independent semantic structure to represent the meaning of each untranslated document fragment; and
providing syntactically coherent output based at least in part upon the respective language-independent semantic structure.
Specification