Method for constructing a knowledge base, knowledge base system, machine translation method and system therefor
Abstract
A method and apparatus for machine translation are disclosed. For each translation pattern, the words that satisfy the pattern and their corresponding translations are extracted from stored translation cases. The invention generalizes these words by converting them into more general concepts, so that a single case can apply to many inputs. A partial thesaurus is generated that describes the hierarchy of these words and the concepts that are their hypernyms. From frequency information obtained for the words, an importance value is computed for each concept. These importance values determine whether a word in a translation case can be generalized to a concept.
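The partial thesaurus described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the toy thesaurus, the placeholder target tokens such as "DRINK", the function names, and the choice to count each word's translation at every one of its hypernyms. The patent text above does not prescribe these details.

```python
from collections import Counter

# Hypothetical toy thesaurus: each word or concept maps to its hypernym
# (None marks the root). Not taken from the patent.
THESAURUS = {
    "beer": "beverage",
    "wine": "beverage",
    "beverage": "thing",
    "medicine": "drug",
    "drug": "thing",
    "bus": "vehicle",
    "vehicle": "thing",
    "thing": None,
}


def hypernym_chain(word, thesaurus):
    """Return the word followed by its hypernyms, most specific first."""
    chain = []
    node = word
    while node is not None:
        chain.append(node)
        node = thesaurus.get(node)
    return chain


def build_partial_thesaurus(cases, thesaurus):
    """Keep only the nodes reachable from the case words and count how often
    each (node, translation) pair occurs (assumed counting scheme)."""
    parents = {}
    freq = Counter()
    for source_word, translation in cases:
        for node in hypernym_chain(source_word, thesaurus):
            parents[node] = thesaurus.get(node)
            freq[(node, translation)] += 1
    return parents, freq


# Hypothetical (source word, translation) pairs extracted for one pattern.
cases = [("beer", "DRINK"), ("medicine", "DRINK"), ("bus", "RIDE")]
parents, freq = build_partial_thesaurus(cases, THESAURUS)
```

In this sketch "wine" is left out of `parents` because no case mentions it, which is what makes the extracted thesaurus partial.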
13 Claims
1. A method for constructing a knowledge base for responding to an input source-language word pattern that meets a translation pattern description and outputting a translation in a target language, from translation cases stored in a case base, by using a thesaurus describing hierarchies of words and concepts in the source language, comprising the steps of:
(a) searching the case base to find one or more translation cases using a selected translation pattern description, the selected translation pattern description having a source language pattern with source values, an associated target language pattern with target values, and links that relate respective source and target values, the translation pattern description being selected when the input-source language word pattern matches the source language pattern;
(b) extracting one or more matching translation cases having words in the source language that meet one or more translation pattern description conditions and extracting a respective corresponding translation for the extracted translation cases, the extracted matching translation cases and corresponding translations being a set of translation pattern cases, the extracted matching translation cases having the words in the source language as source values and the corresponding translations having a target value determined by a translation of their respective source values;
(c) generating a partial thesaurus for one or more translation pattern cases in the set of translation pattern cases, the partial thesaurus created by extracting parts of a thesaurus that contain the source values and hypernyms of the source values and obtaining for each of said source values and hypernyms a corresponding translation, the source value and corresponding translation comprising a pair, the pair and a corresponding frequency of occurrence of the pair comprising a word node in a hierarchy of word nodes in the partial thesaurus;
(d) computing an importance value of a translation for each word node contained in said partial thesaurus on the basis of said frequency of the word node; and
(e) determining whether to convert a word contained in a translation pattern case into a hypernym in said partial thesaurus by using the importance value of a corresponding translation computed for the hypernym, and, if possible, converting the word into the hypernym in order to generalize said translation pattern case.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
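Steps (d) and (e) can be sketched in Python as follows. The claim states only that the importance value is computed from the frequency of the word node, so the formula used here (the share of a node's pair frequency that carries a given translation), the 0.9 threshold, the data, and all names are assumptions for illustration rather than the patented computation.

```python
from collections import Counter

# Hypothetical partial thesaurus: node -> hypernym (None marks the root).
PARENTS = {
    "beer": "beverage", "tea": "beverage", "beverage": "thing",
    "bus": "vehicle", "vehicle": "thing", "thing": None,
}

# Hypothetical pair frequencies: (node, translation) -> count, with each
# word's count already added to its hypernyms.
FREQ = Counter({
    ("beer", "DRINK"): 3, ("tea", "DRINK"): 2, ("beverage", "DRINK"): 5,
    ("bus", "RIDE"): 4, ("vehicle", "RIDE"): 4,
    ("thing", "DRINK"): 5, ("thing", "RIDE"): 4,
})


def importance(node, translation, freq):
    """Assumed formula: fraction of the pair frequency at `node` that
    belongs to `translation`."""
    total = sum(count for (n, _), count in freq.items() if n == node)
    return freq[(node, translation)] / total if total else 0.0


def generalize(word, translation, parents, freq, threshold=0.9):
    """Climb the hierarchy while the hypernym's importance for `translation`
    stays at or above `threshold`; return the highest node reached."""
    best = word
    node = parents.get(word)
    while node is not None and importance(node, translation, freq) >= threshold:
        best = node
        node = parents.get(node)
    return best


print(generalize("beer", "DRINK", PARENTS, FREQ))  # beverage
print(generalize("bus", "RIDE", PARENTS, FREQ))    # vehicle
```

With this data the climb stops below "thing", because that node mixes two translations (5 of 9 for "DRINK"), so converting a word to it would lose the information needed to choose a translation.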
10. A translation knowledge base system comprising means for accumulating generalized cases for outputting translations into a target language in response to the input of a source-language word pattern comprising:
(a) a translation pattern description means for searching a case base to find one or more translation cases that match the translation pattern description means, the translation pattern description means having a source language pattern with source values, an associated target language pattern with target values, and links that relate respective source and target values;
(b) a set of translation pattern cases extracted by matching translation cases having words in the source language that meet one or more translation pattern description conditions and having a corresponding translation for the extracted translation cases;
(c) a partial thesaurus for each translation pattern case in the set of translation pattern cases, the partial thesaurus representing the hierarchies of words contained in the translation pattern cases and concepts that are hypernyms of said words in said thesaurus, the partial thesaurus also having information for each of the words being a corresponding translation, the word and corresponding translation comprising a pair, the pair and a corresponding frequency of occurrence of the pair comprising a word node in a hierarchy of word nodes in the partial thesaurus;
(d) an importance value of a translation for each concept contained in said partial thesaurus computed from the frequency of the word node; and
(e) a means for determining whether to convert a word contained in a pattern case into a hypernym in said partial thesaurus by using the importance value of a corresponding translation computed for the hypernym, and, if possible, converting the word into the hypernym in order to generalize the pattern case.
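Elements (a) and (b) can be pictured with a small Python sketch of a pattern description and the extraction of matching cases. The representation is hypothetical: the class name, the use of `None` for an open slot, links stored as source position to target position, and the sample case base are all assumptions, and the target language pattern is reduced to the positions the links point at.

```python
from dataclasses import dataclass


@dataclass
class PatternDescription:
    """Hypothetical, simplified pattern description."""
    source_pattern: tuple  # literal source words, or None for an open slot
    links: dict            # source position -> linked target position


def extract_pairs(pattern, case_base):
    """For each case whose source side fits the pattern, return the
    source value -> target value pairs given by the links."""
    extracted = []
    for source, target in case_base:
        if len(source) != len(pattern.source_pattern):
            continue
        fits = all(
            want is None or want == got
            for want, got in zip(pattern.source_pattern, source)
        )
        if fits:
            extracted.append({source[s]: target[t] for s, t in pattern.links.items()})
    return extracted


# Hypothetical case base of (source words, target words) with placeholder
# target tokens; the target side puts the object first and the verb second.
CASE_BASE = [
    (("take", "beer"), ("BEER", "DRINK")),
    (("take", "bus"), ("BUS", "RIDE")),
    (("drink", "beer"), ("BEER", "DRINK")),
]

take_pattern = PatternDescription(source_pattern=("take", None), links={0: 1, 1: 0})
print(extract_pairs(take_pattern, CASE_BASE))
# [{'take': 'DRINK', 'beer': 'BEER'}, {'take': 'RIDE', 'bus': 'BUS'}]
```

The extracted pairs are the kind of source value and corresponding translation that element (c) attaches to word nodes.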
11. A method for machine translation from an input expressed in a source language into an output expressed in a target language by using a knowledge base for responding to an input source language word pattern comprising the steps of:
(a) searching the knowledge base to find one or more translation cases using a selected translation pattern description, the selected translation pattern description having a source language pattern with source values, an associated target language pattern with target values, and links that relate respective source and target values, the translation pattern description being selected when the input source language word pattern matches the source language pattern;
(b) extracting one or more matching translation cases having words in the source language that meet one or more translation pattern description conditions and extracting a respective corresponding translation for the extracted translation cases, the extracted matching translation cases and corresponding translations being a set of translation pattern cases, the extracted matching translation cases having the words in the source language as source values and the corresponding translations having a target value determined by a translation of their respective source values;
(c) generating a partial thesaurus for one or more translation pattern cases in the set of translation pattern cases, the partial thesaurus created by extracting parts of a thesaurus that contain the source values and hypernyms of the source values and obtaining for each of said source values and hypernyms a corresponding translation, the source value and corresponding translation comprising a pair, the pair and a corresponding frequency of occurrence of the pair comprising a word node in a hierarchy of word nodes in the partial thesaurus;
(d) computing an importance value of a translation for each word node contained in said partial thesaurus on the basis of said frequency of the word node and using the importance value to determine whether to generalize the translation pattern case by converting a word to a hypernym;
(e) obtaining a target value translation by using the importance values of the word nodes; and
(f) using the obtained target value in a target language pattern as the translation of the source language input.
Dependent claims: 12, 13
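Steps (e) and (f) can be sketched as a lookup over generalized cases. This is only one possible reading: the rule of taking the matching concept with the highest importance, the target pattern string, the data, and every name below are assumptions for illustration, and the source object is copied into the output untranslated to keep the sketch short.

```python
# Hypothetical thesaurus fragment: word -> hypernym (None marks the root).
PARENTS = {
    "juice": "beverage", "beverage": "thing",
    "taxi": "vehicle", "vehicle": "thing",
    "pen": "thing", "thing": None,
}

# Hypothetical generalized cases for one pattern ("take <object>"):
# concept in the object slot -> (importance value, target value for the verb).
GENERALIZED_CASES = {
    "beverage": (1.0, "DRINK"),
    "vehicle": (0.95, "RIDE"),
    "thing": (0.55, "TAKE"),
}

# Hypothetical target language pattern with the object before the verb.
TARGET_PATTERN = "{obj} {verb}"


def ancestors(word, parents):
    """Yield the word and then each of its hypernyms up to the root."""
    node = word
    while node is not None:
        yield node
        node = parents.get(node)


def pick_target(word, parents, cases):
    """Among the concepts that cover `word`, return the target value with the
    highest importance (ties go to the more specific concept)."""
    best = None
    for node in ancestors(word, parents):
        if node in cases:
            score, target = cases[node]
            if best is None or score > best[0]:
                best = (score, target)
    return best[1] if best else None


def translate_take(obj, parents=PARENTS, cases=GENERALIZED_CASES):
    """Fill the target pattern, or return None when no case covers `obj`."""
    verb = pick_target(obj, parents, cases)
    return TARGET_PATTERN.format(obj=obj, verb=verb) if verb else None


print(translate_take("juice"))  # juice DRINK
print(translate_take("pen"))    # pen TAKE
```

An input word that never appeared in a stored case, such as "juice" here, still receives a translation because it falls under a generalized concept.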
Specification