Apparatus for and method of selecting a target language equivalent of a predicate word in a source language word string in a machine translation system
First Claim
1. An apparatus for selecting a target language equivalent of a predicate word in a source language word string for use in a machine translation system, said source language word string including the predicate word and an associated non-predicate word, the apparatus comprising:
- memory means for storing therein a plurality of records, each record including a first data corresponding to an entry word in a source language which is predicate or non-predicate, a second data corresponding to at least one target language word which is predicate or non-predicate and is equivalent to said entry word, and one of a third data, in the form of a first set of numerical values, corresponding to semantic features of a plurality of non-predicate words related to at least one case governed by at least one predicate target language word equivalent to said entry word for said entry word being predicate and a fourth data, in the form of a second set of numerical values, corresponding to a semantic feature of at least one non-predicate target language word equivalent to said entry word for said entry word being non-predicate; and
- processor means coupled to said memory means, the processor means including means for fetching therefrom the third data of said plurality of non-predicate words serving as arguments for said at least one case governed by the at least one predicate target language word equivalent to the predicate word in said source language word string and the fourth data of one of the non-predicate target language words which is equivalent to the associated non-predicate word in said source language word string, means for carrying out numerical operations between said fetched third data and said fetched fourth data to provide a plurality of operation results, means for selecting one of said operation results according to predetermined criteria, and means for determining said target language equivalent of said source language predicate word in said source language word string by determining which of the at least one predicate target language words has a corresponding third data providing said selected operation result.
Abstract
An apparatus for and a method of selecting a target language equivalent of a predicate word in a source language word string for use in a machine translation system in which use is made of a dictionary having records, each including data on an entry word of a predicate source language word, on predicate target language words equivalent to the entry source language word and on semantic features of non-predicate words related to a case governed by the predicate target language words or including data on an entry word of a non-predicate source language word, on a non-predicate target language word equivalent to the entry source language word and on semantic features of the non-predicate target language word. A processor is coupled to the dictionary for fetching therefrom the semantic feature data of the non-predicate words serving as arguments for the case governed by the predicate target language words equivalent to the predicate word in the source language word string and the semantic feature data of one of the non-predicate target language words which is equivalent to the non-predicate word in the source language word string, carrying out numerical operations between the fetched data to provide a plurality of operation results, and selecting one of the operation results according to predetermined criteria and determining that one of the predicate target language words which has the data of the non-predicate words providing the selected operation result as the target language equivalent of the source language predicate word.
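The selection described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the "numerical operation" is taken to be an inner product (the operation named in claims 22 and 23), the "predetermined criteria" is taken to be "largest result wins", and the record classes, the English-to-Japanese word pairs, and the three-element feature layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PredicateRecord:
    """Dictionary record for a source-language predicate (hypothetical layout)."""
    entry: str
    # target predicate -> semantic feature values of the arguments it expects
    candidates: Dict[str, List[float]]


@dataclass
class NonPredicateRecord:
    """Dictionary record for a source-language non-predicate word (hypothetical layout)."""
    entry: str
    target: str
    features: List[float]  # semantic feature values of the target word


def inner_product(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def select_predicate(pred: PredicateRecord,
                     arg: NonPredicateRecord) -> Tuple[str, Dict[str, float]]:
    """Score every candidate target predicate against the argument's features
    and return the best candidate together with all scores."""
    scores = {t: inner_product(v, arg.features) for t, v in pred.candidates.items()}
    best = max(scores, key=scores.get)  # assumed criterion: largest result
    return best, scores


# Assumed feature order: [instrument, sport, human]
PLAY = PredicateRecord("play", {"hiku": [1.0, 0.0, 0.0], "suru": [0.0, 1.0, 0.0]})
PIANO = NonPredicateRecord("piano", "piano", [1.0, 0.0, 0.0])
TENNIS = NonPredicateRecord("tennis", "tenisu", [0.0, 1.0, 0.0])

best_for_piano, _ = select_predicate(PLAY, PIANO)
best_for_tennis, _ = select_predicate(PLAY, TENNIS)
print(best_for_piano, best_for_tennis)
```

Storing one feature vector per candidate target predicate keeps the choice a simple argmax over scores; the claims leave both the operation and the criterion open, so other choices would fit the same structure.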
23 Claims
-
1. An apparatus for selecting a target language equivalent of a predicate word in a source language word string for use in a machine translation system, said source language word string including the predicate word and an associated non-predicate word, the apparatus comprising:
-
memory means for storing therein a plurality of records, each record including a first data corresponding to an entry word in a source language which is predicate or non-predicate, a second data corresponding to at least one target language word which is predicate or non-predicate and is equivalent to said entry word, and one of a third data, in the form of a first set of numerical values, corresponding to semantic features of a plurality of non-predicate words related to at least one case governed by at least one predicate target language word equivalent to said entry word for said entry word being predicate and a fourth data, in the form of a second set of numerical values, corresponding to a semantic feature of at least one non-predicate target language word equivalent to said entry word for said entry word being non-predicate; and
processor means coupled to said memory means, the processor means including means for fetching therefrom the third data of said plurality of non-predicate words serving as arguments for said at least one case governed by the at least one predicate target language word equivalent to the predicate word in said source language word string and the fourth data of one of the non-predicate target language words which is equivalent to the associated non-predicate word in said source language word string, means for carrying out numerical operations between said fetched third data and said fetched fourth data to provide a plurality of operation results, means for selecting one of said operation results according to predetermined criteria, and means for determining said target language equivalent of said source language predicate word in said source language word string by determining which of the at least one predicate target language words has a corresponding third data providing said selected operation result.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
-
-
11. A machine translation system for translating a source language word string to a corresponding target language word string, the source language word string including a predicate word and an associated non-predicate word, the system comprising:
-
input means for inputting a source language word string; a first dictionary for storing therein a plurality of first records, each first record including a first data corresponding to an entry word in a source language which is predicate or non-predicate, a second data corresponding to at least one target language word which is predicate or non-predicate and is equivalent to said entry word, and one of a third data, in the form of a first set of numerical values, corresponding to semantic features of a plurality of non-predicate words related to at least one case governed by at least one predicate target language word, equivalent to said entry word for said entry word being predicate and a fourth data, in the form of a second set of numerical values, corresponding to a semantic feature of at least one non-predicate target word equivalent to said entry word for said entry word being non-predicate; a second dictionary for storing therein a plurality of second records, each second record including a sixth data corresponding to an entry word in the source language which is predicate, a seventh data corresponding to an associated non-predicate word in the source language in a co-occurrence relation with said sixth data entry word, an eighth data corresponding to a predicate target language word equivalent to said sixth data entry word and a ninth data corresponding to a non-predicate target language word equivalent to said seventh data non-predicate word; processing means coupled to said input means and said first and second dictionaries for reading out data therefrom to provide a target language word string, said processing means including means for fetching from said first dictionary the third data of said plurality of non-predicate words serving as arguments for said case governed by the at least one predicate target language word equivalent to the predicate word in said source language word string and the fourth data of one of the non-predicate target language words which is equivalent to the associated non-predicate word in said source language word string and for carrying out numerical operations between said fetched third data and said fetched fourth data to provide a plurality of operation results, means for selecting one of said operation results according to predetermined criteria and for determining which of the at least one predicate target language words has a corresponding third data providing said selected operation result thereby providing said target language equivalent of said source language predicate word in said source language word string, and, means for fetching from said second dictionary said eighth data in that one of said second records in which said predicate and non-predicate words in said source language word string hit the sixth and seventh data, respectively, so that said eighth data target language word equivalent to the sixth data entry word in said second record is selected as said target language equivalent of said source language predicate word in preference to said second data predicate target language words; and
output means coupled to said processing means for outputting the target language word string.
-
-
12. A method of selecting an equivalent of a predicate word in a source language word string by the use of a dictionary in machine translation, the dictionary being operatively coupled with a processor, said source language word string including the predicate word and an associated non-predicate word, said dictionary having a plurality of records, each record including a first data corresponding to an entry word in a source language which is predicate or non-predicate, a second data corresponding to at least one target language word which is predicate or non-predicate and is equivalent to said entry word, and one of a third data, in the form of a first set of numerical values, corresponding to semantic features of a plurality of non-predicate words related to at least one case governed by at least one predicate target language word equivalent to said entry word for said entry word being predicate and a fourth data, in the form of a second set of numerical values, corresponding to a semantic feature of at least one non-predicate target language word equivalent to said entry word for said entry word being non-predicate, the method comprising steps of:
-
finding by the processor, in said dictionary, the third data of the plurality of non-predicate words in one of said records which includes, for the first data entry word, the predicate word in said source language word string; finding by the processor, in said dictionary, the fourth data of the non-predicate target language word in one of said records which includes, for the first data entry word, the associated non-predicate word in said source language word string; carrying out, by the processor, numerical operations between the numerical values of said found third and fourth data to provide a plurality of operation results; selecting, by the processor, one of said operation results according to predetermined criteria; and
determining, by the processor, as a target language equivalent of said predicate word in said source language word string, one of said predicate target language words equivalent to said predicate source language word which has a corresponding third data providing said selected operation result.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
-
-
20. A method of selecting an equivalent of a predicate word in a source language word string by the use of first and second dictionaries in machine translation, each of the first and second dictionaries operatively coupled with a processor, said source language word string including the predicate word and an associated non-predicate word, said first dictionary having a plurality of first records, each first record including a first data corresponding to an entry word in a source language which is predicate or non-predicate, a second data corresponding to at least one target language word which is predicate or non-predicate and is equivalent to said entry word, and one of a third data, in the form of a first set of numerical values, corresponding to semantic features of a plurality of non-predicate words related to at least one case governed by at least one predicate target language word equivalent to said entry word for said entry word being predicate and a fourth data, in the form of a second set of numerical values, corresponding to a semantic feature of at least one non-predicate target language word equivalent to said entry word for said entry word being non-predicate, said second dictionary having a plurality of second records, each second record including a fifth data corresponding to an entry word in the source language which is predicate, a sixth data corresponding to an associated non-predicate word in the source language in a co-occurrence relation with said fifth data entry word, a seventh data corresponding to a predicate target language word equivalent to said fifth data entry word and an eighth data corresponding to a non-predicate target language word equivalent to said sixth data non-predicate word, the method comprising the steps of:
-
searching, by the processor, said first dictionary for a one first record which includes a first data corresponding to said predicate source language word and for another first record that includes a first data corresponding to said non-predicate source language word; searching, by the processor, said second dictionary for a one second record which includes a fifth data corresponding to said predicate source language word and a sixth data corresponding to said associated non-predicate source language word; if said second record is found in said dictionary in said second searching step, determining, by the processor, as said target language equivalent of said predicate word in said source language word string, the predicate target language word defined by the seventh data in said found second record in said second dictionary; if said second record is not found in said dictionary in said second searching step, carrying out, by the processor, numerical operations between the numerical values of the third data in said searched first record including a first data corresponding to said predicate source language word and the numerical values of said fourth data in said searched first record including a first data corresponding to said non-predicate source language word to provide a plurality of operation results; selecting, by the processor, one of said operation results according to predetermined criteria; and
determining, by the processor, as said target language equivalent of said predicate word in said source language word string, one of said predicate target language words equivalent to said predicate source language word which has a corresponding third data providing said selected operation result.
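The control flow of claim 20 (consult the co-occurrence dictionary first, fall back to semantic-feature scoring only when no record is found) can be sketched as follows. This is an illustrative reading, not the patented implementation: the dictionary layouts, the sample entries, the two-element feature order, the use of an inner product as the numerical operation, and "largest result" as the selection criterion are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical second dictionary:
# (source predicate, source non-predicate) -> (target predicate, target non-predicate)
COOCCURRENCE: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("take", "medicine"): ("nomu", "kusuri"),
}

# Hypothetical first dictionary, split into two maps for brevity.
# Assumed feature order: [graspable object, vehicle]
PREDICATE_FEATURES: Dict[str, Dict[str, List[float]]] = {
    "take": {"toru": [1.0, 0.0], "noru": [0.0, 1.0]},
}
NON_PREDICATE_FEATURES: Dict[str, List[float]] = {
    "medicine": [1.0, 0.0],
    "bus": [0.0, 1.0],
}


def select_equivalent(pred: str, arg: str) -> str:
    """Return the target-language equivalent of `pred` given its argument `arg`."""
    hit = COOCCURRENCE.get((pred, arg))
    if hit is not None:
        # A co-occurrence record was found: its predicate takes precedence.
        return hit[0]
    # No record found: score each candidate by inner product (assumed operation).
    arg_vec = NON_PREDICATE_FEATURES[arg]
    scores = {
        target: sum(a * b for a, b in zip(vec, arg_vec))
        for target, vec in PREDICATE_FEATURES[pred].items()
    }
    return max(scores, key=scores.get)  # assumed criterion: largest result


print(select_equivalent("take", "medicine"))  # found in the co-occurrence dictionary
print(select_equivalent("take", "bus"))       # falls back to feature scoring
```

In this toy data, feature scoring alone would pick "toru" for "take medicine"; the co-occurrence record overrides it, which is the preference the claim describes.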
-
-
21. A method of selecting an equivalent of a predicate word in a source language word string by the use of first and second dictionaries in machine translation, said first and second dictionaries being operatively coupled with a processor, said source language word string including the predicate word and an associated non-predicate word, said first dictionary having a plurality of first records, each first record including a first data corresponding to an entry word in a source language which is predicate or non-predicate, a second data corresponding to at least one target language word which is predicate or non-predicate and is equivalent to said entry word, and one of a third data, in the form of a first set of numerical values, corresponding to semantic features of a plurality of non-predicate words related to at least one case governed by at least one predicate target language word equivalent to said entry word for said entry word being predicate, and a fourth data, in the form of a second set of numerical values, corresponding to a semantic feature of at least one non-predicate target language word equivalent to said entry word for said entry word being non-predicate and a fifth data representative of weights, each weight corresponding to one of said plurality of cases governed by said at least one predicate target language word equivalent to the predicate word in said source language word string, said second dictionary having a plurality of second records, each second record including a sixth data corresponding to an entry word in the source language which is predicate, a seventh data corresponding to an associated non-predicate word in the source language in a co-occurrence relation with said sixth data entry word, an eighth data corresponding to a predicate target language word equivalent to said sixth data entry word, a ninth data corresponding to a non-predicate target language word equivalent to said seventh data non-predicate word and a tenth data corresponding to a case of said seventh data non-predicate word, each of those records in said first dictionary in which said first data entry words are predicate further including an eleventh data representative of weights, each weight corresponding to the case of said seventh data non-predicate word in one of said second records in said second dictionary, the method comprising the steps of:
-
searching said first dictionary, by the processor, for a one first record which includes a first data corresponding to said predicate source language word and for another first record which includes a first data corresponding to said non-predicate source language word; searching said second dictionary, by the processor, for second records each of which includes a fifth data corresponding to said predicate source language word, a sixth data corresponding to said non-predicate source language word and a tenth data corresponding to the case of said non-predicate source language word; carrying out, by the processor, first numerical operations between the numerical values of the third data in said searched first record and the numerical values of said fourth data in said searched first record to provide a plurality of first operation results; carrying out, by the processor, second numerical operations, for each predicate target language word defined by the second data in said searched first record, between said first operation results and the weights of said fifth data in said searched first record and third numerical operations between said weights of said eleventh data in said searched first record and a numerical representation of the tenth data in said searched second record; obtaining, by the processor, for said each predicate target language word defined by the second data in said searched first record, a sum of said second numerical operations result and said third numerical operations result; selecting, by the processor, one of said sums according to predetermined criteria; and
determining, by the processor, as said target language equivalent of said predicate word in said source language word string, one of said predicate target language words equivalent to said predicate source language word which has a corresponding third data providing said selected operation result.
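One possible reading of the weighted combination in claim 21 is sketched below. The claim does not fix the arithmetic, so everything concrete here is an assumption: the "second numerical operations" are modelled as case weight times per-case semantic result, the "third numerical operations" as co-occurrence weight times a 0/1 indicator that a co-occurrence record exists for that case, and the selection criterion as "largest sum". All names and numbers are hypothetical.

```python
from typing import Dict


def combined_score(case_results: Dict[str, float],
                   case_weights: Dict[str, float],
                   cooc_weights: Dict[str, float],
                   cooc_hits: Dict[str, float]) -> float:
    """Sum of a weighted semantic term and a weighted co-occurrence term."""
    # Assumed "second operations": per-case semantic result scaled by its case weight.
    semantic = sum(case_weights[c] * r for c, r in case_results.items())
    # Assumed "third operations": co-occurrence weight scaled by a 0/1 hit indicator.
    cooccurrence = sum(w * cooc_hits.get(c, 0.0) for c, w in cooc_weights.items())
    return semantic + cooccurrence


# Hypothetical inputs for two candidate target predicates of one source predicate.
candidates = {
    "toru": dict(case_results={"object": 0.8}, case_weights={"object": 1.0},
                 cooc_weights={"object": 0.5}, cooc_hits={}),
    "nomu": dict(case_results={"object": 0.5}, case_weights={"object": 1.0},
                 cooc_weights={"object": 0.5}, cooc_hits={"object": 1.0}),
}

totals = {target: combined_score(**args) for target, args in candidates.items()}
best = max(totals, key=totals.get)  # assumed criterion: largest sum
print(best, totals)
```

Under this reading a co-occurrence hit raises a candidate's score instead of overriding the semantic score outright, so a strong semantic match can still win against a weakly weighted co-occurrence record.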
-
-
22. A method for selecting a target language equivalent of a source language word string, the source language word string comprising a source predicate word and a source non-predicate word, the method being useful in a system comprising a first dictionary including combinations of selected source predicate and non-predicate words and co-occurrence target equivalents thereof, a second dictionary comprising at least one first record and at least one second record, the first record comprising the source predicate word, at least one target predicate word and at least one first semantic feature vector, each first semantic feature vector corresponding to the target predicate words and indicating a nature of each of the target predicate words, the second record comprising the source non-predicate word, a corresponding target non-predicate word, and a corresponding second semantic feature vector indicating a nature of the target non-predicate word, the method performed by a processor, the method comprising the steps of:
-
inputting the source language word string comprising the source predicate and non-predicate words; searching the combinations of the first dictionary to determine if a combination matches the source predicate and non-predicate words; selecting the co-occurrence target equivalents corresponding to the combination as the target language equivalent if a match occurs; searching the second dictionary to obtain the first and second records respectively comprising the source predicate word and the source non-predicate word if no match occurs; calculating respective inner products of the second semantic feature vector included in the second record with the each first semantic feature vectors included in the first record; determining which of the at least one target predicate words corresponds to a larger inner product; and
selecting that target predicate word as the target language equivalent.
-
-
23. An apparatus for selecting a target language equivalent of a source language word string, the source language word string comprising a source predicate word and a source non-predicate word, the apparatus useful in a system comprising a first dictionary including combinations of selected source predicate and non-predicate words and co-occurrence target equivalents thereof, a second dictionary comprising at least one first record and at least one second record, the first record comprising the source predicate word, at least one target predicate word and at least one first semantic feature vector, each first semantic feature vector corresponding to the target predicate words and indicating a nature of each of the target predicate words, the second record comprising the source non-predicate word, a corresponding target non-predicate word, and a corresponding second semantic feature vector indicating a nature of the target non-predicate word, the apparatus comprising:
-
means for inputting the source language word string comprising the source predicate and non-predicate words; means for searching the combinations of the first dictionary to determine if a combination matches the source predicate and non-predicate words; means for selecting the co-occurrence target equivalents corresponding to the combination as the target language equivalent if a match occurs; means for searching the second dictionary to obtain the first and second records respectively comprising the source predicate word and the source non-predicate word if no match occurs; means for calculating respective inner products of the second semantic feature vector including in the second record with the each first semantic feature vectors included in the first record; means for determining which of the at least one target predicate words corresponds to a largest inner product; and
means for selecting that target predicate word as the target language equivalent.
-
Specification