Multiple-parts-of-speech disambiguating method and apparatus for machine translation system
Abstract
A machine translation system comprises input means for inputting a sentence written in a natural language, a processor for parsing the input sentence, a word dictionary memory referred to by the processor, and a memory for storing multiple-parts-of-speech disambiguating rules in the form of a table. The parts of speech which words capable of functioning as multiple parts of speech should take in the input sentence are determined by applying the multiple-parts-of-speech disambiguating rules in consideration of the array of parts of speech of successive words. Additionally, the rate of appearance of each part of speech which a word of the input sentence can function as is calculated in advance, and a part of speech which cannot be determined by consulting the disambiguating rule table is determined according to whether its rate of appearance exceeds a predetermined threshold value.
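The threshold test mentioned at the end of the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `choose_by_rate`, the use of raw occurrence counts as input, and the threshold value 0.6 are hypothetical and are not taken from the patent text.

```python
def choose_by_rate(counts, threshold=0.6):
    """Pick a part of speech by its rate of appearance, or return None.

    counts: mapping from part of speech to how often the word was observed
            with that part of speech (hypothetical corpus counts).
    threshold: assumed value for the "predetermined threshold".
    """
    total = sum(counts.values())
    if total == 0:
        return None  # no statistics available for this word
    # Rate of appearance of the most frequent part of speech.
    pos, count = max(counts.items(), key=lambda item: item[1])
    rate = count / total
    # Accept the part of speech only if its rate exceeds the threshold.
    return pos if rate > threshold else None
```

With these assumptions, a word seen 80 times as a noun and 20 times as a verb is resolved to the noun, while a 55/45 split is left undetermined.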
15 Claims
1. A part-of-speech disambiguating apparatus for a machine translation system, comprising:
input means for inputting a sentence constituted by words to be disambiguated with respect to the parts of speech which the words function as, respectively;
a dictionary memory for storing a number of words belonging to a language used in said sentence together with respective parts of speech and appearance frequencies thereof;
a rule table memory for storing a table containing parts-of-speech disambiguating rules for specifying the parts of speech of the words each of which can function as multiple parts of speech from an array of parts of speech of successive words in which said words whose parts of speech are to be specified are included; and
a processor connected to said input means, said dictionary memory and said rule table memory, wherein said processor receives said sentence from said input means and determines the parts of speech of the words which are contained in said sentence and capable of functioning as multiple parts of speech, on the basis of data read out from said rule table memory, while determining the parts of speech on the basis of said appearance frequencies read out from said dictionary memory for those words whose parts of speech could not be determined on the basis of the data read out from said rule table memory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
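As a rough illustration of how the dictionary memory and the rule table memory of claim 1 might be organized, the following Python sketch resolves one word by rule first and by appearance frequency otherwise. Everything concrete in it is an assumption made for the example: the tag names, the frequency values, the `Rule` layout (which looks only at the preceding word), and the names `lookup_rule` and `decide` do not come from the claim.

```python
from dataclasses import dataclass

# Hypothetical dictionary memory: word -> {part of speech: appearance frequency}.
# Words, tags and frequency values are invented for illustration.
DICTIONARY = {
    "the": {"DET": 1.0},
    "can": {"AUX": 0.7, "NOUN": 0.2, "VERB": 0.1},
}


@dataclass(frozen=True)
class Rule:
    """One row of the assumed rule table: if the preceding word has part of
    speech `left`, an ambiguous word that can be `result` is assigned `result`."""
    left: str
    result: str


# Hypothetical rule table memory.
RULE_TABLE = [Rule(left="DET", result="NOUN")]


def lookup_rule(left_pos, candidates):
    """Return the part of speech chosen by the first matching rule, or None."""
    for rule in RULE_TABLE:
        if rule.left == left_pos and rule.result in candidates:
            return rule.result
    return None


def decide(word, left_pos):
    """Resolve one word: rule table first, appearance frequency as fallback."""
    candidates = DICTIONARY[word]
    if len(candidates) == 1:
        return next(iter(candidates))  # the word is not ambiguous
    by_rule = lookup_rule(left_pos, candidates)
    if by_rule is not None:
        return by_rule
    # No rule applied: fall back to the most frequent part of speech.
    return max(candidates, key=candidates.get)
```

In this sketch "can" after a determiner is resolved to a noun by the rule table, whereas after any other part of speech no rule matches and the most frequent reading is used.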
11. A method of disambiguating a part of speech which a word capable of functioning as multiple parts of speech should be, in a machine translation system, comprising the steps of:
(a) preparing a dictionary memory for storing a number of words of a natural language together with parts of speech which said words can function as and frequencies of appearance of said parts of speech;
(b) preparing a rule table memory for storing a table containing part-of-speech disambiguating rules for disambiguating a part of speech of a word which is included in an array of parts of speech of successive words and which can function as multiple parts of speech;
(c) inputting a sentence written in said natural language and to be disambiguated with respect to the parts of speech which the words of said sentence should function as;
(d) determining parts of speech which the words each capable of functioning as multiple parts of speech should be in said sentence, on the basis of data read out from said rule table memory;
(e) determining parts of speech which the words capable of functioning as multiple parts of speech should be in said sentence, on the basis of said appearance frequency of the parts of speech which said words can function as and which is read from said dictionary memory, when said parts of speech could not be determined at said step (d);
(f) returning to said step (d) to repetitively execute said steps (d), (e) and (f) until the parts of speech which all the words should be in said sentence have been determined, in case the part of speech which the word capable of functioning as multiple parts of speech should be in said sentence has been determined at said step (e).
Dependent claims: 12, 13, 14, 15
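The loop formed by steps (d), (e) and (f) can be sketched in Python as below. This is a minimal sketch under several assumptions that the claim does not state: rules are simple `(left part of speech, result)` pairs that consult only the preceding word, step (e) resolves one word at a time by picking its most frequent part of speech, and the function name, tags and frequencies are hypothetical.

```python
def disambiguate(words, dictionary, rules):
    """Return one part of speech per word.

    dictionary: word -> {part of speech: appearance frequency}  (step (a))
    rules: list of (left_pos, result_pos) pairs                  (step (b))
    """
    # candidates[i] holds the parts of speech still possible for words[i].
    candidates = [set(dictionary[w]) for w in words]

    def apply_rules():
        """Step (d): resolve words whose left neighbour is already unambiguous."""
        changed = True
        while changed:
            changed = False
            for i, cands in enumerate(candidates):
                if len(cands) == 1 or i == 0 or len(candidates[i - 1]) != 1:
                    continue
                (left,) = candidates[i - 1]
                for rule_left, rule_result in rules:
                    if rule_left == left and rule_result in cands:
                        candidates[i] = {rule_result}
                        changed = True
                        break

    def apply_frequency():
        """Step (e): resolve the first still-ambiguous word by frequency."""
        for i, cands in enumerate(candidates):
            if len(cands) > 1:
                freqs = dictionary[words[i]]
                candidates[i] = {max(cands, key=lambda pos: freqs[pos])}
                return

    while True:
        apply_rules()                       # step (d)
        if all(len(c) == 1 for c in candidates):
            break                           # every word is determined
        apply_frequency()                   # step (e), then step (f): back to (d)
    return [next(iter(c)) for c in candidates]


# Hypothetical data: "light" has no left context, so it is resolved by
# frequency first; that result then lets a rule resolve "rains".
dictionary = {
    "light": {"ADJ": 0.5, "NOUN": 0.3, "VERB": 0.2},
    "rains": {"VERB": 0.6, "NOUN": 0.4},
}
rules = [("ADJ", "NOUN")]
tags = disambiguate(["light", "rains"], dictionary, rules)
```

The example shows why the method returns to step (d) after step (e): once "light" has been fixed by frequency, the rule table can decide "rains", which frequency alone would have resolved differently.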
Specification