Multiple-parts-of-speech disambiguating method and apparatus for machine translation system

US 4,661,924 A
Filed: 07/19/1985
Issued: 04/28/1987
Est. Priority Date: 07/31/1984
Status: Expired due to Term

First Claim

1. A part-of-speech disambiguating apparatus for a machine translation system, comprising:

input means for inputting a sentence constituted by words to be disambiguated with respect to the parts of speech which the words function as, respectively;

a dictionary memory for storing a number of words belonging to a language used in said sentence together with respective parts of speech and appearance frequencies thereof;

a rule table memory for storing a table containing parts-of-speech disambiguating rules for specifying the parts of speech of the words each of which can function as multiple parts of speech from an array of parts of speech of successive words in which said words whose parts of speech are to be specified are included; and

a processor connected to said input means, said dictionary memory and said rule table memory, wherein said processor receives said sentence from said input means and determines the parts of speech of the words which are contained in said sentence and capable of functioning as multiple parts of speech, on the basis of data read out from said rule table memory, while determining the parts of speech on the basis of said appearance frequencies read out from said dictionary memory for those words whose parts of speech could not be determined on the basis of the data read out from said rule table memory.

Abstract

A machine translation system comprises input means for inputting a sentence written in a natural language, a processor for parsing the input sentence, a word dictionary memory referred to by the processor, and a memory for storing multiple-parts-of-speech disambiguating rules in the form of a table. The part of speech that each word capable of functioning as multiple parts of speech should be in the input sentence is determined, in consideration of an array of the parts of speech, by applying the multiple-parts-of-speech disambiguating rules. Additionally, the rate of appearance of each part of speech which a word of the input sentence can function as is calculated in advance, and a part of speech which cannot be determined by consulting the disambiguating rule table is determined depending on whether the rate of appearance exceeds a predetermined threshold value.
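
The threshold test described in the abstract can be illustrated with a small sketch. The abstract does not give a formula, so this assumes the rate of appearance of a part of speech is its stored frequency divided by the total frequency over all parts of speech the word can take. The function names, the sample counts, and the 0.8 threshold are hypothetical.

```python
from typing import Dict, Optional


def appearance_rates(frequencies: Dict[str, int]) -> Dict[str, float]:
    """Convert per-part-of-speech counts into rates.

    Assumption: rate = count / total count for the word. The source only
    says that a rate is calculated from the appearance frequencies.
    """
    total = sum(frequencies.values())
    if total == 0:
        return {pos: 0.0 for pos in frequencies}
    return {pos: count / total for pos, count in frequencies.items()}


def pick_by_threshold(frequencies: Dict[str, int],
                      threshold: float) -> Optional[str]:
    """Return the part of speech whose rate exceeds the threshold, else None."""
    rates = appearance_rates(frequencies)
    if not rates:
        return None
    best = max(rates, key=rates.get)
    return best if rates[best] > threshold else None


# Hypothetical dictionary entries; the counts are invented for illustration.
print(pick_by_threshold({"noun": 90, "verb": 10}, threshold=0.8))  # noun
print(pick_by_threshold({"noun": 55, "verb": 45}, threshold=0.8))  # None
```

In the second call neither candidate clears the threshold, so the word stays undetermined; the claims describe what happens next (changing the threshold or returning to the rule table).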

15 Claims

1. A part-of-speech disambiguating apparatus for a machine translation system, comprising:
- input means for inputting a sentence constituted by words to be disambiguated with respect to the parts of speech which the words function as, respectively;
  
  a dictionary memory for storing a number of words belonging to a language used in said sentence together with respective parts of speech and appearance frequencies thereof;
  
  a rule table memory for storing a table containing parts-of-speech disambiguating rules for specifying the parts of speech of the words each of which can function as multiple parts of speech from an array of parts of speech of successive words in which said words whose parts of speech are to be specified are included; and
  
  a processor connected to said input means, said dictionary memory and said rule table memory, wherein said processor receives said sentence from said input means and determines the parts of speech of the words which are contained in said sentence and capable of functioning as multiple parts of speech, on the basis of data read out from said rule table memory, while determining the parts of speech on the basis of said appearance frequencies read out from said dictionary memory for those words whose parts of speech could not be determined on the basis of the data read out from said rule table memory.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. A part-of-speech disambiguating apparatus according to claim 1, wherein for the undetermined word for which the part of speech can not be determined on the basis of said appearance frequency, said processor calculates rates of appearance of the parts of speech which said undetermined word can function as, compares the calculated values with a preset threshold value, and determines the part of speech corresponding to the calculated value as the part of speech for said undetermined word.
  - 3. A part-of-speech disambiguating apparatus according to claim 2, wherein said processor again reads data from said rule table memory for determining on the basis of said data the part of speech of other word of said sentence which can function as multiple parts of speech and whose part of speech in said sentence is not yet determined, in succession to the determination of the part of speech of the word through comparison of said calculated value with said threshold value.
  - 4. A part-of-speech disambiguating apparatus according to claim 2, wherein unless the part of speech of the word in said sentence which word can function as multiple parts of speech is determined through comparison of said calculated value with said threshold value, said processor changes said threshold value to perform again comparison of said calculated value with the changed threshold value to determine the part of speech for the word which can function as multiple parts of speech and whose part of speech in said sentence is not yet determined.
  - 5. A part-of-speech disambiguating apparatus according to claim 4, wherein in case said processor could determine the part of speech of said word through comparison of said calculated value with said changed threshold value, said processor reads out again data from said rule table memory for determining on the basis of said data the part of speech for other word which can function as multiple parts of speech and whose part of speech in said sentence is not yet determined.
  - 6. A part-of-speech disambiguating apparatus according to claim 1, wherein said machine translation system further includes a main memory, said processor reading out from said dictionary memory the words constituting said sentence inputted through said input means as well as parts of speech which said words can function as and frequencies of appearance of said parts of speech, whereby an internal processing table is prepared in said main memory of said machine translation system, said table containing said words, said parts of speech and said frequency of appearance read out from said dictionary memory.
  - 7. A part-of-speech disambiguating apparatus according to claim 1, wherein said rule table memory includes an array comprising rowwise a first column containing a combination of plural parts of speech and at least one second column located adjacent rowwise to said first column and containing a single part of speech, said memory further including a column designating a part of speech belonging to said combination, said part of speech being determined on the basis of said array.
  - 8. A part-of-speech disambiguating apparatus according to claim 1, wherein said dictionary memory further stores attributes of each of said words.
  - 9. A part-of-speech disambiguating apparatus according to claim 1, said machine translation system further including display means, wherein said processor displays on said display means the determined parts of speech which the words function as in said sentence, so that operator can alter the part of speech determined by said processor with the aid of said input means.
  - 10. A part-of-speech disambiguating apparatus according to claim 9, wherein when the operator altered a part-of-speech determined by said processor, said processor alters the corresponding frequencies of appearance stored in said dictionary memory.
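
The rule table of claim 7 can be pictured as rows that pair an ambiguous combination of parts of speech with the single part of speech of an adjacent word and name the part of speech to select. The following is a minimal sketch under that reading. The `Rule` fields, the choice of one preceding and one following column, the treatment of an empty column as "matches anything", and the sample rule are all assumptions, not the patent's actual table layout.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Sequence, Set


@dataclass(frozen=True)
class Rule:
    candidates: FrozenSet[str]  # first column: the ambiguous combination
    prev_pos: Optional[str]     # adjacent column: preceding word's POS (None = any)
    next_pos: Optional[str]     # adjacent column: following word's POS (None = any)
    result: str                 # designating column: POS chosen from candidates


def single(pos_set: Set[str]) -> Optional[str]:
    """Return the POS if the word is already unambiguous, else None."""
    return next(iter(pos_set)) if len(pos_set) == 1 else None


def apply_rules(sentence: Sequence[Set[str]],
                rules: Sequence[Rule]) -> List[Set[str]]:
    """Make one left-to-right pass and return new candidate sets."""
    out = [set(s) for s in sentence]
    for i, cands in enumerate(out):
        if len(cands) < 2:
            continue  # nothing to disambiguate for this word
        prev_pos = single(out[i - 1]) if i > 0 else None
        next_pos = single(out[i + 1]) if i + 1 < len(out) else None
        for rule in rules:
            if rule.candidates != cands:
                continue
            if rule.prev_pos is not None and rule.prev_pos != prev_pos:
                continue
            if rule.next_pos is not None and rule.next_pos != next_pos:
                continue
            out[i] = {rule.result}  # first matching row wins
            break
    return out


# Hypothetical rule: a noun/verb word right after an article is a noun.
rules = [Rule(frozenset({"noun", "verb"}), "article", None, "noun")]
sentence = [{"article"}, {"noun", "verb"}]  # e.g. "the book"
print(apply_rules(sentence, rules))  # [{'article'}, {'noun'}]
```

Words that no row matches keep their full candidate set, which is the case claim 1 hands over to the appearance frequencies in the dictionary memory.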

11. A method of disambiguating a part of speech which a word capable of functioning as multiple parts of speech should be, in a machine translation system comprising steps of:
- (a) preparing a dictionary memory for storing a number of words of a natural language together with parts of speech which said words can function as and frequencies of appearance of said parts of speech;
  
  (b) preparing a rule table memory for storing a table containing part-of-speech disambiguating rules for disambiguating a part of speech of a word which is included in an array of parts of speech of successive words and which can function as multiple parts of speech;
  
  (c) inputting a sentence written in said natural language and to be disambiguated in respect to the parts of speech which the words of said sentence should function as;
  
  (d) determining parts of speech which the words each capable of functioning as multiple parts of speech should be in said sentence, on the basis of data read out from said rule table memory;
  
  (e) determining parts of speech which the words capable of functioning as multiple parts of speech should be in said sentence, on the basis of said appearance frequency of the parts of speech which said words can function as and which is read from said dictionary memory, when said parts of speech could not be determined at said step (d);
  
  (f) regaining said step (d) to execute repetitively said steps (d), (e) and (f) until the parts of speech which all the words should be in said sentence have been determined, in case the part of speech which the word capable of functioning as multiple parts of speech should be in said sentence has been determined at said step (e).
- Dependent claims (12, 13, 14, 15)
  - 12. A part-of-speech disambiguating method according to claim 11, further comprising a step of calculating rate of appearance of each of multiple parts of speech which the word can function as in said sentence and which is not yet determined, on the basis of said appearance frequency, wherein at said step (e), said rate of appearance thus calculated is compared with a predetermined threshold value to thereby determine the part of speech which the word capable of functioning as multiple parts of speech should be in said sentence and which has not yet been determined.
  - 13. A part-of-speech disambiguating method according to claim 12, further comprising a step of changing said threshold value, when the part of speech of the word capable of functioning as multiple parts of speech could not be determined at said step (e).
  - 14. A part-of-speech disambiguating method according to claim 11, further comprising a step of displaying the parts of speech for each of the words contained in said sentence on display means in succession to said step (f), and a step of altering, if necessary, the determined parts of speech through said input means while observing the display on said display means.
  - 15. A part-of-speech disambiguating method according to claim 12, further comprising a step of altering the appearance frequency contained in said dictionary memory in correspondence to the part of speech altered at said part-of-speech altering step.
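
The loop formed by steps (d), (e) and (f), together with the threshold change of claim 13, can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation: `apply_rules` is a hypothetical stand-in for the rule table of step (b), the rate is assumed to be frequency divided by total frequency, the threshold is assumed to be changed by lowering it in fixed steps, and only one word is settled per pass of step (e) before returning to step (d).

```python
from typing import Callable, Dict, List

Word = Dict[str, int]  # candidate part of speech -> appearance frequency


def resolve(words: List[Word],
            apply_rules: Callable[[List[Word]], None],
            threshold: float = 0.9,
            step: float = 0.1) -> List[Word]:
    """Repeat steps (d)-(f) until every word has exactly one part of speech."""
    words = [dict(w) for w in words]  # work on a copy
    while any(len(w) > 1 for w in words):
        # Step (d): let the rule table prune candidates in place.
        before = sum(len(w) for w in words)
        apply_rules(words)
        if sum(len(w) for w in words) < before:
            continue  # rules made progress; try them again first
        # Step (e): settle the first word whose best rate exceeds the threshold.
        for w in words:
            if len(w) < 2:
                continue
            total = sum(w.values())
            best = max(w, key=w.get)
            rate = w[best] / total if total else 0.0
            if rate > threshold:
                kept = w[best]
                w.clear()
                w[best] = kept
                break  # step (f): go back to step (d)
        else:
            # Claim 13 (assumed direction): relax the threshold and retry.
            threshold -= step
    return words


def toy_rules(words: List[Word]) -> None:
    """Hypothetical rule: a word right after an unambiguous article is not a verb."""
    for prev, cur in zip(words, words[1:]):
        if list(prev) == ["article"] and len(cur) > 1:
            cur.pop("verb", None)


# Invented frequencies for a three-word sentence.
sentence = [{"article": 100}, {"noun": 60, "verb": 40}, {"verb": 70, "noun": 30}]
print([list(w) for w in resolve(sentence, toy_rules)])
```

Here the second word is settled by the rule, while the third is settled by frequency only after the threshold has been lowered below its 0.7 rate. Once the threshold drops below zero every remaining word is settled by its most frequent part of speech, so the loop always terminates.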

Current Assignee: Hitachi, Ltd.
Original Assignee: Hitachi, Ltd.
Inventors: Okamoto, Eri; Okajima, Atsushi
Primary Examiner(s): Zache, Raulfe B.
Application Number: US06/756,670
Time in Patent Office: 648 Days
Field of Search: 364/200 MS File, 364/900 MS File, 364/419, 434/156
US Class Current: 704/8
CPC Class Codes:
G06F 40/205   Parsing
G06F 40/211   Syntactic parsing, e.g. bas...
G06F 40/55   Rule-based translation
