Information extraction processor
First Claim
1. An information extraction processor comprising:
a text input unit for receiving an object text described in a natural language;
a syntactic dictionary for storing morphemes with syntactic attributes;
a keyword dictionary for storing keywords of information to be extracted, and a role of each keyword to be performed by said keyword at an output stage of said information;
a morphological analyzer connected to said text input unit for dividing said text input into morphemes composing said text, and is connected to said syntactic dictionary and to said keyword dictionary for assigning contents of said syntactic dictionary and said keyword dictionary to each morpheme;
a syntax rule file for storing rules for analyzing sentence structure by using said syntactic attributes stored in said syntactic dictionary and said information stored in said keyword dictionary;
a keywords interrelation rule file for storing rules for generating a semantic structure of indicating relations between keywords through controlling the syntax rule by keyword information assigned to keywords;
an information extraction unit connected to said morphological analyzer, to said syntax rules file, and to said keywords interrelationship rule file for analyzing a sequence of morphemes received from said morphological analyzer, with syntax rules stored in said syntax rule file and with keyword interrelation rules stored in said keywords interrelation rule file, to generate a semantic structure indicating relations between keywords; and
an output unit connected to said information extraction unit for converting said semantic structure indicating relations between keywords to displayed image patterns.
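The data flow recited in this claim can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the dictionary entries, the single combined rule, the whitespace tokenizer standing in for a real morphological analyzer, and the text rendering standing in for "displayed image patterns". None of it is taken from the patent specification.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical dictionary contents; the claim does not specify them.
SYNTACTIC_DICT = {"acme": "noun", "acquired": "verb",
                  "widgetco": "noun", "yesterday": "adverb"}
KEYWORD_DICT = {"acme": "company", "acquired": "acquisition",
                "widgetco": "company"}

@dataclass
class Morpheme:
    surface: str
    pos: str              # syntactic attribute from the syntactic dictionary
    role: Optional[str]   # keyword role from the keyword dictionary, if any

def analyze(text):
    """Stand-in morphological analyzer: whitespace split plus dictionary lookup."""
    tokens = text.lower().replace(".", " ").split()
    return [Morpheme(t, SYNTACTIC_DICT.get(t, "unknown"), KEYWORD_DICT.get(t))
            for t in tokens]

# One hypothetical rule combining a syntax pattern (parts of speech) with a
# keyword interrelation pattern (roles) and the slot names to emit.
RULES = [
    {"pos": ("noun", "verb", "noun"),
     "roles": ("company", "acquisition", "company"),
     "slots": ("buyer", "event", "target")},
]

def extract(morphemes):
    """Build semantic structures from keyword morphemes that satisfy a rule."""
    keywords = [m for m in morphemes if m.role is not None]
    structures = []
    for rule in RULES:
        n = len(rule["roles"])
        for i in range(len(keywords) - n + 1):
            window = keywords[i:i + n]
            if (tuple(m.pos for m in window) == rule["pos"]
                    and tuple(m.role for m in window) == rule["roles"]):
                structures.append(
                    {slot: m.surface for slot, m in zip(rule["slots"], window)})
    return structures

def render(structures):
    """Stand-in output unit: one text line per semantic structure."""
    return [f"{s['buyer']} --{s['event']}--> {s['target']}" for s in structures]

if __name__ == "__main__":
    for line in render(extract(analyze("Acme acquired WidgetCo yesterday."))):
        print(line)
```

In this sketch the keyword roles filter the morpheme sequence before the syntax pattern is checked, which is one possible reading of "controlling the syntax rule by keyword information"; other designs are equally plausible.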
Abstract
In a processor for extracting information on a specified field from a text described in a natural language, keywords and structural analysis are used jointly to improve extraction performance. When a set of keywords is split across more than one sentence, the set is assembled using context defining words in the sentences. A multi-language summary generator uses this type of processor.
56 Citations
9 Claims
3. An information extraction processor comprising:
a text input unit for receiving an object text described in a natural language;
a text divide unit connected to said text input unit for dividing said text received from said text input unit into text segments;
a context defining word file for storing information on words in a specific field of interest for defining context;
an information extraction unit connected to said text divide unit and to said context defining word file for extracting text segment information from said text segments received from said text divide unit, and for detecting context defining words in said text segments with reference to said information on words defining context stored in said context defining word file;
a context defining words relation rule file for storing relation rules between context defining words;
an information synthesizing unit connected to said text information extraction unit and to said context defining words relation rule file for synthesizing information on said input text, from said text segment information extracted by said information extraction unit and appended by information on detected context defining words, in accordance with relation rules between context defining words stored in said context defining words relation rule file; and
an output unit connected to said information synthesizing unit for producing displayed image patterns of said information synthesized at said information synthesizing unit.
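As a rough illustration of the segment-and-synthesize flow in this claim, the sketch below splits a text into sentences, extracts keyword information and context defining words from each, and merges segments whose context words are related. The word lists, the relation rules (a mapping from a context word to a canonical one), and all function names are hypothetical and chosen only for the example.

```python
import re
from collections import defaultdict

# Hypothetical file contents; the claim does not specify them.
CONTEXT_WORDS = {"merger", "deal", "transaction"}                # context defining word file
RELATION_RULES = {"deal": "merger", "transaction": "merger"}     # relation rule file
KEYWORD_ROLES = {"acme": "company", "widgetco": "company", "march": "date"}

def divide(text):
    """Text divide unit: split on sentence-final punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def extract_segment(segment):
    """Return (keyword info, context defining words) for one segment."""
    tokens = re.findall(r"[a-z]+", segment.lower())
    info = [(KEYWORD_ROLES[t], t) for t in tokens if t in KEYWORD_ROLES]
    context = [t for t in tokens if t in CONTEXT_WORDS]
    return info, context

def synthesize(text):
    """Merge segment information that shares a related context defining word."""
    merged = defaultdict(lambda: defaultdict(list))
    for segment in divide(text):
        info, context = extract_segment(segment)
        for word in context:
            canonical = RELATION_RULES.get(word, word)
            for role, value in info:
                if value not in merged[canonical][role]:
                    merged[canonical][role].append(value)
    return {ctx: dict(roles) for ctx, roles in merged.items()}

if __name__ == "__main__":
    sample = ("Acme announced a merger with WidgetCo. "
              "The deal closes in March. The weather was mild.")
    print(synthesize(sample))
```

Here "deal" in the second sentence is tied back to "merger" in the first, so the date found in the second segment joins the companies found in the first. Segments with no context defining word contribute nothing, which is a simplification made for this sketch.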
9. An information extraction processor for generating a multi-language summary comprising:
a text input unit for receiving an object text described in a natural language;
a text information extraction unit connected to said text input unit for extracting specified information as a semantic structure in accordance with keywords and a sentence structure;
an interlingual expression generation rule file for storing rules for converting said semantic structure extracted at said text information extraction unit, into an interlingual expression expressing a sentence, or, when necessary, into plural interlingual expressions expressing sentences;
an interlingual expression generator connected to said text information extraction unit and to said interlingual expression generation rule file for converting said semantic structure extracted at said text information extraction unit, into an interlingual expression expressing a sentence, or when necessary, into plural interlingual expressions expressing sentences, referring to rules stored in said interlingual expression rule file;
a dictionary of a target language for storing correspondence between an interlingual expression and a word of target language;
a target language sentence generation rule file for storing rules for composing a sentence of said target language from said interlingual expression expressing a sentence; and
a sentence generator connected to said interlingual expression generator, to said dictionary of a target language, and to said target language sentence generation rule file for composing a sentence in a natural language from said interlingual expression in accordance with information stored in said dictionary of target language and with rules stored in said target language sentence generation rule file.
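The generation half of this claim (semantic structure, then interlingual expression, then target-language sentence) might be sketched as follows. The interlingua format, the concept name BUY, the per-language dictionaries, and the template-based sentence rules are all assumptions for illustration; the claim does not prescribe any of them.

```python
# Hypothetical interlingual expression generation rules:
# event type -> (language-neutral concept, agent slot, object slot).
INTERLINGUA_RULES = {"acquisition": ("BUY", "buyer", "target")}

# Hypothetical target-language dictionaries: concept -> word.
TARGET_DICTS = {
    "en": {"BUY": "bought"},
    "de": {"BUY": "gekauft"},
}

# Hypothetical sentence generation rules, expressed as templates so that
# each language can impose its own word order.
SENTENCE_RULES = {
    "en": "{agent} {pred} {obj}.",
    "de": "{agent} hat {obj} {pred}.",
}

def to_interlingua(structure):
    """Convert one semantic structure into a list of interlingual expressions."""
    concept, agent_slot, object_slot = INTERLINGUA_RULES[structure["event"]]
    return [{"pred": concept,
             "agent": structure[agent_slot],
             "obj": structure[object_slot]}]

def generate(expressions, lang):
    """Compose one target-language sentence per interlingual expression."""
    lexicon = TARGET_DICTS[lang]
    template = SENTENCE_RULES[lang]
    return [template.format(pred=lexicon[e["pred"]], agent=e["agent"], obj=e["obj"])
            for e in expressions]

def summarize(structure, langs):
    """Produce a summary in each requested language from a single structure."""
    expressions = to_interlingua(structure)
    return {lang: " ".join(generate(expressions, lang)) for lang in langs}

if __name__ == "__main__":
    structure = {"event": "acquisition", "buyer": "Acme", "target": "WidgetCo"}
    print(summarize(structure, ["en", "de"]))
```

Because the interlingual expression is built once and reused for every language, adding a language in this sketch only requires a new dictionary entry and a new template, which is the usual motivation for an interlingua design.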
Specification