Text structure analyzing apparatus, abstracting apparatus, and program recording medium
Abstract
A text input section (1) divides an inputted text into sentences, attaches a number to each sentence, and stores each sentence in a text database together with its number. An important word recognizing section (2) generates a list of important words for each sentence and stores it in a storing section (8). An important word weighting section (3) weights each important word. A relation degree computing section (4) computes a relation degree between an attention sentence and a precedent sentence. An importance degree computing section (5) computes an importance degree of each attention sentence. A tree structure determining section (6) determines a parent sentence of the attention sentence and thereby determines a tree structure of the inputted text. With this construction, the parent sentence of each sentence is determined based on the degree of connection between two sentences rather than on whether keyword character strings merely coincide, so the structure of the inputted text can be analyzed with high accuracy.
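The parent-selection step described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this section: the importance degree of an attention sentence is taken as the product of a precedent sentence's importance degree and the relation degree, the "optimum value" is taken to be the maximum, and the head sentence is given an importance degree of 1.0. The names `build_tree` and `word_overlap` are hypothetical, and `word_overlap` is only a stand-in for the relation degree computing section.

```python
from typing import Callable, List, Optional, Tuple


def build_tree(
    sentences: List[str],
    relation: Callable[[str, str], float],
    head_importance: float = 1.0,  # assumed importance degree of the head sentence
) -> Tuple[List[Optional[int]], List[float]]:
    """Return (parent index per sentence, importance degree per sentence)."""
    parents: List[Optional[int]] = [None] * len(sentences)
    importance: List[float] = [0.0] * len(sentences)
    if not sentences:
        return parents, importance
    importance[0] = head_importance  # the head sentence is the root of the tree
    for j in range(1, len(sentences)):
        best_parent, best_score = 0, -1.0
        for i in range(j):  # every precedent sentence of attention sentence j
            # Assumption: importance = precedent importance x relation degree.
            score = importance[i] * relation(sentences[j], sentences[i])
            if score > best_score:
                best_parent, best_score = i, score
        parents[j] = best_parent
        importance[j] = best_score
    return parents, importance


def word_overlap(attention: str, precedent: str) -> float:
    """Stand-in relation degree: shared words / words in the attention sentence."""
    a, p = set(attention.lower().split()), set(precedent.lower().split())
    return len(a & p) / len(a) if a else 0.0


parents, importance = build_tree(
    ["cats chase mice", "mice eat cheese", "cheese is yellow"], word_overlap
)
print(parents)  # parent index of each sentence; None marks the root
```

In this sketch each sentence attaches to the precedent sentence that gives it the highest importance degree, so a sentence can attach to a sentence other than the one immediately before it.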
14 Claims
1. A text structure analyzing apparatus analyzing a connection between respective elements constituting a text and based on an analyzed result, indicating a structure of the text by means of a tree structure which represents the respective elements as nodes, comprising:
an element appearance position storing section dividing an inputted text into the elements and storing an appearance position relationship among the elements on the inputted text;
a relation degree computing section determining a precedent element of an attention element with reference to the appearance position relationship and computing a relation degree representing strength of a connection between the attention element and each precedent element;
an importance degree computing section computing an importance degree of the attention element, based on a relation degree between the attention element and each precedent element and an importance degree of a head element of the inputted text;
a structure determining section determining a tree structure of the inputted text by determining the precedent element having an optimum value as an importance degree of the attention element as a parent element of the attention element; and
an output section outputting the determined tree structure of the inputted text.
an important word recognizing section recognizing important words from words constituting the respective elements;
and an important word weighting section weighting each of the recognized important words, wherein the relation degree computing section has an important word comparing part for comparing a character string of an original form of each of the important words in the attention element with a character string of an original form of each of the important words in the precedent element to compute a relation degree between the attention element and the precedent element, based on a total value of weights of all the important words common to the attention element and to the precedent element and a number of all the important words in the attention element or a number of all the important words in the precedent element.
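As an illustration of the relation degree described in the preceding claim text, the following Python sketch divides the total weight of the important words common to both elements by the number of important words in one of the elements. The division itself is an assumption (the claim says only "based on"), the function name `relation_degree` and its parameters are hypothetical, and the words are assumed to be already reduced to their original (dictionary) forms.

```python
from typing import Dict, Set


def relation_degree(
    attention_words: Set[str],
    precedent_words: Set[str],
    weights: Dict[str, float],
    normalize_by_attention: bool = True,
) -> float:
    """Total weight of shared important words over an important-word count."""
    common = attention_words & precedent_words  # exact match on original forms
    total = sum(weights.get(word, 1.0) for word in common)  # default weight 1.0 is assumed
    denominator = len(attention_words) if normalize_by_attention else len(precedent_words)
    return total / denominator if denominator else 0.0


attention = {"tree", "sentence", "parent"}
precedent = {"tree", "text"}
print(relation_degree(attention, precedent, {"tree": 2.0}))
```

The `normalize_by_attention` flag mirrors the alternative in the claim, which allows either the attention element's or the precedent element's important-word count to be used.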
4. A text structure analyzing apparatus according to claim 3, comprising:
an important word information storing section in which parts of speech to be recognized as the important words are stored, wherein the important word recognizing section has a part of speech recognizing section for recognizing parts of speech in the respective elements; and
a part of speech comparing section for comparing the recognized parts of speech and parts of speech to be recognized as the important words with each other to recognize words corresponding to parts of speech to be recognized as the important words from among words in the respective elements.
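The part-of-speech comparison in claim 4 can be sketched as a simple filter. The example assumes the words arrive already tagged as `(word, part_of_speech)` pairs; the tag names and the contents of `IMPORTANT_POS` are hypothetical placeholders for whatever the important word information storing section holds.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical stored list of parts of speech to be recognized as important words.
IMPORTANT_POS: Set[str] = {"NOUN", "VERB", "ADJ"}


def important_words(
    tagged: Iterable[Tuple[str, str]],
    important_pos: Set[str] = IMPORTANT_POS,
) -> List[str]:
    """Keep the words whose recognized part of speech is in the stored list."""
    return [word for word, pos in tagged if pos in important_pos]


tagged_sentence = [("the", "DET"), ("parser", "NOUN"), ("builds", "VERB"), ("a", "DET"), ("tree", "NOUN")]
print(important_words(tagged_sentence))
```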
5. A text structure analyzing apparatus according to claim 1, comprising:
an important word recognizing section recognizing important words from words constituting the elements;
a meaning recognizing section recognizing meaning of each of the recognized important words; and
a concept system storing section storing a concept system for recognizing rank relationship between meanings of two of the recognized important words, an analogous relationship therebetween, and a part-to-whole relationship therebetween;
wherein the relation degree computing section has a determining section which regards that with reference to the concept system, one of the recognized important words in the attention element and one of the recognized important words in the precedent element have a semantic connection when the two important words have the rank relationship among meanings thereof, the analogous relationship therebetween, and the part-to-whole relationship therebetween to compute a relation degree between the attention element and the precedent element, based on a total value of weights of all the important words, having the semantic connection, in the attention element and the precedent element and the number of all the important words in the attention element or the number of all the important words in the precedent element.
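A toy Python sketch of the semantic comparison in claim 5 follows. The `ConceptSystem` class and its three tables are hypothetical stand-ins for the concept system storing section, only direct (one-step) relations are checked, and the relation degree is assumed to be the total weight of the attention element's connected words divided by its important-word count.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set


@dataclass
class ConceptSystem:
    """Toy concept system; all three tables are illustrative assumptions."""
    broader: Dict[str, str] = field(default_factory=dict)      # rank: meaning -> broader meaning
    similar: Set[FrozenSet[str]] = field(default_factory=set)  # analogous pairs
    part_of: Dict[str, str] = field(default_factory=dict)      # part -> whole

    def connected(self, a: str, b: str) -> bool:
        """True if a and b are directly linked by any of the three relationships."""
        return (
            self.broader.get(a) == b or self.broader.get(b) == a
            or frozenset((a, b)) in self.similar
            or self.part_of.get(a) == b or self.part_of.get(b) == a
        )


def semantic_relation_degree(
    attention: List[str],
    precedent: List[str],
    system: ConceptSystem,
    weights: Dict[str, float],
) -> float:
    """Weight of attention words connected to any precedent word, per attention word."""
    linked = [a for a in attention if any(system.connected(a, p) for p in precedent)]
    total = sum(weights.get(a, 1.0) for a in linked)  # default weight 1.0 is assumed
    return total / len(attention) if attention else 0.0


system = ConceptSystem(
    broader={"dog": "animal"},
    similar={frozenset(("car", "automobile"))},
    part_of={"wheel": "car"},
)
print(semantic_relation_degree(["dog", "wheel", "sky"], ["animal", "car"], system, {}))
```

Unlike the character-string comparison, this sketch lets "dog" connect to "animal" and "wheel" to "car" even though the strings differ.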
6. An abstracting apparatus analyzing a connection between respective elements constituting a text and generating an abstract of the text by imparting an importance degree to the respective elements, based on an analyzed result and selecting the respective elements in the order from a higher importance degree to a lower importance degree, comprising:
an element appearance position storing section dividing an inputted text into the elements and storing an appearance position relationship among the elements on the inputted text;
a specific word list generating section generating a list of specific words by recognizing the specific words from among words constituting a specific element and attaching the generated specific word list to a front of a head element of the inputted text;
a relation degree computing section determining a precedent element of an attention element with reference to the appearance position relationship in which the specific word list is set as a head element and computing a relation degree representing strength of a connection between the attention element and each precedent element;
an importance degree computing section computing an importance degree of the attention element, based on a relation degree between the attention element and each precedent element and an importance degree of the specific word list;
an element selection section selecting a predetermined number of elements in a descending order from an element having a highest importance degree obtained by computation; and
an output section outputting the selected predetermined number of elements as an abstract of the inputted text.
wherein the specific word list generating section has a part of speech recognizing section for recognizing parts of speech of words constituting an element representing a title; and a part of speech comparing section for comparing the recognized part of speech and the parts of speech to be recognized as the specific words with each other to recognize as the specific word a word corresponding to the parts of speech to be recognized as the specific word from among the words constituting the element representing the title.
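The abstracting flow of claim 6 can be sketched in Python under several stated assumptions: the title's words serve as the specific word list placed in front of the head element, whitespace splitting stands in for part-of-speech recognition, importance is the maximum over precedent elements of precedent importance times relation degree, and the selected sentences are emitted in their original text order. The names `overlap` and `summarize` are hypothetical.

```python
from typing import List


def overlap(attention: List[str], precedent: List[str]) -> float:
    """Stand-in relation degree: shared words / words in the attention element."""
    a, p = set(attention), set(precedent)
    return len(a & p) / len(a) if a else 0.0


def summarize(title: str, sentences: List[str], n: int) -> List[str]:
    """Select the n sentences with the highest importance degree."""
    # Element 0 is the specific word list (here: the title's words).
    elements = [title.lower().split()] + [s.lower().split() for s in sentences]
    importance = [1.0] + [0.0] * len(sentences)  # assumed: word list importance is 1.0
    for j in range(1, len(elements)):
        importance[j] = max(
            importance[i] * overlap(elements[j], elements[i]) for i in range(j)
        )
    ranked = sorted(range(1, len(elements)), key=lambda j: importance[j], reverse=True)[:n]
    return [sentences[j - 1] for j in sorted(ranked)]  # restore text order (assumption)


print(summarize(
    "solar power",
    ["solar panels convert light", "the weather was nice", "solar power costs fall"],
    2,
))
```

Because the specific word list acts as the root, sentences that share words with the title, directly or through a chain of related sentences, score higher than unrelated ones.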
9. An abstracting apparatus analyzing a connection between respective elements constituting a text and generating an abstract of the text by imparting an importance degree to the respective elements, based on an analyzed result and selecting the respective elements in the order from a higher importance degree to a lower importance degree, comprising:
an element appearance position storing section dividing an inputted text into the elements and storing an appearance position relationship among the elements on the inputted text;
a fragment dividing section dividing the inputted text into larger fragments than the elements;
a specific word list generating section generating a list of specific words in each of the fragments by recognizing the specific words from among words constituting a specific element and attaching the generated specific word list to a front of a head element of the inputted text;
a relation degree computing section determining a precedent element of an attention element in each of the fragments with reference to the appearance position relationship in which the specific word list is set as a head element and computing a relation degree representing strength of a connection between the attention element and each precedent element;
an in-fragment importance degree computing section computing an importance degree of the attention element in each of the fragments, based on a relation degree between the attention element and each precedent element and an importance degree of the specific word list;
a fragment importance degree setting section setting an importance degree of each fragment;
an entire importance degree computing section computing an importance degree of the attention element in the entire inputted text, based on an importance degree of the attention element in each fragment and an importance degree of the fragment to which the attention element belongs;
an element selection section selecting a predetermined number of elements in a descending order from an element having a highest importance degree, in the entire inputted text, obtained by computation; and
an output section outputting the selected predetermined number of elements as an abstract of the inputted text.
a fragment importance degree storing section classifying and storing importance degrees to be imparted to the fragments according to an appearance position of each of the fragments in the inputted text, wherein the fragment importance degree setting section determines an appearance position of an attention fragment on the inputted text with reference to an appearance position relationship among the elements in which the specific word list is set as a head element and sets an importance degree of the attention fragment with reference to an appearance position of each of the fragments stored in the fragment importance degree storing section.
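The combination of in-fragment and fragment importance degrees in claim 9 and its dependent claims can be sketched as follows. Two things here are assumptions: the entire importance degree is taken as the product of the two values, and the position-based table in `fragment_importance` (first, last, and middle fragments) uses made-up numbers in place of whatever the fragment importance degree storing section holds.

```python
from typing import List, Tuple


def fragment_importance(index: int, count: int) -> float:
    """Hypothetical position-based table: first, last, then middle fragments."""
    if index == 0:
        return 1.0
    if index == count - 1:
        return 0.8
    return 0.5


def entire_importance(in_fragment: List[List[float]]) -> List[Tuple[int, int, float]]:
    """Return (fragment, element, entire importance) sorted from highest to lowest."""
    count = len(in_fragment)
    scored = []
    for f, scores in enumerate(in_fragment):
        weight = fragment_importance(f, count)
        for e, score in enumerate(scores):
            # Assumption: entire importance = in-fragment importance x fragment importance.
            scored.append((f, e, score * weight))
    return sorted(scored, key=lambda item: item[2], reverse=True)


# In-fragment importance degrees for three fragments.
for fragment, element, value in entire_importance([[0.5, 0.2], [0.9], [0.6]]):
    print(fragment, element, round(value, 2))
```

An element selection step would then take the first N entries of the sorted list.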
12. A program recording medium in which a text structure-analyzing program is recorded to function:
an element appearance position storing section dividing an inputted text into elements and storing an appearance position relationship among the elements on the inputted text;
a relation degree computing section computing a relation degree representing strength of connection between an attention element and each of precedent elements;
an importance degree computing section computing an importance degree of the attention element, based on a relation degree between the attention element and each precedent element and an importance degree of a head element of the inputted text;
a structure determining section determining a structure of the inputted text by setting the precedent element having an optimum value as the importance degree of the attention element as a parent element of the attention element; and
an output section outputting the determined tree structure of the inputted text.
13. A program recording medium in which a text structure-analyzing program is recorded to function:
an element appearance position storing section dividing an inputted text into elements and storing an appearance position relationship among the elements on the inputted text;
a specific word list generating section generating a list of specific words by recognizing the specific words from among words constituting a specific element and attaching the generated specific word list to a front of a head element of the inputted text;
a relation degree computing section computing a relation degree representing the strength of connection between an attention element and each precedent element;
an importance degree computing section computing an importance degree of the attention element, based on a relation degree between the attention element and each precedent element and an importance degree of the list of specific words;
an element selection section selecting a predetermined number of elements in a descending order from an element having a highest importance degree obtained by the computation; and
an output section outputting the selected predetermined number of elements as an abstract of the inputted text.
14. A program recording medium in which a text structure-analyzing program is recorded to function:
an element appearance position storing section dividing an inputted text into elements and storing an appearance position relationship among the elements on the inputted text;
a fragment dividing section dividing the inputted text into larger fragments than the elements;
a specific word list generating section generating a list of specific words by recognizing the specific words from among words constituting a specific element in each fragment and attaching the generated list of specific words to a front of a head element of each fragment;
a relation degree computing section computing a relation degree representing strength of connection between the attention element and each precedent element in each fragment;
an in-fragment importance degree computing section computing an importance degree of the attention element, based on a relation degree between the attention element and each precedent element in each of the fragments and an importance degree of the list of specific words;
a fragment importance degree setting section setting an importance degree of each of the fragments;
an entire importance degree computing section computing an importance degree of the attention element in the entire inputted text, based on an importance degree of the attention element in each fragment and an importance degree of the fragment to which the attention element belongs;
an element selection section selecting a predetermined number of elements in a descending order from an element having a highest importance degree, in the entire inputted text, obtained by computation; and
an output section outputting the selected predetermined number of elements as an abstract of the inputted text.
Specification