Document summarizing apparatus, document summarizing method and recording medium carrying a document summarizing program
Abstract
A document summarizing apparatus generates a comprehensive summary of a group of documents with relatively diverse contents. A phrase analyzing unit analyzes the structure of the documents specified to be processed and generates analytic trees describing the dependencies between words. An analytic tree scoring unit assigns scores to the analytic trees in accordance with their importance. An analytic tree score accumulating unit accumulates the scored trees and unifies trees expressing the same concept, increasing the scores of the unified analytic trees. A sentence synthesizing unit then selects the trees with higher scores from the set of analytic trees stored in the analytic tree score accumulating unit and synthesizes a summary from the selected trees. The invention thus imposes fewer restrictions on the documents to be processed and allows a comprehensive summary to be generated.
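The accumulate-and-select steps described in the abstract can be sketched as follows. This is only an illustrative sketch under stated assumptions: an analytic tree is modeled as a frozenset of (head, relation, dependent) triples, "same concept" is reduced to exact equality of trees, and the names `accumulate` and `select_top` are hypothetical. The source does not prescribe any of these choices.

```python
from collections import defaultdict

# Assumption: an analytic tree is a frozenset of (head, relation, dependent)
# triples, and two trees express the same concept only if they are equal.
def accumulate(scored_trees):
    """Unify identical trees and add up their scores."""
    totals = defaultdict(float)
    for tree, score in scored_trees:
        totals[tree] += score
    return dict(totals)

def select_top(totals, k):
    """Return the k unified trees with the highest accumulated scores."""
    return sorted(totals, key=totals.get, reverse=True)[:k]

tree_a = frozenset({("announce", "subj", "company"), ("announce", "obj", "product")})
tree_b = frozenset({("rise", "subj", "price")})

# tree_a occurs in two documents, so its accumulated score overtakes tree_b.
totals = accumulate([(tree_a, 1.0), (tree_b, 1.5), (tree_a, 1.0)])
top = select_top(totals, 1)
```

In this sketch a tree that recurs across documents outranks a tree that scored higher in a single document, which is the effect the accumulating unit is described as producing.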
8 Claims
1. A document summarizing apparatus for generating a summary of a set of documents, comprising:
a sentence analyzing unit that analyzes the syntax of sentences contained in the documents specified to be processed to generate an analysis graph describing the relational dependencies between words;
an analysis graph scoring unit that scores the analysis graph generated by said sentence analyzing unit based on importance;
an analysis graph score accumulating unit that stores the analysis graphs scored by said analysis graph scoring unit to combine the analysis graphs having the same concept to increase the scores given to the combined analysis graphs according to the combined contents; and
a sentence synthesizing unit that selects graphs with higher scores from a group of analysis graphs stored in said analysis graph score accumulating unit when the analysis graphs have been generated from specified documents to be processed and accumulated in said analysis graph score accumulating unit, in order to synthesize a summarizing sentence based on the selected analysis graphs.
2. The document summarizing apparatus according to claim 1, further comprising:
an analysis graph expanding unit that expands the analysis graph generated by said sentence analyzing unit to generate subgraphs thereof, wherein said analysis graph scoring unit scores in compliance with the importance level by treating the subgraphs generated by said analysis graph expanding unit as independent analysis graphs.
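The expansion into subgraphs might look like the sketch below. It assumes a graph is a frozenset of (head, relation, dependent) triples and treats every non-empty subset of those triples as a subgraph; the claim does not say which subgraphs are generated, so this enumeration and the name `expand` are assumptions.

```python
from itertools import combinations

def expand(graph):
    """Return every non-empty subset of the graph's dependency triples.

    Assumption: any subset of triples counts as a subgraph. Each result
    can then be scored as an independent analysis graph.
    """
    edges = sorted(graph)
    return [
        frozenset(subset)
        for size in range(1, len(edges) + 1)
        for subset in combinations(edges, size)
    ]

graph = frozenset({("announce", "subj", "company"), ("announce", "obj", "product")})
subgraphs = expand(graph)
```

For a graph with two dependency triples this yields three subgraphs: each triple alone and the full graph.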
3. The document summarizing apparatus according to claim 1, further comprising:
a word scoring unit that calculates the importance score for each elementary word contained in the documents specified to be processed, wherein said analysis graph scoring unit calculates the score of analysis graphs by using said importance score calculated in said word scoring unit for each elementary word of said analysis graphs.
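One possible reading of this claim is sketched below. The claim does not define how word importance is computed; relative frequency across the documents is used here purely as an assumption, and the graph score is taken to be the sum of the scores of the distinct words in the graph. The function names are hypothetical.

```python
from collections import Counter

def word_scores(documents):
    """Score each word by its relative frequency (an assumed importance measure)."""
    counts = Counter(word for doc in documents for word in doc.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def graph_score(graph, scores):
    """Sum the importance scores of the distinct words appearing in the graph."""
    words = {w for head, _rel, dep in graph for w in (head, dep)}
    return sum(scores.get(w, 0.0) for w in words)

docs = ["company announce product", "company raise price"]
scores = word_scores(docs)
graph = frozenset({("announce", "subj", "company"), ("announce", "obj", "product")})
score = graph_score(graph, scores)
```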
4. The document summarizing apparatus according to claim 1, further comprising:
a thesaurus that manages the containment of meaning between words; and
an analysis graph translating unit that uses said thesaurus to convert from the analysis graphs generated by said sentence analyzing unit to analysis graphs that the elementary words are translated into words having a concept semantically related, wherein said analysis graph scoring unit evaluates the analysis graphs yielded by said analysis graph translating unit at a lower score than the original analysis graphs thereof in response to the translated level.
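The thesaurus-based translation could be sketched as follows. The thesaurus contents, the discount factor, and the choice to measure the "translated level" as the number of replaced words are all assumptions made for illustration; the claim only requires that the translated graph score lower than the original.

```python
# Hypothetical thesaurus: each word maps to one broader word.
THESAURUS = {"sedan": "car", "spaniel": "dog"}
DISCOUNT = 0.5  # assumed score multiplier applied once per replaced word

def translate(graph, score):
    """Generalize words via the thesaurus; return (new_graph, lowered_score)."""
    words = {w for head, _rel, dep in graph for w in (head, dep)}
    level = sum(1 for w in words if w in THESAURUS)  # assumed translation level
    new_graph = frozenset(
        (THESAURUS.get(head, head), rel, THESAURUS.get(dep, dep))
        for head, rel, dep in graph
    )
    return new_graph, score * DISCOUNT ** level

original = frozenset({("buy", "obj", "sedan")})
translated, lowered = translate(original, 2.0)
```

Here one word is generalized, so the translated graph receives half the original score.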
5. The document summarizing apparatus according to claim 1, further comprising:
a relational table holding unit that holds a relational table defining translation rules for translating relational dependencies between words without altering the meaning of sentence, wherein said analysis graph score accumulating unit detects a pair of analysis graphs that ultimately results in the identical analysis graph when said analysis graphs are translated in compliance with the relational table held by said relational table holding unit so as to subtract the score of one analysis graph of the pair in response to the translation level and to add that score to the score of another analysis graph of the pair.
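A rough sketch of this score transfer is given below. The relation labels in the table, the transfer rate, and the use of the number of rewritten relations as the "translation level" are hypothetical; the claim specifies only that part of one graph's score moves to its equivalent graph according to that level.

```python
# Hypothetical relational table: relation label -> equivalent canonical label.
RELATION_TABLE = {"agent_of_passive": "subj", "subj_of_passive": "obj"}
TRANSFER_RATE = 0.8  # assumed fraction of score transferred per rewritten relation

def normalize(graph):
    """Rewrite relations via the table; return (canonical_graph, level)."""
    level = sum(1 for _head, rel, _dep in graph if rel in RELATION_TABLE)
    canon = frozenset(
        (head, RELATION_TABLE.get(rel, rel), dep) for head, rel, dep in graph
    )
    return canon, level

def merge_equivalent(totals):
    """Move score from each rewritable graph to its stored canonical twin."""
    merged = dict(totals)
    for graph in totals:
        canon, level = normalize(graph)
        if level > 0 and canon in merged:
            moved = merged[graph] * TRANSFER_RATE ** level
            merged[graph] -= moved
            merged[canon] += moved
    return merged

active = frozenset({("buy", "subj", "company"), ("buy", "obj", "factory")})
passive = frozenset({
    ("buy", "agent_of_passive", "company"),
    ("buy", "subj_of_passive", "factory"),
})
merged = merge_equivalent({active: 2.0, passive: 1.0})
```

With two rewritten relations, 0.8 squared (0.64) of the passive graph's score moves to the active graph in this sketch.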
6. The document summarizing apparatus according to claim 1, wherein
said sentence synthesizing unit includes patterns for synthesizing said summary, said patterns corresponding to said analysis graphs and its styles in order to select one pattern for synthesizing said summary when said analysis graph and said style are supplied to said unit.
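Pattern selection by graph and style might be sketched as below. The pattern table, the style names, and the way a graph is reduced to a sorted tuple of its relation labels are assumptions for illustration; the sketch also assumes every triple in the graph shares a single head word.

```python
# Hypothetical pattern table keyed by (relation signature, style).
PATTERNS = {
    (("obj", "subj"), "sentence"): "{subj} {head} {obj}.",
    (("obj", "subj"), "headline"): "{subj}: {head} {obj}",
}

def synthesize(graph, style):
    """Select the pattern matching the graph's relations and the style, then fill it."""
    signature = tuple(sorted(rel for _head, rel, _dep in graph))
    pattern = PATTERNS[(signature, style)]
    slots = {rel: dep for _head, rel, dep in graph}
    slots["head"] = next(iter(graph))[0]  # assumes all triples share one head
    return pattern.format(**slots)

graph = frozenset({("announces", "subj", "Acme"), ("announces", "obj", "a product")})
text = synthesize(graph, "sentence")
```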
7. The document summarizing method for generating a summary from a group of documents, comprising the steps of:
analyzing the syntax of sentences contained in the documents specified to be processed to generate an analysis graph describing the relational dependencies between words;
scoring the analysis graph generated by the sentence analyzing unit based on importance;
storing the scored analysis graphs to combine the analysis graphs having the same concept one with another to increase the scores given to the combined analysis graphs according to the combined contents;
synthesizing a summarizing sentence based on the selected analysis graphs by selecting graphs with higher scores from the group of stored analysis graphs when the analysis graphs have been generated and accumulated from specified documents to be processed.
8. A computer-readable recording medium carrying a document summarizing program for generating by a computer a summary from a set of documents comprising a document summarizing program for use with a computer, including:
a sentence analyzing unit that analyzes the syntax of sentences contained in the documents specified to be processed to generate an analysis graph describing the relational dependencies between words;
an analysis graph scoring unit that scores the analysis graph generated by said sentence analyzing unit based on importance;
an analysis graph score accumulating unit that stores the analysis graphs scored by said analysis graph scoring unit to combine the analysis graphs having the same concept to increase the scores given to the combined analysis graphs according to the combined contents; and
a sentence synthesizing unit that selects graphs with higher scores from the group of analysis graphs stored in said analysis graph score accumulating unit when the analysis graphs have been generated from specified documents to be processed and accumulated in said analysis graph score accumulating unit, in order to synthesize a summarizing sentence based on the selected analysis graphs.
Specification