Method, apparatus, and computer program product for generating a summary of a document based on common expressions appearing in the document
Abstract
A method of summarizing a document which comprises the steps of: extracting sentence-constituting-elements from the document; tabularizing the sentence-constituting-elements corresponding to categories and sentences in the document; extracting commonly-held-information which is common to the sentence-constituting-elements in the same category from the sentence-constituting-elements; looking up common expression information which is common to plural pieces of the commonly-held-information in a thesaurus in which the commonly-held-information and the common expression information are connected by a hierarchical tree; and composing a summary based on the commonly-held-information and the common expression information.
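The tabularizing and common-information steps can be pictured with a small sketch. The Python below is illustrative only and is not the patented implementation. It assumes the sentence-constituting-elements have already been extracted and labeled with categories (the extraction itself is not shown). The names `tabularize` and `commonly_held`, the sample sentences, and the "shared by at least two sentences" threshold are all assumptions made for the example.

```python
from collections import Counter

CATEGORIES = ("When", "Where", "Who", "What", "Why", "How", "Done")


def tabularize(sentences):
    """Build a table: category -> list of elements, one slot per sentence.

    `sentences` is a list of dicts mapping a category to the element found
    in that sentence; a category missing from a sentence becomes None.
    """
    return {cat: [s.get(cat) for s in sentences] for cat in CATEGORIES}


def commonly_held(table, min_sentences=2):
    """Return, per category, the elements shared by several sentences.

    The threshold `min_sentences` is an assumption of this sketch.
    """
    common = {}
    for cat, column in table.items():
        counts = Counter(e for e in column if e is not None)
        shared = [e for e, n in counts.items() if n >= min_sentences]
        if shared:
            common[cat] = shared
    return common


# Hypothetical, already-classified input (three sentences).
sentences = [
    {"Who": "Company A", "What": "a printer", "Done": "released"},
    {"Who": "Company A", "What": "a scanner", "Done": "released"},
    {"Who": "Company A", "Where": "Tokyo", "Done": "announced"},
]

table = tabularize(sentences)
print(commonly_held(table))  # {'Who': ['Company A'], 'Done': ['released']}
```

In this sample the "What" column has no repeated element, which is the situation where a thesaurus lookup for a shared, more generic expression becomes useful.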
18 Claims
1. A method of preparing a document summary which comprises the steps of:
extracting sentence-constituting-elements from the document;
classifying said sentence-constituting-elements into categories;
extracting commonly-held-information which is common to said sentence-constituting-elements in the same category;
looking up common expression information which is common to plural pieces of said commonly-held-information in a thesaurus in which said commonly-held-information and said common expression information are connected by a hierarchical tree; and
composing the document summary based on said commonly-held-information and said common expression information;
wherein said categories include “When”, “Where”, “Who”, “What”, “Why”, “How” and “Done”.
Dependent claims: 2, 3, 4, 5
extracting commonly-occurring-information which is common to several sentences in said document in one or more categories; and
linking said commonly-occurring-information to said commonly-held-information or said common expression information.
4. The method as set forth in claim 1, wherein priority is given to a part of said categories corresponding to a user's operation.
5. The method as set forth in claim 1, wherein each sentence of the document summary is composed by selectively combining pieces of said commonly-held-information and said common expression information.
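Claims 4 and 5 can be illustrated together with a short sketch. It is a sketch under stated assumptions, not the claimed implementation: it reads "priority" as moving the user-selected categories to the front of the output, and "selectively combining" as taking, for each category, the commonly-held piece if there is one and the common expression otherwise. The function name `compose_summary`, its parameters, and the sample data are hypothetical.

```python
DEFAULT_ORDER = ("When", "Where", "Who", "What", "Why", "How", "Done")


def compose_summary(commonly_held, common_expressions, priority=()):
    """Join per-category pieces into one telegraphic summary sentence.

    For each category, the commonly-held piece is preferred; the common
    expression is the fallback. Categories named in `priority` come first
    (an assumed reading of "priority" in claim 4).
    """
    order = list(priority) + [c for c in DEFAULT_ORDER if c not in priority]
    pieces = []
    for cat in order:
        piece = commonly_held.get(cat) or common_expressions.get(cat)
        if piece:
            pieces.append(piece)
    if not pieces:
        return ""
    return " ".join(pieces) + "."


# Hypothetical per-category inputs.
held = {"Who": "Company A", "Done": "released"}
expressions = {"What": "office equipment"}

print(compose_summary(held, expressions))
print(compose_summary(held, expressions, priority=("What",)))
```

The output is a plain concatenation of pieces in category order, so it is not guaranteed to be grammatical; a real composer would need language-specific ordering and function words.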
6. An apparatus for generating a document summary which comprises:
means for extracting sentence-constituting-elements from the document;
means for classifying said sentence-constituting-elements into categories;
means for extracting commonly-held-information which is common to said sentence-constituting-elements in the same category;
a thesaurus in which said commonly-held-information and common expression information are connected by a hierarchical tree;
means for looking up said common expression information which is common to plural pieces of said commonly-held-information in said thesaurus; and
means for composing the document summary based on said commonly-held-information and said common expression information;
wherein said categories include “When”, “Where”, “Who”, “What”, “Why”, “How” and “Done”.
Dependent claims: 7, 8, 9, 10
means for extracting commonly-occurring-information which is common to several sentences in said document in one or more categories; and
means for linking said commonly-occurring-information to said commonly-held-information or said common expression information.
11. A method for generating a summary sentence from a document, said method comprising the steps of:
extracting sentence-constituting-elements from each sentence in the document;
classifying said sentence-constituting-elements into categories for each sentence in the document;
extracting a common word which is common to the sentence-constituting-elements extracted from all of or a part of the sentences in the document for each category;
looking up a generic word which is common to the sentence-constituting-elements extracted from all of or a part of the sentences in the document from a thesaurus in which generic words and specific words are connected in a hierarchical tree for each category on behalf of the extracting step if possible; and
composing the summary sentence comprising the extracted common words and the looked-up generic words, wherein said categories include “When”, “Where”, “Who”, “What”, “Why”, “How” and “Done”.
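The generic-word lookup in claim 11 can be sketched as a search for the lowest common ancestor in a hierarchical tree. The Python below is a minimal illustration, not the patent's implementation. It assumes the thesaurus is stored as a child-to-parent mapping and that a generic word is used when no single common word exists; the `PARENT` table, its entries, and the function names are hypothetical.

```python
# Hypothetical thesaurus: each specific word points to its more generic parent.
PARENT = {
    "printer": "office equipment",
    "scanner": "office equipment",
    "copier": "office equipment",
    "office equipment": "product",
    "sedan": "car",
    "car": "product",
}


def ancestors(word):
    """Return [word, parent, grandparent, ...] up to the root of the tree."""
    chain = [word]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def generic_word(words):
    """Return the most specific word covering every input word, or None.

    This is the lowest common ancestor in the tree, where a word counts
    as its own ancestor.
    """
    words = list(words)
    if not words:
        return None
    shared = set(ancestors(words[0]))
    for w in words[1:]:
        shared &= set(ancestors(w))
    # Walk upward from the first word; the first shared node is the lowest.
    for candidate in ancestors(words[0]):
        if candidate in shared:
            return candidate
    return None


print(generic_word(["printer", "scanner"]))  # office equipment
print(generic_word(["printer", "sedan"]))    # product
print(generic_word(["printer", "pencil"]))   # None ("pencil" is not in the tree)
```

Returning the lowest shared node rather than the root keeps the summary as specific as the thesaurus allows.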
12. An apparatus for generating a summary sentence from a document, said apparatus comprising:
means for extracting sentence-constituting-elements from each sentence in the document;
means for classifying said sentence-constituting-elements into categories for each sentence in the document;
means for extracting a common word which is common to the sentence-constituting-elements extracted from all of or a part of the sentences in the document for each category;
means for looking up a generic word which is common to the sentence-constituting-elements extracted from all of or a part of the sentences in the document from a thesaurus in which generic words and specific words are connected in a hierarchical tree for each category on behalf of the extracting step if possible; and
means for composing the summary sentence comprising the extracted common words and the looked-up generic words;
wherein said categories include “When”, “Where”, “Who”, “What”, “Why”, “How” and “Done”.
13. A computer program product comprising a computer useable medium having a computer program logic stored therein, said computer program logic causing a computer to execute the steps of:
extracting sentence-constituting-elements from each sentence in the document;
classifying said sentence-constituting-elements into categories for each sentence in the document;
extracting a common word which is common to the sentence-constituting-elements extracted from all of or a part of the sentences in the document for each category;
looking up a generic word which is common to the sentence-constituting-elements extracted from all of or a part of the sentences in the document from a thesaurus in which generic words and specific words are connected in a hierarchical tree for each category on behalf of the extracting step if possible; and
composing the summary sentence comprising the extracted common words and the looked-up generic words;
wherein said categories include “When”, “Where”, “Who”, “What”, “Why”, “How” and “Done”.
14. A computer program product comprising a computer useable medium having structured data and computer program logic stored therein,
said structured data comprising:
a thesaurus in which commonly-held-information and common expression information are connected by a hierarchical tree; and
said computer program logic comprising:
means for extracting sentence-constituting-elements from the document;
means for classifying said sentence-constituting-elements into categories;
means for extracting commonly-held-information which is common to said sentence-constituting-elements in the same category;
means for looking up said common expression information which is common to plural pieces of said commonly-held-information in said thesaurus; and
means for composing the document summary based on said commonly-held-information and said common expression information;
wherein said categories include “When”, “Where”, “Who”, “What”, “Why”, “How” and “Done”.
Dependent claims: 15, 16, 17, 18
means for extracting commonly-occurring-information which is common to several sentences in said document in one or more categories; and
means for linking said commonly-occurring-information to said commonly-held-information or said common expression information.
17. The computer program product as set forth in claim 14, wherein said computer program logic further comprises means for giving priority to a part of said categories corresponding to a user's operation.
18. The computer program product as set forth in claim 14, wherein said means for composing produces each sentence of the document summary and includes means for selectively combining pieces of said commonly-held-information and said common expression information to produce each sentence.
Specification