Document summarizer for word processors
First Claim
1. In a document summarizing application executing on a processing device to construct a summary of a document, wherein the summary is in part derived from counting how frequently words appear in the document, a computer-implemented method comprising:
evaluating words in the document to identify ordered sets of words that appear repeatedly in a same order;
ranking individual sentences in the document by treating the ordered sets of words as if they were single words;
generating the summary based at least in part on the sentence rankings;
inserting the summary into a file comprising the document; and
saving the file to non-volatile data storage.
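The evaluating step of the claim can be illustrated with a minimal sketch, which is not the patented implementation. It assumes fixed-length word sequences, a repeat threshold of two, and an underscore as the joiner; the function names `find_repeated_phrases` and `compress_phrases` and the parameters `n` and `min_count` are hypothetical and do not appear in the claim.

```python
from collections import Counter


def find_repeated_phrases(tokens, n=2, min_count=2):
    """Return the n-word sequences that occur at least min_count times.

    Assumption: phrases have a fixed length n; the claim itself does not
    limit the length of an ordered set of words.
    """
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {gram for gram, count in grams.items() if count >= min_count}


def compress_phrases(tokens, phrases, n=2):
    """Replace each occurrence of a repeated phrase with one joined token,
    so that later frequency counting treats the phrase as a single word."""
    out = []
    i = 0
    while i < len(tokens):
        gram = tuple(tokens[i:i + n])
        if len(gram) == n and gram in phrases:
            out.append("_".join(gram))  # hypothetical joiner
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


tokens = "operating system loads the operating system kernel".split()
phrases = find_repeated_phrases(tokens)
compressed = compress_phrases(tokens, phrases)
```

Here `compressed` holds the token list with both occurrences of the repeated pair merged, which is the form a frequency counter would then consume. The sketch ignores sentence boundaries for brevity.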
1 Assignment
0 Petitions
Abstract
An author-oriented document summarizer for a word processor is described. The document summarizer performs a statistical analysis to generate a list of ranked sentences for consideration in the summary. The summarizer counts how frequently content words appear in a document and produces a table correlating the content words with their corresponding frequency counts. Phrase compression techniques are used to produce more accurate counts of repeatedly used phrases. A sentence score for each sentence is derived by summing the frequency counts of the content words in a sentence and dividing that tally by the number of the content words in the sentence. The sentences are then ranked in order of their sentence scores. Concurrently with the statistical analysis and during the same pass through the document, the summarizer performs a cue-phrase analysis to weed out sentences with words or phrases that have been pre-identified as potential problem phrases. The cue-phrase analysis compares sentence phrases with a pre-compiled list of words and phrases and sets conditions on whether the sentences containing them can be used in the summary. Following the cue-phrase analysis, the summarizer creates a summary containing the higher-ranked sentences. The summary may also include a conditioned sentence if the conditions established for inclusion of the sentence have been satisfied. The summarizer then inserts the summary at the beginning of the document before the start of the text.
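The sentence score described in the abstract (the sum of the frequency counts of a sentence's content words, divided by the number of those words) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the stop-word list, the cue-phrase list, and all function names are hypothetical; the cue-phrase analysis is reduced to simple exclusion rather than conditional inclusion; and it runs as a separate step rather than in the same pass.

```python
import re
from collections import Counter

# Assumed lists for illustration only; the abstract does not enumerate them.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that"}
CUE_PHRASES = ("for example", "as shown above")


def content_words(sentence):
    """Lowercase the sentence and drop the assumed stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def rank_sentences(sentences):
    """Return (score, index, sentence) tuples, highest score first."""
    freq = Counter(w for s in sentences for w in content_words(s))
    scored = []
    for idx, sentence in enumerate(sentences):
        words = content_words(sentence)
        if not words:
            continue  # a sentence with no content words cannot be scored
        score = sum(freq[w] for w in words) / len(words)
        scored.append((score, idx, sentence))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored


def has_cue_phrase(sentence):
    """Simplification: any listed cue phrase excludes the sentence."""
    lowered = sentence.lower()
    return any(phrase in lowered for phrase in CUE_PHRASES)


def summarize(sentences, k=2):
    """Keep the k best sentences without cue phrases, in document order."""
    kept = [t for t in rank_sentences(sentences) if not has_cue_phrase(t[2])]
    top = sorted(kept[:k], key=lambda t: t[1])
    return [sentence for _, _, sentence in top]


sentences = [
    "The summarizer counts content words.",
    "For example, the summarizer counts words.",
    "Cats sleep.",
]
summary = summarize(sentences, k=2)
```

Dividing by the number of content words keeps long sentences from outranking short ones merely by containing more words. Phrase compression, if used, would be applied to the word lists before the frequency table is built.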
73 Citations
8 Claims
1. In a document summarizing application executing on a processing device to construct a summary of a document, wherein the summary is in part derived from counting how frequently words appear in the document, a computer-implemented method comprising:
evaluating words in the document to identify ordered sets of words that appear repeatedly in a same order;
ranking individual sentences in the document by treating the ordered sets of words as if they were single words;
generating the summary based at least in part on the sentence rankings;
inserting the summary into a file comprising the document; and
saving the file to non-volatile data storage.
Dependent claims: 2, 3, 4, 5, 6

7. A computer-readable medium comprising computer-program instructions executable by a processor to construct a summary of a document, wherein the summary is in part derived from counting how frequently words appear in the document, the computer-program instructions comprising instructions for:
evaluating words in the document to identify ordered sets of words that appear repeatedly in a same order;
ranking individual sentences in the document by treating the ordered sets of words as if they were single words;
generating the summary based at least in part on the sentence rankings;
inserting the summary into a file comprising the document; and
saving the file to non-volatile data storage.

8. A computing device comprising:
a processor; and
a memory coupled to the processor, memory comprising computer-program instructions executable by the processor to construct a summary of a document, wherein the summary is in part derived from counting how frequently words appear in the document, the computer-program instructions comprising instructions for:
evaluating words in the document to identify ordered sets of words that appear repeatedly in a same order;
ranking individual sentences in the document by treating the ordered sets of words as if they were single words;
generating the summary based at least in part on the sentence rankings;
inserting the summary into a file comprising the document; and
saving the file to non-volatile data storage.

Specification