DOCUMENT SUMMARIZER FOR WORD PROCESSORS
Abstract
An author-oriented document summarizer for a word processor is described. The document summarizer performs a statistical analysis to generate a list of ranked sentences for consideration in the summary. The summarizer counts how frequently content words appear in a document and produces a table correlating the content words with their corresponding frequency counts. Phrase compression techniques are used to produce more accurate counts of repeatedly used phrases. A sentence score for each sentence is derived by summing the frequency counts of the content words in the sentence and dividing that tally by the number of content words in the sentence. The sentences are then ranked in order of their sentence scores. Concurrent with the statistical analysis, during the same pass through the document, the summarizer performs a cue-phrase analysis to weed out sentences with words or phrases that have been pre-identified as potential problem phrases. The cue-phrase analysis compares sentence phrases with a pre-compiled list of words and phrases and sets conditions on whether the sentences containing them can be used in the summary. Following the cue-phrase analysis, the summarizer creates a summary containing the higher ranked sentences. The summary may also include a conditioned sentence if the conditions established for inclusion of the sentence have been satisfied. The summarizer then inserts the summary at the beginning of the document before the start of the text.
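The scoring rule described in the abstract (sum the frequency counts of a sentence's content words, then divide by the number of content words in that sentence) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the `STOP_WORDS` set, the regular-expression tokenizer, and all function names are assumptions introduced here, since the abstract does not say how content words are identified.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not specify how
# content words are separated from non-content words.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "it", "are"}

def content_words(sentence):
    """Lowercase the sentence and keep only words not in STOP_WORDS."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]

def count_content_words(sentences):
    """Build the table correlating content words with frequency counts."""
    counts = Counter()
    for s in sentences:
        counts.update(content_words(s))
    return counts

def sentence_score(sentence, counts):
    """Sum of the content words' frequency counts divided by their number."""
    words = content_words(sentence)
    if not words:
        return 0.0
    return sum(counts[w] for w in words) / len(words)

def rank_sentences(sentences):
    """Return the sentences ordered from highest to lowest score."""
    counts = count_content_words(sentences)
    return sorted(sentences, key=lambda s: sentence_score(s, counts), reverse=True)
```

Dividing by the number of content words keeps long sentences from outranking short ones merely by containing more words.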
48 Claims
1. A computer-implemented method for summarizing documents, comprising the following steps:
counting how frequently content words appear in a document;
scoring individual sentences according to their respective content words, wherein sentences which contain more content words that appear more frequently in the document are ranked higher than sentences which contain fewer high-frequency content words and than sentences which contain content words that appear less frequently in the document;
performing a cue-phrase analysis by comparing words and phrases in the sentences with a pre-compiled list of words and phrases and setting conditions on use of sentences that contain any words or phrases on the list; and
creating a summary which contains higher ranked sentences and which may include conditioned sentences in an event that the conditions are satisfied.
Dependent claims: 2-17
18. A computer-implemented method for summarizing documents, comprising the following steps:
counting how frequently content words appear in a document to yield frequency counts for the content words;
correlating the content words with their corresponding frequency counts;
deriving a sentence score for individual sentences based upon the frequency counts of the content words;
ranking the sentences in order of sentence scores, wherein higher ranking sentences have comparatively higher sentence scores and lower ranking sentences have comparatively lower sentence scores;
performing a cue-phrase analysis to identify (1) sentences with certain types of phrases as prohibited sentences and (2) sentences with other types of phrases that rely on context of other sentences as conditioned sentences; and
creating a summary which contains the higher ranked sentences and which may include a conditioned sentence in an event that conditions are satisfied, but which excludes the prohibited sentences.
Dependent claims: 19-28
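The cue-phrase step of claim 18 can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the two cue-phrase tuples are hypothetical stand-ins for the pre-compiled list, and the condition used here (a conditioned sentence may be included only if the sentence immediately preceding it in the document has already been selected) is one plausible reading of "rely on context of other sentences", not a condition stated in the claim.

```python
# Hypothetical cue-phrase lists; the source says only that the list
# of words and phrases is pre-compiled.
PROHIBITED_CUES = ("as shown in figure", "see table")
CONDITIONED_CUES = ("however", "for this reason", "therefore")

def classify(sentence):
    """Return 'prohibited', 'conditioned', or 'free' for one sentence."""
    text = sentence.lower()
    if any(cue in text for cue in PROHIBITED_CUES):
        return "prohibited"
    if any(cue in text for cue in CONDITIONED_CUES):
        return "conditioned"
    return "free"

def build_summary(sentences, ranked_indices, max_sentences):
    """Pick sentences in rank order while honoring cue-phrase conditions.

    Assumed condition: a conditioned sentence is usable only if the
    sentence immediately before it in the document is already chosen.
    """
    chosen = set()
    for i in ranked_indices:
        if len(chosen) >= max_sentences:
            break
        kind = classify(sentences[i])
        if kind == "prohibited":
            continue  # never eligible for the summary
        if kind == "conditioned" and (i - 1) not in chosen:
            continue  # condition not satisfied at this point
        chosen.add(i)
    # Emit the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Because this sketch makes a single pass in rank order, a conditioned sentence that outranks its predecessor is skipped even if the predecessor is selected later; a fuller treatment would revisit skipped sentences.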
29. In a document summarizing application executing on a processing device to construct a summary of a document, a computer-implemented method comprising the step of inserting the summary at a beginning of the document.
35. In a document summarizing application executing on a processing device to construct a summary of a document, wherein the summary is in part derived from counting how frequently words appear in the document, a computer-implemented method comprising the following steps:
evaluating the words for ordered sets of words that appear repeatedly in a same order; and
counting the ordered sets of words as if they were single words.
Dependent claims: 36-40
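The two steps of claim 35 can be sketched in Python as shown here. This is a simplified illustration under stated assumptions: it handles only two-word ordered sets, whereas the claim covers ordered sets of any length, and the `min_repeats` threshold and function names are hypothetical.

```python
from collections import Counter

def compress_phrases(words, min_repeats=2):
    """Merge adjacent word pairs that recur in the same order into one token."""
    # Evaluate the words for ordered pairs that appear repeatedly.
    pair_counts = Counter(zip(words, words[1:]))
    repeated = {pair for pair, n in pair_counts.items() if n >= min_repeats}

    merged, i = [], 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in repeated:
            # Treat the repeated pair as if it were a single word.
            merged.append(words[i] + " " + words[i + 1])
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged

def count_with_phrases(words, min_repeats=2):
    """Frequency counts in which repeated pairs are counted as single words."""
    return Counter(compress_phrases(words, min_repeats))
```

With this approach a phrase such as "word processor" contributes one count per occurrence instead of inflating the separate counts for "word" and "processor".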
41. A computer-implemented method for summarizing documents, comprising the following steps:
(a) counting how frequently content words appear in a document to produce frequency counts for corresponding content words;
(b) scoring individual sentences according to the content words contained in the sentences;
(c) identifying a sentence with the highest score;
(d) adjusting the frequency counts of the content words that appear in the highest scoring sentence; and
(e) re-scoring the sentences.
Dependent claims: 42-48
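Steps (a) through (e) of claim 41 describe a select-adjust-rescore loop, which might look like the following Python sketch. The claim does not say how the frequency counts are adjusted, so the multiplicative `factor` is purely an assumption, as are the function names and the use of the average frequency count as the score.

```python
from collections import Counter

def score(words, counts):
    """Average frequency count of a sentence's content words."""
    return sum(counts[w] for w in words) / len(words) if words else 0.0

def select_iteratively(tokenized, k, factor=0.5):
    """Pick k sentence indices, re-scoring after each pick.

    `tokenized` is a list of content-word lists, one per sentence.
    `factor` is an assumed adjustment: counts of words in the chosen
    sentence are multiplied by it so later picks favor other words.
    """
    counts = Counter(w for words in tokenized for w in words)  # step (a)
    picked = []
    remaining = set(range(len(tokenized)))
    while remaining and len(picked) < k:
        # Steps (b) and (e): score every remaining sentence with the
        # current counts. Step (c): take the highest scoring one
        # (ties go to the earliest sentence).
        best = max(sorted(remaining), key=lambda i: score(tokenized[i], counts))
        picked.append(best)
        remaining.remove(best)
        for w in set(tokenized[best]):  # step (d): adjust the counts
            counts[w] *= factor
    return picked
```

Lowering the counts of words already covered by a chosen sentence discourages the next pick from repeating the same content, which is the apparent purpose of the re-scoring step.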
Specification