Method for automatically generating a summarized text by a computer
Abstract
The method enables sentence-based automatic summarization of a text by a computer. For this purpose, subject-related lexica are used that provide a relevance measure for each word they contain. Each sentence of the text to be summarized is processed word by word, and for each word an individual word frequency, weighted with the relevance measure, is accumulated. For the summary, the n sentences having the greatest probability of belonging to the summary are assembled, where n is a predeterminable reduction measure.
4 Claims
1. A method for automatic generation of a summary of a text by a computer, comprising the steps of:
calculating for each sentence a probability that the sentence belongs to the summary, including determining a relevance measure for each word in the sentence from a lexicon that contains application-specific words with a predetermined relevance measure for each of these words, and adding together the relevance measures for all words in the sentence to yield the probability that the sentence belongs to the summary,
sorting all sentences of the text according to the probabilities,
performing a predeterminable reduction measure,
displaying as the summary best sentences in a sequence given by the text.
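The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the naive sentence splitter, the lowercase tokenization, and the treatment of words missing from the lexicon as having relevance 0 are all assumptions made for illustration. The summed relevance is called a "probability" in the claim, although the sketch does not normalize it to the range 0 to 1.

```python
import re

def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def sentence_score(sentence, lexicon):
    # Add together the relevance measures of all words in the sentence.
    # Words that are not in the lexicon contribute 0 (assumption).
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(lexicon.get(w, 0.0) for w in words)

def summarize(text, lexicon, n):
    # n plays the role of the predeterminable reduction measure.
    sentences = split_sentences(text)
    # Sort sentence indices by score, highest first.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentence_score(sentences[i], lexicon),
                    reverse=True)
    # Keep the n best sentences, then restore the sequence given by the text.
    keep = sorted(ranked[:n])
    return [sentences[i] for i in keep]

# Hypothetical lexicon and text, for illustration only.
example_lexicon = {"engine": 0.9, "turbine": 0.8, "failed": 0.5, "weather": 0.1}
example_text = ("The engine failed during takeoff. The weather was nice. "
                "Mechanics replaced the turbine.")
print(summarize(example_text, example_lexicon, 2))
```

Sorting indices rather than sentences keeps the original positions available, so the selected sentences can be shown in text order after ranking.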
2. A method according to claim 1, further comprising the steps of:
in addition to the relevance measure, determining an individual word frequency for each word, and determining the probability that the sentence is contained in the summary by the following rule:
wherein WK(sentence) is a probability that a sentence belongs to the summary, N is a total number of words that occur in the sentence, I is a count variable (I=1, 2, ..., N) for all the words in the sentence, tf is a frequency of occurrence of the word under consideration in an entire text being summarized, and rlv is a relevance measure for the word in the sentence.
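The rule itself is not reproduced in the text of this claim; only its symbols are defined. The following Python sketch shows one reading that is consistent with those definitions and with the abstract: for each of the N words in the sentence, the text-wide frequency tf is multiplied by the relevance rlv, and the products are summed. Dividing the sum by N is an assumption (exposed here as the hypothetical `normalize` flag), as are the function names and the tokenization.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase alphabetic tokens (assumption about what counts as a word).
    return re.findall(r"[a-z]+", text.lower())

def text_frequencies(full_text):
    # tf: how often each word occurs in the entire text being summarized.
    return Counter(tokenize(full_text))

def wk(sentence, tf, lexicon, normalize=True):
    # Sum tf * rlv over the N words of the sentence.
    words = tokenize(sentence)
    n = len(words)
    if n == 0:
        return 0.0
    total = sum(tf[w] * lexicon.get(w, 0.0) for w in words)
    # Division by N is an assumed normalization, not stated in the text above.
    return total / n if normalize else total

# Hypothetical data, for illustration only.
full_text = "Engine trouble. The engine stalled."
tf = text_frequencies(full_text)
lexicon = {"engine": 0.5, "stalled": 1.0}
print(wk("The engine stalled.", tf, lexicon, normalize=False))
```

In this example "engine" occurs twice in the text, so it contributes 2 × 0.5, and "stalled" contributes 1 × 1.0; "the" is not in the lexicon and contributes nothing.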
3. A method according to claim 1, further comprising the step of:
allocating the text to at least one category using an application-specific lexicon.
4. A method according to claim 1, further comprising the steps of:
producing for each allocation of the text to a category an application-specific summary.
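Claims 3 and 4 can be sketched together in Python. The claims do not say how a text is allocated to a category, so the criterion used here (the text contains at least `min_hits` distinct words from that category's lexicon) is purely an assumption, as are all function and parameter names. Each allocated category then yields its own summary, scored with that category's lexicon in the manner of claim 1.

```python
import re

def tokenize(text):
    # Lowercase alphabetic tokens (assumption about what counts as a word).
    return re.findall(r"[a-z]+", text.lower())

def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def allocate_categories(text, lexica, min_hits=1):
    # lexica maps a category name to its application-specific lexicon {word: relevance}.
    # Assumed criterion: enough distinct lexicon words appear in the text.
    words = set(tokenize(text))
    return [cat for cat, lex in lexica.items()
            if len(words.intersection(lex)) >= min_hits]

def score(sentence, lexicon):
    # Sum of relevance measures; unknown words count as 0 (assumption).
    return sum(lexicon.get(w, 0.0) for w in tokenize(sentence))

def summaries_by_category(text, lexica, n, min_hits=1):
    sentences = split_sentences(text)
    result = {}
    for cat in allocate_categories(text, lexica, min_hits):
        lex = lexica[cat]
        ranked = sorted(range(len(sentences)),
                        key=lambda i: score(sentences[i], lex),
                        reverse=True)
        # Keep the n best sentences for this category, in text order.
        result[cat] = [sentences[i] for i in sorted(ranked[:n])]
    return result

# Hypothetical lexica and text, for illustration only.
example_lexica = {
    "technical": {"engine": 1.0, "failed": 0.5},
    "legal": {"insurance": 1.0, "report": 0.4},
    "sports": {"goal": 1.0},
}
example_text = ("The engine failed. The pilot filed a report. "
                "Insurance covered the damage.")
print(summaries_by_category(example_text, example_lexica, 1))
```

The same text can therefore produce different summaries depending on which lexicon ranks its sentences, and a category whose lexicon matches nothing in the text produces no summary at all.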
Specification