Automatic correlation method for generating summaries for text documents
Abstract
A method and program product for generating summaries of text documents. A user may also specify a query, a topic, and terms of interest. The method determines the importance of each sentence from word-level scores, which combine the linguistic salience of the word with respect to the user profile, the similarity between the word and the query and topic provided by the user, and the sum of the scores of the sentences containing the word. After computing the score for each word, the method computes the score for each sentence in the set of sentences according to the scores of the words composing it and the position of the sentence within its section and paragraph.
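One possible way to write the two scoring steps as update rules is given here. This is an illustrative formalization, not the formula from the specification: the symbols $w_i$ (score of word $i$), $s_j$ (score of sentence $j$), $S_j$ (the set of words in sentence $j$), $c_i$ (correlation of word $i$ with the user information), $p_j$ (positional weight of sentence $j$), and the threshold $\varepsilon$ are all assumptions introduced for this sketch, as is the simple product form.

```latex
\begin{align}
w_i^{(t+1)} &= c_i \sum_{j \,:\, i \in S_j} s_j^{(t)}, \\
s_j^{(t+1)} &= p_j \sum_{i \in S_j} w_i^{(t+1)}, \\
\text{stop when}\quad
&\Bigl|\sum_i w_i^{(t+1)} - \sum_i w_i^{(t)}\Bigr| < \varepsilon
\ \text{ and }\
\Bigl|\sum_j s_j^{(t+1)} - \sum_j s_j^{(t)}\Bigr| < \varepsilon .
\end{align}
```

Under this reading, both score vectors would typically be rescaled (for example to unit Euclidean length) after each round so that the sums stay bounded and the stopping test is meaningful.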
14 Claims
1. An automatic method for generating summaries for text documents, comprising steps of:
generating a set of sentences for a set of documents by document discourse analysis and a set of words by morphologic process;
initializing a score for each word in the set of words and for each sentence in the set of sentences;
computing the score for each word in the set of words according to the score of sentences containing it and the correlation degree between the word and the user information;
computing the score for each sentence in the set of sentences according to the score of words composing it and the position of the sentence in a section and a paragraph; and
if the sum of scores of the words and the sum of scores of the sentences change apparently, go back to the step of computing the word score;
otherwise continuing;
outputting the top-ranked sentences as the summary of the set of documents, the top-ranked words as the keywords list of the set of documents.
(Dependent claims: 2, 3, 4, 5, 6, 7)
8. A computer program product for automatically generating summaries for text documents, said computer program product comprising a computer usable medium having computer readable program code thereon, said computer readable program code comprising:
computer program code means for generating a set of sentences for a set of documents by document discourse analysis and a set of words by morphologic process;
computer program code means for initializing a score for each word in the set of words, and each sentence in the set of sentences;
computer program code means for computing the score for each word in the set of words according to the score of sentences containing it and the correlation degree between the word and the user information;
computer program code means for computing the score for each sentence in the set of sentences according to the score of words composing it and the position of the sentence in a section and a paragraph;
computer program code means for determining if the sum of scores of the words and the sum of scores of the sentences exhibit an apparent change; and
computer program code means for outputting the top-ranked sentences as the summary of the set of documents, the top-ranked words as the keywords list of the set of documents.
(Dependent claims: 9, 10, 11, 12, 13, 14)
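The iterative loop recited in the independent claims can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `summarize`, the regex tokenizer standing in for the morphologic process, the correlation weights (2.0 for user-listed terms, 1.0 otherwise), the positional weight `1 / (1 + 0.1 * j)`, the unit-length rescaling, and the tolerance are all hypothetical choices made for the example.

```python
import math
import re


def _l2_normalize(values):
    """Scale a list of floats to unit Euclidean length (no-op if all zero)."""
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]


def summarize(sentences, user_terms, n_sentences=2, n_keywords=3,
              tol=1e-9, max_iter=200):
    """Return (top sentences, top words) by mutual reinforcement scoring."""
    # Assumption: lowercase alphabetic tokens stand in for morphologic analysis.
    tokens = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    vocab = sorted(set().union(*tokens))
    index = {w: i for i, w in enumerate(vocab)}

    # Assumption: user-listed terms get twice the correlation weight.
    wanted = {t.lower() for t in user_terms}
    corr = [2.0 if w in wanted else 1.0 for w in vocab]
    # Assumption: earlier sentences receive a slightly larger position weight.
    pos = [1.0 / (1.0 + 0.1 * j) for j in range(len(sentences))]

    # Initialize every word and sentence score.
    word = [1.0] * len(vocab)
    sent = [1.0] * len(sentences)
    prev_totals = None

    for _ in range(max_iter):
        # Word score: scores of containing sentences times user correlation.
        word = _l2_normalize([
            corr[i] * sum(sent[j] for j, t in enumerate(tokens) if w in t)
            for i, w in enumerate(vocab)
        ])
        # Sentence score: scores of its words times its position weight.
        sent = _l2_normalize([
            pos[j] * sum(word[index[w]] for w in t)
            for j, t in enumerate(tokens)
        ])
        # Stop once neither sum of scores changes noticeably.
        totals = (sum(word), sum(sent))
        if prev_totals is not None and all(
                abs(a - b) < tol for a, b in zip(totals, prev_totals)):
            break
        prev_totals = totals

    ranked_s = sorted(range(len(sentences)), key=lambda j: -sent[j])
    ranked_w = sorted(range(len(vocab)), key=lambda i: -word[i])
    return ([sentences[j] for j in ranked_s[:n_sentences]],
            [vocab[i] for i in ranked_w[:n_keywords]])


if __name__ == "__main__":
    docs = ["The cat sat on the mat.", "Dogs bark loudly.",
            "A cat chased the dog."]
    summary, keywords = summarize(docs, user_terms=["cat"])
    print(summary)
    print(keywords)
```

Rescaling to unit Euclidean length, rather than to a fixed sum, is chosen here so that the sums of scores can still vary between rounds and the claimed stopping test remains informative.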
Specification