Method and apparatus for generating a language independent document abstract
Abstract
A method of extracting significant phrases from one or more documents stored in a computer-readable medium. A sequence of words is read from the one or more documents and a score is determined for each word in the sequence based on the length of the word. The score for each word in the sequence is compared against a threshold score. The sequence of words is indicated to be a significant phrase if the number of words in the sequence that have a score greater than the threshold score equals or exceeds a predetermined number. If the sequence of words is a significant phrase, a sentence containing the sequence of words is retrieved from the document. An abstract of the document is searched to determine whether the sentence has previously been included in the abstract. If not, the sentence is added to the abstract.
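The steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`word_score`, `is_significant`, `build_abstract`), the choice of score equal to word length, the default `window`, `threshold`, and `min_count` values, and the regex-based sentence splitting are all hypothetical. For simplicity the sketch reads word sequences sentence by sentence, so retrieving the containing sentence is trivial.

```python
import re


def word_score(word: str) -> int:
    # Assumption: the score is simply the word's length in characters.
    return len(word)


def is_significant(words, threshold=5, min_count=2):
    # A sequence is significant when at least `min_count` of its words
    # score strictly above `threshold`.
    high_scoring = sum(1 for w in words if word_score(w) > threshold)
    return high_scoring >= min_count


def build_abstract(document, window=3, threshold=5, min_count=2):
    # Naive sentence split on terminal punctuation (an assumption).
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", document)
                 if s.strip()]
    abstract = []
    for sentence in sentences:
        words = re.findall(r"\w+", sentence)
        # Slide a fixed-size window of words across the sentence.
        for i in range(max(1, len(words) - window + 1)):
            if is_significant(words[i:i + window], threshold, min_count):
                # Search the abstract first; add the sentence only if absent.
                if sentence not in abstract:
                    abstract.append(sentence)
    return abstract


text = ("The cat sat on a mat. "
        "Statistical language processing requires substantial computation. "
        "It is ok.")
print(build_abstract(text))
```

Because the score depends only on word length, the sketch needs no dictionary or stop-word list, which is consistent with the "language independent" framing of the title.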
37 Claims
1. A method of identifying a significant phrase in a document, the method comprising:
optionally receiving from a user a user-selected phrase verbosity setting or using a system default phrase verbosity setting;
reading a sequence of words from the document in response to the user-selected or default phrase verbosity setting;
determining a score for each word in the sequence based on the length of each word;
comparing the score for each word in the sequence against a threshold score;
indicating that the sequence of words is a significant phrase if the number of words in the sequence that have the score greater than the threshold score equals or exceeds a predetermined number;
retrieving a sentence from the document, the sentence containing the sequence of words, if the sequence of words is a significant phrase; and
searching an abstract of the document to determine whether the sentence is included in the abstract.
Dependent claims: 2 to 15.
16. A method of identifying a significant phrase in a document, the method comprising:
receiving from a user a user-selected phrase verbosity setting or a previously selected default phrase verbosity setting;
reading a sequence of words from the document in response to the default or user-selected phrase verbosity setting;
determining a score for each word in the sequence based on the length of each word;
comparing the score for each word in the sequence against a threshold score;
indicating that the sequence of words is a significant phrase if the number of words in the sequence that have the score greater than the threshold score equals or exceeds a predetermined number;
storing the sequence of words and the number of words in the sequence, if the sequence of words is a significant phrase.
Dependent claims: 17 to 20.
21. A computer readable medium containing executable instructions which, when executed in a processing system, cause the system to perform a method for identifying a significant phrase in a document, the method comprising:
receiving from a user a user-selected phrase verbosity setting;
reading a sequence of words from the document in response to the user-selected phrase verbosity setting;
determining a score for each word in the sequence based on the length of each word;
comparing the score for each word in the sequence against a threshold score;
indicating that the sequence of words is a significant phrase if the number of words in the sequence that have the score greater than the threshold score equals or exceeds a predetermined number;
retrieving a sentence from the document, the sentence containing the sequence of words, if the sequence of words is a significant phrase; and
searching an abstract of the document to determine whether the sentence is included in the abstract.
Dependent claims: 22 to 32.
33. A computer readable medium containing executable instructions which, when executed in a processing system, cause the system to perform a method for identifying a significant phrase in a document, the method comprising:
receiving from a user a user-selected phrase verbosity setting;
reading a sequence of words from the document in response to the user-selected phrase verbosity setting;
determining a score for each word in the sequence based on the length of each word;
comparing the score for each word in the sequence against a threshold score;
indicating that the sequence of words is a significant phrase if the number of words in the sequence that have the score greater than the threshold score equals or exceeds a predetermined number;
storing the sequence of words and the number of words in the sequence, if the sequence of words is a significant phrase.
Dependent claims: 34 to 37.
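Independent claims 16 and 33 end by storing the significant phrase together with its word count instead of retrieving a sentence. The following is a rough sketch of that variant, assuming (hypothetically) that the phrase verbosity setting controls how many words are read per sequence. The `VERBOSITY_TO_WINDOW` mapping, the function name `find_significant_phrases`, and the default `threshold` and `min_count` values are illustrative choices, not taken from the claims.

```python
import re

# Hypothetical mapping from a verbosity setting to words per sequence.
VERBOSITY_TO_WINDOW = {"terse": 2, "normal": 3, "verbose": 4}


def find_significant_phrases(document, verbosity="normal",
                             threshold=5, min_count=2):
    window = VERBOSITY_TO_WINDOW[verbosity]
    words = re.findall(r"\w+", document)
    stored = []
    for i in range(len(words) - window + 1):
        sequence = words[i:i + window]
        # Score each word by its length and count those above the threshold.
        high_scoring = sum(1 for w in sequence if len(w) > threshold)
        if high_scoring >= min_count:
            # Store the phrase together with the number of words in it.
            stored.append((" ".join(sequence), len(sequence)))
    return stored


print(find_significant_phrases("a big statistical language model", "terse"))
```

Under these assumptions, a lower verbosity setting yields shorter stored phrases, and the same text can produce different phrase lists at different settings.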
Specification