METHODS AND SYSTEMS FOR EXTRACTING KEYPHRASES FROM NATURAL TEXT FOR SEARCH ENGINE INDEXING
Abstract
The present invention is a method and system for extracting keyphrases from natural text. For the purpose of this document, keyphrases are text segments that represent the main topic of a text. The method may facilitate keyphrase extraction from text of any length, for example a sentence, paragraph, document, or collection of documents. Phrase separator methods may be applied to the text to extract phrases from it. From these phrases, the present invention may identify the one or more phrases that are integral to the meaning of the text and designate them as the keyphrases of the text. The text may be indexed using the keyphrases so that a search based upon any of the keyphrases will cause search engines and/or text retrieval means to retrieve the text.
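The pipeline described in the abstract (split the text at phrase separators, weight each phrase by frequency, rank by weight) can be illustrated with a minimal Python sketch. This is not the patented implementation: the record does not enumerate its phrase separators, so the punctuation pattern and the small `SEPARATOR_WORDS` list are assumptions chosen for illustration, as are the function names `generate_phrases` and `extract_keyphrases`.

```python
import re
from collections import Counter

# Assumption: punctuation plus a small list of function words act as
# phrase separators. The source does not list its separators.
SEPARATOR_WORDS = {"a", "an", "and", "are", "for", "in", "is",
                   "of", "on", "or", "the", "to", "with"}
PUNCTUATION = re.compile(r'[.,;:!?()\[\]"]+')


def generate_phrases(text):
    """Split text into phrases at punctuation and separator words."""
    phrases = []
    for segment in PUNCTUATION.split(text.lower()):
        current = []
        for word in segment.split():
            if word in SEPARATOR_WORDS:
                # A separator closes the phrase collected so far.
                if current:
                    phrases.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(" ".join(current))
    return phrases


def extract_keyphrases(text, top_n=3):
    """Weight phrases by frequency and return the top_n (phrase, weight) pairs."""
    weights = Counter(generate_phrases(text))
    # Highest weight first; ties are broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


sample = ("Keyphrase extraction is the topic of the text. "
          "The text is indexed with keyphrase extraction, "
          "and keyphrase extraction is fast.")
print(extract_keyphrases(sample, top_n=2))
# prints [('keyphrase extraction', 3), ('text', 2)]
```

The returned phrases could then serve as index terms for the text, in line with the indexing use the abstract mentions.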
17 Citations
23 Claims
1. A computer implemented method for extracting keyphrases from natural text, characterized in that it comprises:
(a) generating one or more phrases in the natural text based on an identification of one or more phrase separators in the natural text;
(b) assigning a weight to each phrase based on its frequency in the natural text;
(c) ranking the phrases based on their weights to extract one or more keyphrases having the highest ranks.
Dependent claims: 2 to 20.
21. A computer implemented method for extracting keyphrases from natural text, characterized in that it comprises:
(a) generating one or more phrases in the natural text based on an identification of one or more phrase separators in the natural text;
(b) identifying semantic frames that are associated with the one or more phrase separators, and analyzing the semantic frames so as to associate with one another phrases that have a related meaning;
(c) assigning a weight to each phrase based on its frequency in the natural text and also based on the associations between each phrase and other phrases based on related meaning; and
(d) ranking the phrases based on their weights to extract one or more keyphrases having the highest ranks.
22. A system having a processor and memory adapted to perform a method comprising the steps of:
generating one or more phrases in the natural text based on an identification of one or more phrase separators in the natural text;
identifying semantic frames that are associated with the one or more phrase separators and analyzing the semantic frames so as to associate with one another phrases that have a related meaning;
assigning a weight to each phrase based on its frequency in the natural text and also based on the associations between each phrase and other phrases based on related meaning; and
ranking the phrases based on their weights to extract one or more keyphrases having the highest ranks.
23. A computer readable media storing computer code that when loaded into a computer device adapts the device to perform a method comprising the steps of:
generating one or more phrases in the natural text based on an identification of one or more phrase separators in the natural text;
identifying semantic frames that are associated with the one or more phrase separators and analyzing the semantic frames so as to associate with one another phrases that have a related meaning;
assigning a weight to each phrase based on its frequency in the natural text and also based on the associations between each phrase and other phrases based on related meaning; and
ranking the phrases based on their weights to extract one or more keyphrases having the highest ranks.
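Claims 21 to 23 weight each phrase by its frequency and by its associations with phrases of related meaning. The Python sketch below illustrates one possible reading of that weighting step only. The claims do not say how semantic frames are identified or how associations are scored, so the sketch assumes the associations are already available as pairs of related phrases, and the additive formula, the `association_bonus` parameter, and the function name `rank_with_associations` are all hypothetical.

```python
from collections import Counter, defaultdict


def rank_with_associations(phrases, related_pairs, association_bonus=0.5, top_n=3):
    """Rank phrases by frequency plus a bonus for related phrases.

    Assumption: weight = own frequency
                + association_bonus * (sum of frequencies of related phrases).
    The source does not specify this formula.
    """
    frequency = Counter(phrases)

    # Build a symmetric map of related phrases, ignoring pairs that
    # mention a phrase not present in the text.
    neighbours = defaultdict(set)
    for left, right in related_pairs:
        if left in frequency and right in frequency and left != right:
            neighbours[left].add(right)
            neighbours[right].add(left)

    weights = {
        phrase: count + association_bonus * sum(frequency[n] for n in neighbours[phrase])
        for phrase, count in frequency.items()
    }
    # Highest weight first; ties are broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


example_phrases = ["search engine", "text retrieval", "search engine", "keyphrase", "index"]
example_pairs = [("search engine", "text retrieval")]  # hypothetical association
print(rank_with_associations(example_phrases, example_pairs))
```

In this example "text retrieval" occurs only once but is lifted above the other single-occurrence phrases because it is associated with the more frequent "search engine", which is the kind of effect the association-based weighting in these claims appears intended to produce.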
Specification