Diverse Topic Phrase Extraction
Abstract
Systems and methods for implementing diverse topic phrase extraction are disclosed. According to one implementation, multi-word candidate phrases are extracted from a corpus and weighted. One or more documents are then re-weighted using latent semantic analysis (LSA) to identify less obvious candidate topics. Finally, phrase diversification is used to remove redundancy and select informative and distinct topic phrases.
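As a rough illustration of the re-weighting idea in the abstract, the Python sketch below builds a phrase-by-document count matrix, estimates the dominant latent topic, and down-weights documents that load heavily on it so that phrases from less obvious topics score higher. Everything specific here is an assumption made for the example and is not taken from the patent: the function names (`bigrams`, `dominant_topic_loadings`, `reweight_documents`), the use of bigrams as candidate phrases, power iteration as a stand-in for a full LSA decomposition, and the 0.9 damping factor.

```python
from collections import Counter
from math import sqrt


def bigrams(tokens):
    """Return adjacent word pairs as candidate phrases (an assumed choice)."""
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]


def dominant_topic_loadings(matrix, iterations=50):
    """Approximate the document loadings of the top latent topic.

    Runs power iteration on A^T A, where A is the phrase-by-document
    matrix. This is a simplified stand-in for a full SVD-based LSA.
    """
    n_docs = len(matrix[0])
    v = [1.0] * n_docs
    for _ in range(iterations):
        # u = A v (phrase space), then v = A^T u (document space)
        u = [sum(row[j] * v[j] for j in range(n_docs)) for row in matrix]
        v = [sum(matrix[i][j] * u[i] for i in range(len(matrix)))
             for j in range(n_docs)]
        norm = sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm for x in v]
    return v


def reweight_documents(docs, damping=0.9):
    """Down-weight documents dominated by the top latent topic.

    Assumes `docs` is a non-empty list of strings containing at least
    one bigram. Returns (document weights, re-weighted phrase scores).
    """
    doc_counts = [Counter(bigrams(d.lower().split())) for d in docs]
    phrases = sorted(set().union(*doc_counts))
    matrix = [[c[p] for c in doc_counts] for p in phrases]
    loadings = dominant_topic_loadings(matrix)
    peak = max(abs(x) for x in loadings) or 1.0
    weights = [1.0 - damping * abs(x) / peak for x in loadings]
    scores = {p: sum(w * c[p] for w, c in zip(weights, doc_counts))
              for p in phrases}
    return weights, scores
```

With three near-identical documents about one subject and a single document about another, the single document keeps a weight close to 1 while the others are damped, so its phrases outrank the more frequent ones after re-weighting.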
20 Claims
1. A method of summarizing search results from a corpus comprising:
- re-weighting a plurality of weighted documents based at least in part on a latent topic to identify one or more topic phrases associated with candidate phrases occurring within one or more documents;
- filtering the documents having similar topic phrases to remove redundancy of the topic phrases.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
- one or more processors;
- a memory;
- a re-weighting module for identifying one or more topic phrases associated with at least one document;
- a diversification module for removing documents associated with one or more similar concepts to result in a list of documents with unlike concepts.
Dependent claims: 9, 10, 11, 12, 13, 14
15. One or more computer-readable media comprising computer executable instructions that, when executed, direct a computing system to:
- re-weighting a plurality of documents in a corpus based at least in part on a latent topic to identify one or more concepts associated with the documents;
- filtering the documents associated with similar concepts to result in a document collection with unlike concepts.
Dependent claims: 16, 17, 18, 19, 20
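The filtering step recited in the independent claims is not detailed in this section. One way it might look, offered only as a sketch: a greedy pass that keeps the highest-scoring phrases and skips any phrase too similar to one already kept. The names `jaccard` and `diversify`, the word-overlap (Jaccard) similarity measure, and the 0.5 threshold are assumptions chosen for illustration, not the patent's definition of similarity.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two non-empty phrases."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)


def diversify(scored_phrases, max_similarity=0.5, limit=5):
    """Greedily keep high-scoring phrases that differ from those already kept.

    `scored_phrases` maps phrase -> score. A phrase is kept only if its
    similarity to every previously kept phrase is below `max_similarity`.
    """
    kept = []
    ranked = sorted(scored_phrases.items(), key=lambda kv: -kv[1])
    for phrase, _score in ranked:
        if all(jaccard(phrase, k) < max_similarity for k in kept):
            kept.append(phrase)
        if len(kept) == limit:
            break
    return kept
```

For example, given "solar panel cost" and "solar panel price" with similar scores, only the higher-scoring one survives, while an unrelated phrase such as "wind turbine noise" is retained.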
Specification