
Diverse topic phrase extraction

  • US 8,280,877 B2
  • Filed: 09/21/2007
  • Issued: 10/02/2012
  • Est. Priority Date: 02/22/2007
  • Status: Active Grant
First Claim

1. A method, implemented by one or more computing devices, of summarizing search results from a corpus, the method comprising:

  • re-weighting, by the one or more computing devices, documents based at least in part on a latent topic to identify topic phrases associated with candidate phrases occurring within one or more of the documents, the re-weighting comprising strengthening a subset of the documents that are indicated by the latent topic by adjusting document weights to increase a term frequency of one or more terms within the subset of the documents associated with the latent topic;

  • evaluating, by the one or more computing devices, based in part on the increase of the term frequency of the one or more terms, a modified latent semantic analysis (LSA)-weighted frequency of one or more documents of the subset of the documents to identify the topic phrases; and

  • filtering, by the one or more computing devices, the documents having similar topic phrases to remove redundancy of the similar topic phrases.
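
The three steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the function names, the `boost` and `max_overlap` parameters, the toy documents, the use of power iteration on raw term counts as a stand-in for the latent topic, substring counting as the phrase frequency, and word-level Jaccard overlap between phrases as the redundancy test.

```python
from collections import Counter
import math


def latent_topic_loadings(docs, iters=100):
    """Approximate each document's loading on the dominant latent topic.

    Assumption: the dominant right singular vector of the raw term-document
    count matrix, found by power iteration, stands in for "a latent topic".
    """
    rows = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*rows))
    v = [1.0] * len(rows)
    for _ in range(iters):
        # u = A v (term space), then v = A^T u (document space)
        u = {t: sum(row[t] * x for row, x in zip(rows, v)) for t in vocab}
        v = [sum(row[t] * u[t] for t in vocab) for row in rows]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm for x in v]
    return v


def reweight(loadings, boost=2.0):
    """Strengthen documents indicated by the topic: weight ranges from 1 to 1 + boost."""
    top = max(loadings) or 1.0
    return [1.0 + boost * (x / top) for x in loadings]


def score_phrases(docs, candidates, weights):
    """Weighted frequency: each occurrence counts as much as its document's weight."""
    return {
        phrase: sum(w * doc.lower().count(phrase) for doc, w in zip(docs, weights))
        for phrase in candidates
    }


def filter_redundant(ranked, max_overlap=0.5):
    """Keep a phrase only if its word overlap (Jaccard) with every kept phrase is low."""
    kept = []
    for phrase in ranked:
        words = set(phrase.split())
        if all(
            len(words & set(k.split())) / len(words | set(k.split())) < max_overlap
            for k in kept
        ):
            kept.append(phrase)
    return kept


# Hypothetical search results and candidate phrases, invented for this sketch.
docs = [
    "solar panel cost solar panel install",
    "solar panel cost and solar energy",
    "solar energy storage battery",
    "football match score",
]
candidates = ["solar panel", "solar panel cost", "solar energy", "football match"]

weights = reweight(latent_topic_loadings(docs))
scores = score_phrases(docs, candidates, weights)
ranked = sorted(candidates, key=lambda p: scores[p], reverse=True)
topic_phrases = filter_redundant(ranked)
print(topic_phrases)
```

In this toy run the three solar documents share the dominant topic and receive larger weights than the unrelated football document, so phrases drawn from them score higher. "solar panel cost" is then dropped as redundant with the higher-ranked "solar panel". Note that the claim recites filtering documents that have similar topic phrases; the sketch simplifies this to filtering the phrases themselves.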
