Query-based snippet clustering for search result grouping

  • US 7,617,176 B2
  • Filed: 07/13/2004
  • Issued: 11/10/2009
  • Est. Priority Date: 07/13/2004
  • Status: Active Grant
First Claim

1. A computer implemented system that facilitates clustering of search results, comprising:

  • a memory having stored therein computer-executable instructions configured to implement the clustering system, including;

    an input component that receives search results, wherein the search results are a ranked list of titles and snippets associated with documents;

    an analysis component that;

      utilizes a frequent itemset algorithm to extract keywords from the search results and identifies frequently occurring words as keywords, with keywords occurring in titles being weighted more heavily than keywords occurring in snippets,

      calculates properties for each keyword, wherein the properties include phrase frequency and inverted document frequency, phrase length, intra-cluster similarity, cluster entropy and phrase independence,

      applies a regression model learned from training data that is collected in advance to combine the properties into a salience score for each of the keywords,

      ranks the keywords in descending order according to their associated salience scores and

      selects the highest ranked keywords as salient keywords; and

    a clustering component that;

      generates one or more candidate clusters of the search results according to the saliency score, wherein the salient keywords are selected as names of the candidate clusters, the names of the candidate clusters being phrases when the salient keywords are merged with other salient keywords,

      merges the candidate clusters into one or more final clusters,

      merges a first cluster and a second cluster into a third cluster, when overlap of the first and second clusters exceeds a predetermined threshold,

      adjusts cluster names of the one or more final clusters to generate a new cluster name for the third cluster, and

      outputs the search results as a ranked list of associated documents.
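
The analysis and clustering steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several things in it are assumptions: plain weighted word counting stands in for the frequent itemset algorithm, `TITLE_WEIGHT` is an arbitrary value (the claim says only that title keywords weigh more than snippet keywords), the regression model is taken to be linear with hand-picked hypothetical weights, the overlap measure is the shared fraction of the smaller cluster, and the new cluster name is formed by joining the two old names. All function names are hypothetical.

```python
from collections import Counter

TITLE_WEIGHT = 2.0    # assumed value; the claim only requires title > snippet
SNIPPET_WEIGHT = 1.0


def weighted_keyword_counts(results):
    """Count words across (title, snippet) pairs, weighting title hits more heavily."""
    counts = Counter()
    for title, snippet in results:
        # Each word is counted once per field per result.
        for word in set(title.lower().split()):
            counts[word] += TITLE_WEIGHT
        for word in set(snippet.lower().split()):
            counts[word] += SNIPPET_WEIGHT
    return counts


def salience(properties, weights, bias=0.0):
    """Combine keyword properties linearly; weights stand in for a learned model."""
    return bias + sum(weights[name] * value for name, value in properties.items())


def select_salient(keyword_properties, weights, top_k):
    """Rank keywords by descending salience score and keep the top_k."""
    scored = {kw: salience(props, weights) for kw, props in keyword_properties.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


def overlap(a, b):
    """Assumed overlap measure: shared documents as a fraction of the smaller cluster."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def merge_clusters(candidates, threshold=0.5):
    """Repeatedly merge any two clusters whose overlap exceeds the threshold."""
    clusters = [(name, set(docs)) for name, docs in candidates.items()]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (name_i, docs_i), (name_j, docs_j) = clusters[i], clusters[j]
                if overlap(docs_i, docs_j) > threshold:
                    # The merged cluster gets a new name (assumed: joined names).
                    new = (f"{name_i} {name_j}", docs_i | docs_j)
                    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
                    clusters.append(new)
                    merged = True
                    break
            if merged:
                break
    return dict(clusters)
```

With candidate clusters such as `{"jaguar": {1, 2, 3}, "cars": {2, 3}, "cat": {4, 5}}` and the default threshold, the first two clusters overlap completely on the smaller one and are merged under a joined name, while the third is left unchanged.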

  • 2 Assignments