×

System and method for providing search query refinements

  • US 8,645,407 B2
  • Filed: 11/04/2011
  • Issued: 02/04/2014
  • Est. Priority Date: 09/05/2003
  • Status: Expired due to Fees
First Claim
Patent Images

1. A method performed by a data processing apparatus, the method comprising:

  • accessing, by the data processing apparatus, a database that stores query-resource pairings and a respective weight for each query-resource pairing, each query-resource pairing identifying a single query and a single resource associated with the query;

    receiving, by the data processing apparatus, a search query from a user device;

    obtaining, by the data processing apparatus and in response to receiving the search query, search results, each search result referencing a resource and having a relevance score indicating a measure of relevance of the search result to the search query;

    identifying, by the data processing apparatus, a subset of the resources referenced by the search results, each resource in the subset matching a resource that is identified by one or more of the query-resource pairings;

    for each resource in the subset;

    generating, by the data processing apparatus, a term vector from queries identified by stored query-resource pairings that correspond to the resource, wherein each dimension of the term vector corresponds to a distinct term of the queries identified by the stored query-resource pairings that correspond to the resource, and each dimension of the term vector has a value that is based on the respective weights of each of the query-resource pairing that identifies (i) the resource and (ii) a query that includes the distinct term that corresponds to the dimension;

    multiplying, by a constant factor, the value of each dimension of the term vector that corresponds to a distinct term that matches a term included in the received search query; and

    normalizing the term vector;

    generating, by the data processing apparatus, one or more clusters of resources based on the term vectors of each resource in the subset;

    generating, by the data processing apparatus and for each cluster, a rank score based on (i) the relevance scores of the search results that reference resources that match the resources of the cluster, and (ii) a quantity of the resources of the cluster;

    selecting, by the data processing apparatus, as refinement clusters, n clusters with the highest rank score, where n is a positive integer;

    generating, by the data processing apparatus, a respective centroid for each refinement cluster, each centroid representing a weighted center of a term vector that corresponds to the refinement cluster;

    generating, by the data processing apparatus and for each refinement cluster, a cluster score for each unique query identified by stored query-resource pairings that correspond to the resources of the refinement cluster based upon (i) a quantity of the query-resource pairings that identify the unique query, and (ii) a distance from the unique query to the centroid of the potential refinement cluster;

    for each refinement cluster, selecting, by the data processing apparatus, as a representative query for the refinement cluster, a unique query that has a highest cluster score of the cluster scores for the unique queries identified by stored query-resource pairings that correspond to the resources of the refinement cluster;

    selecting, by the data processing apparatus, as a set of candidate queries, the representative queries of m refinement clusters with the highest rank score, where m is a positive integer; and

    providing, by the data processing apparatus, one or more of the candidate queries to the user device.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×