
Ranking documents based on large data sets

  • US 10,055,461 B1
  • Filed: 07/31/2015
  • Issued: 08/21/2018
  • Est. Priority Date: 11/14/2003
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method comprising:

  • receiving, by a distributed search system, a collection of training data comprising a plurality of training instances that each identify a respective first document selected by a particular user when the first document was identified in search results provided by the search system to the particular user in response to a particular search query issued by the particular user;

    partitioning the collection of training data over a plurality of computing devices of the distributed search system;

    generating, by the distributed search system, a ranking model that produces a likelihood that a particular user will select a particular document when identified by one or more search results provided in response to a particular search query submitted by the particular user, including processing, by each computing device of the plurality of computing devices, training instances assigned to the computing device, including:

    selecting, by the computing device, a candidate condition, wherein the candidate condition specifies values for one or more user features, one or more query features, and one or more document features,

    sending, by the computing device, to each other computing device of the plurality of computing devices, a request to compute local statistics for the candidate condition,

    receiving, by the computing device from each other computing device of one or more other computing devices, respective computed statistics for the candidate condition computed by the other computing device using values of local training instances assigned to the other computing device,

    computing, by the computing device, a weight for the candidate condition according to the computed statistics received from the one or more other computing devices for the candidate condition;

    determining, by the computing device, that a new rule comprising the candidate condition and the computed weight should be added to the ranking model, and

    in response, adding the new rule to the ranking model and providing, by the computing device, to each other computing device of the plurality of computing devices, an indication that the new rule comprising the candidate condition and the computed weight should be added to the ranking model;

    receiving a search query submitted by a first user;

    obtaining a plurality of search results that satisfy the search query, wherein each search result identifies a respective document of a plurality of documents;

    determining one or more features of the first user and one or more features of the search query submitted by the first user;

    using the one or more features of the first user and the one or more features of the search query as input to the ranking model to compute, for each document identified by the search results, a respective likelihood that the first user will select the document when provided in response to the search query; and

    ranking the plurality of search results based on a respective computed likelihood for each document, the computed likelihood for each document being a likelihood that the first user will select the document when provided in response to the search query.
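
The steps recited in the claim above can be illustrated with a minimal single-process sketch. It is not the patented implementation: the claim does not say how the weight is computed or how a rule is accepted, so the smoothed log-odds weight, the `min_support` threshold, the logistic scoring, and the feature names (`user_lang`, `query_term`, `doc_site`) are all assumptions chosen for illustration. Shards are plain Python lists standing in for the computing devices, the request/response exchange is modeled as a function call per shard, and statistics are pooled over every shard, including the requester's own.

```python
import math

def matches(condition, features):
    """True when every feature value named in the condition is present in features."""
    return all(features.get(name) == value for name, value in condition.items())

def local_stats(condition, shard):
    """Statistics one device computes on its own shard:
    (number of matching instances, number of those that were selected)."""
    hits = [selected for features, selected in shard if matches(condition, features)]
    return len(hits), sum(hits)

def compute_weight(stats, prior=1.0):
    """Assumed weight: smoothed log-odds of selection over the pooled statistics."""
    total = sum(n for n, _ in stats)
    selected = sum(c for _, c in stats)
    p = (selected + prior) / (total + 2 * prior)
    return math.log(p / (1 - p)), total

def train(shards, candidates, min_support=2):
    """Build a model as a list of (condition, weight) rules.
    min_support is a hypothetical acceptance test for adding a rule."""
    model = []
    for condition in candidates:
        # Stand-in for sending a request to each device and collecting replies.
        stats = [local_stats(condition, shard) for shard in shards]
        weight, support = compute_weight(stats)
        if support >= min_support:
            model.append((condition, weight))
    return model

def likelihood(model, features):
    """Assumed scoring: logistic function of the summed weights of matching rules."""
    z = sum(weight for condition, weight in model if matches(condition, features))
    return 1.0 / (1.0 + math.exp(-z))

def rank(model, user_query_features, documents):
    """Order documents by computed selection likelihood, highest first."""
    scored = [(likelihood(model, {**user_query_features, **doc}), doc) for doc in documents]
    return [doc for _, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)]

# Toy data: two shards, each holding (features, was_selected) training instances.
base = {"user_lang": "en", "query_term": "jaguar"}
cars = {**base, "doc_site": "cars.example"}
zoo = {**base, "doc_site": "zoo.example"}
shards = [[(cars, True), (zoo, False)], [(cars, True), (zoo, False)]]

model = train(shards, candidates=[cars, zoo])
ranked = rank(model, base, [{"doc_site": "zoo.example"}, {"doc_site": "cars.example"}])
print([doc["doc_site"] for doc in ranked])
```

In this toy run the document that was selected in training is ranked ahead of the one that was not. A real deployment would replace the per-shard function call with network requests between devices and would broadcast each accepted rule to the other devices, as the claim recites.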
