Multi-ranker for search

US 8,122,015 B2
Filed: 09/21/2007
Issued: 02/21/2012
Est. Priority Date: 09/21/2007
Status: Active Grant

Abstract

Systems and methods for processing user queries and identifying a set of documents relevant to the user query from a database using multi-ranker search are described. In one implementation, the retrieved documents can be paired to form document pairs, or instance pairs, in a variety of combinations. Such instance pairs may have a rank order between them, as the two instances in each pair have different ranks. A classifier, hyperplane, and a base ranker may be constructed for identifying the rank order relationships between the two instances in an instance pair. The base ranker may be generated for each rank pair. The systems use a divide-and-conquer strategy for learning to rank the instance pairs by employing multiple hyperplanes, and aggregate the base rankers to form an ensemble of base rankers. Such an ensemble of base rankers can be used to rank the documents or instances.
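The "divide" step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `group_pairs_by_rank_pair`, the `(doc_id, rank)` tuples, and the convention that a larger integer means a more relevant document are all assumptions made for illustration. The sketch forms document pairs with different ranks and groups them into one subset per rank pair, optionally keeping only adjacent rank pairs.

```python
from collections import defaultdict
from itertools import combinations


def group_pairs_by_rank_pair(docs, adjacent_only=True):
    """Group document pairs into subsets keyed by rank pair.

    docs: list of (doc_id, rank) tuples with integer ranks.
    Assumption for this sketch: a larger rank means more relevant.
    Returns {(higher_rank, lower_rank): [(preferred_id, other_id), ...]}.
    """
    subsets = defaultdict(list)
    for (id_a, rank_a), (id_b, rank_b) in combinations(docs, 2):
        if rank_a == rank_b:
            continue  # a document pair needs two different ranks
        # Put the higher-ranked (preferred) document first.
        if rank_a < rank_b:
            (id_a, rank_a), (id_b, rank_b) = (id_b, rank_b), (id_a, rank_a)
        # Optionally keep only ranks that directly follow one another.
        if adjacent_only and rank_a - rank_b != 1:
            continue
        subsets[(rank_a, rank_b)].append((id_a, id_b))
    return dict(subsets)


# Hypothetical documents with initial integer ranks.
docs = [("d1", 3), ("d2", 2), ("d3", 2), ("d4", 1)]
subsets = group_pairs_by_rank_pair(docs)
all_subsets = group_pairs_by_rank_pair(docs, adjacent_only=False)
```

With these four documents, `subsets` holds one subset for rank pair (3, 2) and one for rank pair (2, 1); `all_subsets` additionally contains the non-adjacent rank pair (3, 1).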

47 Citations

12 Claims

1. A method comprising:
- processing, using a processor, a user query received to identify a set of documents relevant to a user query by comparing the parameters gathered from the user query with a database of documents;
  
  receiving, using the processor, the set of documents for ranking,
  
  performing, using the processor, an initial ranking on the received documents, each of the documents assigned an initial ranking from a set of rankings,
  
  creating, using the processor, a plurality of document pairs from the received documents based on a vector of features that includes a frequency of a term in each document of the set of documents and a length of each document of the set of documents, each document pair including two documents having two different ranks, each rank pair comprising two different ranks of different integer values,
  
  generating, using the processor, at least two subsets of the plurality of document pairs, each subset corresponding to a different rank pair, each rank pair comprising two ranks of different ranking values corresponding to respective different ranks;
  
  generating, using the processor, multiple hyperplanes comprising multiple base rankers for each rank pair, wherein each base ranker of the multiple base rankers is a single hyperplane including a linear ranking model trained for a particular rank pair using a ranking support vector machine that ranks the documents in each subset, wherein the generating creates a base ranker for each rank pair; and
  
  ranking, using the processor, the document pairs within each of the at least two subsets using the base ranker associated with each rank pair; and
  
  aggregating, using the processor, the multiple hyperplanes into an ensemble of base rankers comprising the base rankers for each rank pair, and producing, from the ensemble of base rankers, a ranking list that ranks the documents in the set of documents; and
  
  creating, using the processor, base rankers only for adjacent rank pairs, each adjacent rank pair comprised of rankings of integer values that directly follow one another in sequence.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1, wherein the set of documents includes documents ranked based on relevancy.
  - 3. The method of claim 2, further comprising, prior to the creating, ranking the documents based on a vector of features.
  - 4. The method of claim 1, wherein the aggregating includes examining the ranking list produced from the ensemble of the base rankers to assign scores in each instance based on a position on the ranking list.
  - 5. The method of claim 1 further comprising assigning weights to the base rankers in the ensemble of the base rankers, wherein the weights give importance to instances belonging to subsets corresponding to particular base rankers.
  - 6. The method of claim 1 performed for searching on the World Wide Web.
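As an illustration of the per-rank-pair training step in claim 1, the following Python sketch fits one linear base ranker (a weight vector defining a hyperplane) for each rank pair. It is a simplification under stated assumptions, not the patented implementation: instead of a full Ranking SVM solver it uses plain subgradient descent on an L2-regularized pairwise hinge loss, and the names `train_base_ranker` and `score`, the hyperparameters, and the two-element feature vectors (standing in for normalized term frequency and document length) are hypothetical.

```python
def score(w, x):
    """Linear ranking score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_base_ranker(pairs, n_features, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear ranker on (preferred_features, other_features) pairs.

    Simplified stand-in for a Ranking SVM: subgradient descent on
    reg/2 * ||w||^2 + max(0, 1 - w . (x_preferred - x_other)).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for x_hi, x_lo in pairs:
            diff = [a - b for a, b in zip(x_hi, x_lo)]
            margin = score(w, diff)
            for i in range(n_features):
                grad = reg * w[i]          # regularization term
                if margin < 1.0:
                    grad -= diff[i]        # hinge loss is active
                w[i] -= lr * grad
    return w


# Hypothetical subsets: one list of feature-vector pairs per rank pair.
# Each feature vector is [term_frequency, document_length], normalized.
subsets = {
    (3, 2): [([0.9, 0.3], [0.5, 0.4])],
    (2, 1): [([0.6, 0.2], [0.1, 0.6]), ([0.5, 0.5], [0.2, 0.5])],
}

# One base ranker (one hyperplane) per rank pair.
base_rankers = {
    rank_pair: train_base_ranker(pairs, n_features=2)
    for rank_pair, pairs in subsets.items()
}
```

Each entry of `base_rankers` is trained only on the pairs of its own rank pair, which is the sense in which the claim speaks of a base ranker "trained for a particular rank pair".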

7. A computer-readable storage medium storing instructions that, when executed by a processor, perform operations comprising:
- processing a user query received to identify a set of documents relevant to a user query by comparing the parameters gathered from the user query with a database of documents;
  
  receiving the set of documents for ranking,
  
  performing an initial ranking on the received documents, each of the documents assigned an initial ranking from a set of rankings,
  
  creating a plurality of document pairs from the received documents based on a vector of features that includes a frequency of a term in each document of the set of documents and a length of each document of the set of documents, each document pair including two documents having two different ranks, each rank pair comprising two different ranks of different integer values,
  
  generating at least two subsets of the plurality of document pairs, each subset corresponding to a different rank pair, each rank pair comprising two ranks of different ranking values corresponding to respective different ranks;
  
  generating multiple hyperplanes comprising multiple base rankers for each rank pair, wherein each base ranker of the multiple base rankers is a single hyperplane including a linear ranking model trained for a particular rank pair using a ranking support vector machine that ranks the documents in each subset, wherein the generating creates a base ranker for each rank pair; and
  
  ranking the document pairs within each of the at least two subsets using the base ranker associated with each rank pair; and
  
  aggregating the multiple hyperplanes into an ensemble of base rankers comprising the base rankers for each rank pair, and producing, from the ensemble of base rankers, a ranking list that ranks the documents in the set of documents; and
  
  creating base rankers only for adjacent rank pairs, each adjacent rank pair comprised of rankings of integer values that directly follow one another in sequence.
- Dependent claims (8, 9):
  - 8. The computer-readable storage medium of claim 7, wherein the creating a plurality of document pairs is performed for all documents in a set of documents, until all the documents in the set are selected and paired.
  - 9. The computer-readable storage medium of claim 7, further including associating similar subsets with other rank pairs by grouping a document pair with other document pairs.

10. A system comprising:
- a processor;
  
  a memory connected to the processor, the memory comprising modules including;
  
  a query processing module executable on the processor, configured to process a user query received to identify a set of documents relevant to a user query by comparing the parameters gathered from the user query with a database of documents;
  
  a ranking module executable on the processor, configured to perform the following acts;
  
  receive the set of documents for ranking,
  
  perform an initial ranking on the received documents, each of the documents assigned an initial ranking from a set of rankings,
  
  create a plurality of document pairs from the received documents based on a vector of features that includes a frequency of a term in each document of the set of documents and a length of each document of the set of documents, each document pair including two documents having two different ranks, each rank pair comprising two different ranks of different integer values,
  
  generating at least two subsets of the plurality of document pairs, each subset corresponding to a different rank pair, each rank pair comprising two ranks of different ranking values corresponding to respective different ranks;
  
  generate multiple hyperplanes comprising multiple base rankers for each rank pair, wherein each base ranker of the multiple base rankers is a single hyperplane including a linear ranking model trained for a particular rank pair using a ranking support vector machine that ranks the documents in each subset, wherein the generating creates a base ranker for each rank pair; and
  
  rank the document pairs within each of the at least two subsets using the base ranker associated with each rank pair; and
  
  a rank aggregation module executable on the processor, configured to aggregate the multiple hyperplanes into an ensemble of base rankers comprising the base rankers for each rank pair, and further configured to produce, from the ensemble of base rankers, a ranking list that ranks the documents in the set of documents; and
  
  wherein the ranking module is further executable on the processor to create base rankers only for adjacent rank pairs, each adjacent rank pair comprised of rankings of integer values that directly follow one another in sequence.
- Dependent claims (11, 12):
  - 11. The system of claim 10, wherein the query processing module further groups the documents into several groups.
  - 12. The system of claim 10, wherein the rank aggregation module includes a scoring module executable on the processor to aggregate the subsets of the document pairs.
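The aggregation step (claims 1, 4, 5, and 12) can be sketched in Python as follows. The claims say only that scores are assigned based on position in a ranking list and that base rankers may be weighted; the specific scheme here, a weighted Borda-style count, is an assumption chosen for illustration, as are the function name `aggregate_rankings`, the `ranker_weights` parameter, and the sample weight vectors.

```python
def aggregate_rankings(doc_features, base_rankers, ranker_weights=None):
    """Combine several linear base rankers into one ranking list.

    doc_features: {doc_id: feature_vector}
    base_rankers: {rank_pair: weight_vector}
    ranker_weights: optional {rank_pair: importance}, default 1.0 each.
    Assumed scheme: each base ranker orders all documents, and a
    document earns (n - position) points, scaled by the ranker's weight.
    """
    doc_ids = list(doc_features)
    n = len(doc_ids)
    totals = {doc_id: 0.0 for doc_id in doc_ids}
    for rank_pair, w in base_rankers.items():
        alpha = 1.0
        if ranker_weights is not None:
            alpha = ranker_weights.get(rank_pair, 1.0)
        # Order documents by this base ranker's linear score, best first.
        ordered = sorted(
            doc_ids,
            key=lambda d: sum(wi * xi for wi, xi in zip(w, doc_features[d])),
            reverse=True,
        )
        for position, doc_id in enumerate(ordered):
            totals[doc_id] += alpha * (n - position)
    # Final ranking list: highest aggregated score first.
    return sorted(doc_ids, key=lambda d: totals[d], reverse=True)


# Hypothetical features and two base rankers that disagree.
doc_features = {"d1": [0.9, 0.2], "d2": [0.5, 0.5], "d3": [0.1, 0.8]}
base_rankers = {(3, 2): [1.0, 0.0], (2, 1): [0.0, 1.0]}

favor_top = aggregate_rankings(doc_features, base_rankers, {(3, 2): 2.0})
favor_low = aggregate_rankings(doc_features, base_rankers, {(2, 1): 2.0})
```

Because the two sample base rankers order the documents in opposite directions, the weights decide the outcome: emphasizing the (3, 2) ranker puts `d1` first, while emphasizing the (2, 1) ranker puts `d3` first.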

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Liu, Tie-Yan; Tao, Qin; Li, Hang
Primary Examiner(s): Lu, Charles
Application Number: US11/859,066
Publication Number: US 20090083248A1
Time in Patent Office: 1,614 Days
Field of Search: 707/3, 707/5, 707/723
US Class Current: 707/723
CPC Class Codes:
G06F 16/24578   using ranking
G06N 20/10   using kernel methods, e.g. ...
G06N 20/20   Ensemble learning
