Multi-ranker for search
Abstract
Systems and methods are described for processing a user query and identifying, from a database, a set of documents relevant to that query using multi-ranker search. In one implementation, the retrieved documents can be paired in a variety of combinations to form document pairs, or instance pairs. The two instances in each such pair have different ranks, so a rank order exists between them. A classifier, or hyperplane, may be constructed as a base ranker for identifying the rank-order relationship between the two instances in an instance pair. A base ranker may be generated for each rank pair. The systems use a divide-and-conquer strategy for learning to rank the instance pairs: they employ multiple hyperplanes and aggregate the base rankers to form an ensemble of base rankers. Such an ensemble can be used to rank the documents or instances.
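The pairing step described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `group_pairs_by_rank_pair`, the `(doc_id, rank)` tuple layout, and the `adjacent_only` flag are hypothetical, and the sketch assumes that ranks are integers.

```python
from collections import defaultdict
from itertools import combinations


def group_pairs_by_rank_pair(docs, adjacent_only=True):
    """Group instance pairs into one subset per rank pair.

    docs: list of (doc_id, rank) tuples with integer ranks (assumed layout).
    Returns {(low_rank, high_rank): [(low_doc_id, high_doc_id), ...]}.
    """
    subsets = defaultdict(list)
    for doc_a, doc_b in combinations(docs, 2):
        if doc_a[1] == doc_b[1]:
            continue  # an instance pair needs two different ranks
        # Put the document with the smaller rank value first.
        (low_id, low), (high_id, high) = sorted([doc_a, doc_b], key=lambda d: d[1])
        if adjacent_only and high - low != 1:
            continue  # keep only rank pairs whose values directly follow one another
        subsets[(low, high)].append((low_id, high_id))
    return dict(subsets)


# Hypothetical initial ranking of four retrieved documents.
docs = [("d1", 1), ("d2", 2), ("d3", 2), ("d4", 3)]
subsets = group_pairs_by_rank_pair(docs)
all_subsets = group_pairs_by_rank_pair(docs, adjacent_only=False)
```

Each key of `subsets` is one rank pair, and each value is the subset of instance pairs that a single base ranker would be trained on. Passing `adjacent_only=False` also keeps non-adjacent rank pairs such as `(1, 3)`.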
12 Claims
1. A method comprising:
processing, using a processor, a user query received to identify a set of documents relevant to a user query by comparing the parameters gathered from the user query with a database of documents;
receiving, using the processor, the set of documents for ranking,
performing, using the processor, an initial ranking on the received documents, each of the documents assigned an initial ranking from a set of rankings,
creating, using the processor, a plurality of document pairs from the received documents based on a vector of features that includes a frequency of a term in each document of the set of documents and a length of each document of the set of documents, each document pair including two documents having two different ranks, each rank pair comprising two different ranks of different integer values,
generating, using the processor, at least two subsets of the plurality of document pairs, each subset corresponding to a different rank pair, each rank pair comprising two ranks of different ranking values corresponding to respective different ranks;
generating, using the processor, multiple hyperplanes comprising multiple base rankers for each rank pair, wherein each base ranker of the multiple base rankers is a single hyperplane including a linear ranking model trained for a particular rank pair using a ranking support vector machine that ranks the documents in each subset, wherein the generating creates a base ranker for each rank pair; and
ranking, using the processor, the document pairs within each of the at least two subsets using the base ranker associated with each rank pair; and
aggregating, using the processor, the multiple hyperplanes into an ensemble of base rankers comprising the base rankers for each rank pair, and producing, from the ensemble of base rankers, a ranking list that ranks the documents in the set of documents; and
creating, using the processor, base rankers only for adjacent rank pairs, each adjacent rank pair comprised of rankings of integer values that directly follow one another in sequence.
Dependent claims: 2, 3, 4, 5, 6
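As a rough illustration of the training and aggregation steps recited in claim 1, the following Python sketch fits one linear base ranker per adjacent rank pair and combines them. Several points are assumptions rather than anything the claim specifies: a plain subgradient descent on the pairwise hinge loss stands in for a real ranking SVM solver, the ensemble simply sums the base rankers' scores, a higher rank value is taken to mean a more relevant document, and the feature values (normalized term frequency and normalized document length) are made up. All function and variable names are hypothetical.

```python
def train_base_ranker(pairs, features, epochs=200, lr=0.1, reg=0.01):
    """Fit one hyperplane w so that w.x_high exceeds w.x_low for each pair.

    Subgradient descent on the regularized pairwise hinge loss; a simple
    stand-in for a ranking SVM solver, not the patent's own procedure.
    """
    dim = len(next(iter(features.values())))
    w = [0.0] * dim
    for _ in range(epochs):
        for low_id, high_id in pairs:
            diff = [h - l for h, l in zip(features[high_id], features[low_id])]
            margin = sum(wi * di for wi, di in zip(w, diff))
            for i in range(dim):
                # The hinge term contributes only while the margin is violated.
                grad = reg * w[i] - (diff[i] if margin < 1.0 else 0.0)
                w[i] -= lr * grad
    return w


def ensemble_rank(base_rankers, features):
    """Rank documents by the sum of all base-ranker scores (assumed aggregation)."""
    def score(doc_id):
        x = features[doc_id]
        return sum(sum(wi * xi for wi, xi in zip(w, x)) for w in base_rankers.values())
    return sorted(features, key=score, reverse=True)


# Hypothetical feature vectors: [normalized term frequency, normalized length].
features = {"d1": [0.1, 0.9], "d2": [0.4, 0.6], "d3": [0.5, 0.5], "d4": [0.9, 0.2]}
# One subset of (lower-ranked, higher-ranked) document pairs per adjacent rank pair.
subsets = {(1, 2): [("d1", "d2"), ("d1", "d3")], (2, 3): [("d2", "d4"), ("d3", "d4")]}

base_rankers = {rp: train_base_ranker(pairs, features) for rp, pairs in subsets.items()}
ranking = ensemble_rank(base_rankers, features)
```

Training each hyperplane on only its own subset is what makes this a divide-and-conquer scheme. Other aggregation rules, such as weighted sums or rank-position voting, would slot into `ensemble_rank` without changing the training step.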
7. A computer-readable storage medium storing instructions that, when executed by a processor, perform operations comprising:
processing a user query received to identify a set of documents relevant to a user query by comparing the parameters gathered from the user query with a database of documents;
receiving the set of documents for ranking,
performing an initial ranking on the received documents, each of the documents assigned an initial ranking from a set of rankings,
creating a plurality of document pairs from the received documents based on a vector of features that includes a frequency of a term in each document of the set of documents and a length of each document of the set of documents, each document pair including two documents having two different ranks, each rank pair comprising two different ranks of different integer values,
generating at least two subsets of the plurality of document pairs, each subset corresponding to a different rank pair, each rank pair comprising two ranks of different ranking values corresponding to respective different ranks;
generating multiple hyperplanes comprising multiple base rankers for each rank pair, wherein each base ranker of the multiple base rankers is a single hyperplane including a linear ranking model trained for a particular rank pair using a ranking support vector machine that ranks the documents in each subset, wherein the generating creates a base ranker for each rank pair; and
ranking the document pairs within each of the at least two subsets using the base ranker associated with each rank pair; and
aggregating the multiple hyperplanes into an ensemble of base rankers comprising the base rankers for each rank pair, and producing, from the ensemble of base rankers, a ranking list that ranks the documents in the set of documents; and
creating base rankers only for adjacent rank pairs, each adjacent rank pair comprised of rankings of integer values that directly follow one another in sequence.
Dependent claims: 8, 9
10. A system comprising:
a processor;
a memory connected to the processor, the memory comprising modules including:
a query processing module executable on the processor, configured to process a user query received to identify a set of documents relevant to a user query by comparing the parameters gathered from the user query with a database of documents;
a ranking module executable on the processor, configured to perform the following acts:
receive the set of documents for ranking,
perform an initial ranking on the received documents, each of the documents assigned an initial ranking from a set of rankings,
create a plurality of document pairs from the received documents based on a vector of features that includes a frequency of a term in each document of the set of documents and a length of each document of the set of documents, each document pair including two documents having two different ranks, each rank pair comprising two different ranks of different integer values,
generating at least two subsets of the plurality of document pairs, each subset corresponding to a different rank pair, each rank pair comprising two ranks of different ranking values corresponding to respective different ranks;
generate multiple hyperplanes comprising multiple base rankers for each rank pair, wherein each base ranker of the multiple base rankers is a single hyperplane including a linear ranking model trained for a particular rank pair using a ranking support vector machine that ranks the documents in each subset, wherein the generating creates a base ranker for each rank pair; and
rank the document pairs within each of the at least two subsets using the base ranker associated with each rank pair; and
a rank aggregation module executable on the processor, configured to aggregate the multiple hyperplanes into an ensemble of base rankers comprising the base rankers for each rank pair, and further configured to produce, from the ensemble of base rankers, a ranking list that ranks the documents in the set of documents; and
wherein the ranking module is further executable on the processor to create base rankers only for adjacent rank pairs, each adjacent rank pair comprised of rankings of integer values that directly follow one another in sequence.
Dependent claims: 11, 12
Specification