MATCH FIX-UP TO REMOVE MATCHING DOCUMENTS
First Claim
1. A computer-implemented method, carried out by at least one server having one or more processors, the method comprising:
receiving a plurality of documents found to be relevant to at least a portion of a search query, wherein the plurality of documents includes one or more invalid matching documents;
accessing a representation for each document of the plurality of documents, wherein the representation includes each term present within each document;
comparing the terms present within each document to one or more terms associated with the search query;
determining that the one or more invalid matching documents do not include the one or more terms associated with the search query; and
upon determining that the one or more invalid matching documents do not include the one or more terms associated with the search query, removing the one or more invalid matching documents from the plurality of documents found to be relevant to the at least a portion of the search query.
Abstract
The technology described herein provides for a match fix-up stage that removes matching documents identified for a search query that don't actually contain terms from the search query. A representation of each document (e.g., a forward index storing a list of terms for each document) is used to identify valid matching documents (i.e., documents containing terms from the search query) and invalid matching documents (i.e., documents that don't contain terms from the search query). Any invalid matching documents are removed from further processing and ranking for the search query.
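The match fix-up stage described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function name `match_fix_up`, the dictionary-of-sets forward index, and the sample documents are all hypothetical, and the sketch assumes a document counts as valid when it contains at least one query term.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical forward index: document id -> set of terms present in that document.
ForwardIndex = Dict[str, Set[str]]


def match_fix_up(
    candidates: List[str],
    forward_index: ForwardIndex,
    query_terms: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split candidate documents into (valid, invalid) matches.

    Assumption: a document is valid if it shares at least one term with the
    query. A document missing from the forward index is treated as invalid.
    """
    valid: List[str] = []
    invalid: List[str] = []
    for doc_id in candidates:
        doc_terms = forward_index.get(doc_id, set())
        if doc_terms & query_terms:  # non-empty intersection
            valid.append(doc_id)
        else:
            invalid.append(doc_id)
    return valid, invalid


# Illustrative data only.
forward_index = {
    "doc1": {"cheap", "flights", "paris"},
    "doc2": {"hotel", "reviews"},
    "doc3": {"paris", "museums"},
}
candidates = ["doc1", "doc2", "doc3"]
valid, invalid = match_fix_up(candidates, forward_index, {"paris", "flights"})
print(valid)    # ['doc1', 'doc3']
print(invalid)  # ['doc2']
```

Only the `valid` list would be passed on to further processing and ranking; a stricter variant could require every query term to be present, which the source text does not specify.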
20 Claims
1. A computer-implemented method, carried out by at least one server having one or more processors, the method comprising:
receiving a plurality of documents found to be relevant to at least a portion of a search query, wherein the plurality of documents includes one or more invalid matching documents;
accessing a representation for each document of the plurality of documents, wherein the representation includes each term present within each document;
comparing the terms present within each document to one or more terms associated with the search query;
determining that the one or more invalid matching documents do not include the one or more terms associated with the search query; and
upon determining that the one or more invalid matching documents do not include the one or more terms associated with the search query, removing the one or more invalid matching documents from the plurality of documents found to be relevant to the at least a portion of the search query.
Dependent claims (2, 3, 4, 5, 6, 7, 8)
9. One or more computer storage media storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform a method, the method comprising:
receiving a first plurality of documents found to be relevant to at least a portion of a search query, wherein the first plurality of documents includes one or more invalid matching documents;
receiving a forward index for each document of the first plurality of documents, wherein the forward index includes one or more terms included in each document;
using the forward index for each document of the first plurality of documents, identifying one or more valid matching documents that include one or more terms associated with the search query;
using the forward index for each document of the first plurality of documents, identifying one or more invalid matching documents that do not include the one or more terms associated with the search query;
removing the one or more invalid matching documents from the first plurality of documents to create a filtered set of one or more documents found to be relevant to the at least a portion of the search query; and
communicating the filtered set of one or more documents found to be relevant to the at least a portion of the search query for ranking each document of the filtered set of one or more documents for the search query.
Dependent claims (10, 11, 12, 13, 14)
15. A computerized system embodied on one or more computer storage media having computer-executable instructions provided thereon, the system comprising:
a preliminary ranker component to rank a first set of documents that are found to be relevant to at least a portion of a search query by a matcher component, wherein the initial set of documents includes one or more invalid matching documents;
a match fix-up component to identify when the first set of documents includes one or more invalid matching documents utilizing a forward index for each document of the initial set of documents; and
a subsequent ranker to rank a second set of documents received from the match fix-up component, wherein the second set of documents includes fewer invalid matching documents than the first set of documents.
Dependent claims (16, 17, 18, 19, 20)
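The component arrangement recited in claim 15 (preliminary ranker, then match fix-up, then subsequent ranker) can be sketched as a three-stage pipeline. Everything here is an illustrative assumption rather than the claimed system: the function names, the precomputed `cheap_scores` used by the preliminary ranker, and the term-overlap count used by the subsequent ranker are hypothetical stand-ins for scoring methods the claim does not specify.

```python
from typing import Dict, List, Set

# Hypothetical forward index: document id -> set of terms in that document.
ForwardIndex = Dict[str, Set[str]]


def preliminary_rank(doc_ids: List[str], cheap_scores: Dict[str, float]) -> List[str]:
    """Order matcher output by an inexpensive, precomputed score (assumed)."""
    return sorted(doc_ids, key=lambda d: -cheap_scores.get(d, 0.0))


def fix_up(doc_ids: List[str], forward_index: ForwardIndex, query_terms: Set[str]) -> List[str]:
    """Drop documents whose forward index shares no term with the query."""
    return [d for d in doc_ids if forward_index.get(d, set()) & query_terms]


def subsequent_rank(doc_ids: List[str], forward_index: ForwardIndex, query_terms: Set[str]) -> List[str]:
    """Re-rank by number of distinct query terms present (assumed scoring).

    sorted() is stable, so ties keep their preliminary order.
    """
    return sorted(doc_ids, key=lambda d: -len(forward_index[d] & query_terms))


def run_pipeline(
    matched: List[str],
    cheap_scores: Dict[str, float],
    forward_index: ForwardIndex,
    query_terms: Set[str],
) -> List[str]:
    first_set = preliminary_rank(matched, cheap_scores)
    second_set = fix_up(first_set, forward_index, query_terms)
    return subsequent_rank(second_set, forward_index, query_terms)


# Illustrative data only: "b" was matched but contains no query term.
forward_index = {
    "a": {"match", "fix"},
    "b": {"unrelated"},
    "c": {"match"},
    "d": {"match", "fix", "up"},
}
cheap_scores = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.7}
result = run_pipeline(["a", "b", "c", "d"], cheap_scores, forward_index, {"match", "fix"})
print(result)  # ['d', 'a', 'c']
```

In this sketch the invalid document "b" scores highest in the preliminary stage yet never reaches the subsequent ranker, so the more expensive stage only sees documents that actually contain query terms.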
Specification