DETERMINING RELEVANCE OF A DOCUMENT TO A QUERY BASED ON SPANS OF QUERY TERMS
First Claim
1. A method in a computer system with a processor and a memory for determining relevance of documents to a query, the method comprising:
receiving from a user a query having query terms;
for each of a plurality of documents, identifying spans of query terms within the document, each identified span including at least two query terms and having a span width indicating distance between the query terms of the span, wherein the distance between the query terms is based on number of terms between the query terms, and wherein a span has a span width that is less than a threshold span width;
for each identified span, calculating a span relevance based on the span width of the span;
for each query term, calculating a query term relevance for the query term based on the span relevance for each span that includes the query term; and
aggregating the calculated query term relevances for the query term into a document relevance indicating relevance of the document to the received query; and
displaying an indication of the documents in an order based on the document relevances of the documents to the received query.
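The steps recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not fix any scoring function, so the inverse-width span relevance `1 / (1 + width)`, the use of summation for both aggregation steps, the default threshold of 5, whitespace tokenization, and all function names (`find_spans`, `document_relevance`, `rank`) are hypothetical choices made for readability. The sketch also counts two occurrences of query terms as enough to form a span, which is one possible reading of "at least two query terms".

```python
from collections import defaultdict


def find_spans(doc_terms, query_terms, max_width):
    """Greedily group query-term positions into spans.

    A span's width is the number of terms strictly between its first and
    last query term; a span is kept only if that width is < max_width and
    it holds at least two query-term occurrences (assumed reading).
    """
    hits = [i for i, term in enumerate(doc_terms) if term in query_terms]
    spans, current = [], []
    for pos in hits:
        # Close the current span if adding pos would make it too wide.
        if current and pos - current[0] - 1 >= max_width:
            if len(current) >= 2:
                spans.append(current)
            current = []
        current.append(pos)
    if len(current) >= 2:
        spans.append(current)
    return spans


def document_relevance(doc_terms, query_terms, max_width=5):
    """Span relevance -> query term relevance -> document relevance."""
    term_relevance = defaultdict(float)
    for span in find_spans(doc_terms, query_terms, max_width):
        width = span[-1] - span[0] - 1
        span_relevance = 1.0 / (1.0 + width)  # assumed: narrower spans score higher
        # Each query term in the span receives the span's relevance.
        for term in {doc_terms[i] for i in span}:
            term_relevance[term] += span_relevance
    # Assumed aggregation: sum of the query term relevances.
    return sum(term_relevance.values())


def rank(documents, query, max_width=5):
    """Return the documents ordered by descending document relevance."""
    query_terms = set(query.lower().split())
    scored = [
        (document_relevance(doc.lower().split(), query_terms, max_width), doc)
        for doc in documents
    ]
    return [doc for _, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

For example, `rank(["blue sky", "red and shiny apple", "red apple pie"], "red apple")` places "red apple pie" first, because its two query terms are adjacent and so form the narrowest span.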
Abstract
A relevance system determines the relevance of a query term to a document based on spans within the document that contain the query term. The relevance system aggregates the relevance of the query terms into an overall relevance for the document. For each query term, the relevance system calculates a span relevance for each span that contains that query term. The relevance system then aggregates the span relevances for a query term into a query term relevance for that document. The relevance system may aggregate the query term relevances into a document relevance.
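The two-level aggregation described in the abstract can be written compactly. The notation is illustrative only: the symbols $S_d(t)$ (the spans of document $d$ that contain query term $t$), $f$, $g$, and $h$ are assumptions introduced here, and the abstract does not say which concrete functions are used.

```latex
\begin{align*}
r_{\text{span}}(s) &= f(s), \\
r_{\text{term}}(t, d) &= g\bigl(\{\, r_{\text{span}}(s) : s \in S_d(t) \,\}\bigr), \\
r_{\text{doc}}(d, q) &= h\bigl(\{\, r_{\text{term}}(t, d) : t \in q \,\}\bigr).
\end{align*}
```

Here $f$ scores a single span, $g$ aggregates the span relevances of one query term into a query term relevance, and $h$ aggregates the query term relevances into the document relevance for query $q$.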
20 Claims
10. A computer-readable storage medium containing instructions for controlling a computer system to identify spans of query terms of a query within a document of terms for use in determining relevance of the document to the query, by a method comprising:
identifying query terms within the document;
identifying sequences of query terms within the document wherein a sequence includes at least two query terms such that a distance between the query terms is less than a threshold distance, the distance between the query terms is based on number of terms between the query terms and wherein a sequence is a span of query terms; and
calculating relevance of the document to the query based on the identified sequences of query terms.
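The span identification recited in claim 10 can be sketched as a single pass over the positions of query terms. This is a hedged illustration, not the patent's implementation: the function name `identify_spans`, the `(start, end)` index-pair representation, and the choice to measure distance between consecutive query-term occurrences are assumptions. The final step of the claim, calculating relevance from the identified sequences, is left out of the sketch.

```python
def identify_spans(doc_terms, query_terms, threshold_distance):
    """Return spans as inclusive (start, end) index pairs.

    Consecutive query-term occurrences are chained into one sequence
    while the number of terms between them is below threshold_distance.
    Only sequences with at least two query-term occurrences are spans.
    """
    # Step 1: identify query terms within the document.
    positions = [i for i, term in enumerate(doc_terms) if term in query_terms]

    # Step 2: chain neighbouring occurrences into sequences.
    spans = []
    start = prev = None
    count = 0
    for pos in positions:
        if prev is not None and pos - prev - 1 < threshold_distance:
            count += 1  # close enough: extend the current sequence
        else:
            if count >= 2:
                spans.append((start, prev))
            start, count = pos, 1  # begin a new sequence at pos
        prev = pos
    if count >= 2:
        spans.append((start, prev))
    return spans
```

With the document "cheap flights to new york and cheap hotels in new york city" and the query terms "new", "york", and "hotels", a threshold of 2 yields two spans, while a threshold of 3 merges them into one because the two terms between "york" and "hotels" no longer break the sequence.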
15. A method in a computer system with a processor and a memory for calculating relevance of a document of terms to a query of query terms, the method comprising:
identifying spans of query terms within the document, a span comprising terms of the document that includes multiple query terms of the query and having a span width that is based on number of non-query terms of the span;
calculating a relevance contribution of each query term based on the identified spans that contain the query term;
determining relevance of the document to the query based on the calculated relevance contributions without using term frequency; and
outputting an indication of the determined relevance of the document to the query.
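Claim 15 differs from claim 1 in two respects: span width is based on the number of non-query terms in the span, and relevance is determined without term frequency. A minimal sketch of those two points follows. The scoring rule `1 / (1 + width)`, the choice of taking the best span score per query term, and all function names are assumptions for illustration; the claim itself only requires that term frequency not be used.

```python
def span_width(span_terms, query_terms):
    """Width of a span, counted as its number of non-query terms."""
    return sum(1 for term in span_terms if term not in query_terms)


def relevance_contributions(spans, query_terms):
    """Map each query term to its contribution from the spans containing it.

    Assumed rule: a term's contribution is the largest score among its
    spans, so repeating the same span does not raise the contribution
    and no term-frequency factor enters the calculation.
    """
    contributions = {}
    for span_terms in spans:
        score = 1.0 / (1.0 + span_width(span_terms, query_terms))
        for term in set(span_terms) & query_terms:
            contributions[term] = max(contributions.get(term, 0.0), score)
    return contributions


def document_relevance(spans, query_terms):
    """Assumed aggregation: sum of the per-term relevance contributions."""
    return sum(relevance_contributions(spans, query_terms).values())
```

Given the spans `["new", "york"]` and `["hotels", "in", "new", "york"]` for the query terms "new", "york", and "hotels", the second span has width 1 (only "in" is a non-query term), and listing either span twice leaves the document relevance unchanged.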
Specification