Determining relevance of a document to a query based on spans of query terms
First Claim
1. A method in a computer system with a processor and memory for determining relevance of a query to a document, the query having query terms, the method comprising:
receiving from a user the query;
for each of a plurality of documents, identifying spans of query terms within the document with no repeated occurrences of a query term, each span including multiple query terms and having a span width, wherein at least one span includes at least three query terms, wherein a distance between adjacent query terms within a span is less than a threshold distance, and wherein the distance between two query terms is based on the number of non-query terms between the two query terms;
for each identified span, calculating a span relevance based on the number of query terms in the span and the inverse of the width of the span;
for each query term, aggregating the calculated span relevances for each span that contains the query term into a query term relevance of the query term to the document, wherein the aggregating includes summing the calculated span relevance for each span that contains the query term; and
aggregating the query term relevances for the query terms into a document relevance indicating relevance of the document to the query; and
displaying an indication of the documents in an order based on the relevance of the documents to the received query.
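The steps of claim 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation. The function names (`find_spans`, `span_relevance`, `query_term_relevances`, `document_relevance`), the threshold `max_gap=3`, the greedy left-to-right segmentation rule, the formula `n / width`, and the final summation over query terms are all hypothetical choices. The claim itself only requires that span relevance be based on the number of query terms in the span and the inverse of the span width.

```python
from collections import defaultdict


def find_spans(tokens, query_terms, max_gap=3):
    """Split query-term hits into spans; each span is a list of (position, term).

    Assumed greedy rule: close the current span when the next hit repeats a
    term already in the span, or when the number of non-query tokens between
    adjacent hits is not below max_gap (a hypothetical threshold).
    """
    query = set(query_terms)
    spans, current = [], []
    for pos, tok in enumerate(tokens):
        if tok not in query:
            continue
        if current:
            gap = pos - current[-1][0] - 1  # non-query tokens in between
            repeated = any(term == tok for _, term in current)
            if repeated or gap >= max_gap:
                spans.append(current)
                current = []
        current.append((pos, tok))
    if current:
        spans.append(current)
    # The claim requires each span to include multiple query terms.
    return [s for s in spans if len(s) >= 2]


def span_relevance(span):
    """Assumed formula: number of query terms times the inverse of span width."""
    width = span[-1][0] - span[0][0] + 1  # tokens from first hit to last hit
    return len(span) / width


def query_term_relevances(spans):
    """Sum the span relevance over every span that contains each query term."""
    totals = defaultdict(float)
    for span in spans:
        score = span_relevance(span)
        for _, term in span:
            totals[term] += score
    return dict(totals)


def document_relevance(tokens, query_terms, max_gap=3):
    """Aggregate the query term relevances (here by summing, an assumption)."""
    spans = find_spans(tokens, query_terms, max_gap)
    return sum(query_term_relevances(spans).values())


docs = {
    "a": "cheap flights new york city and cheap hotels".split(),
    "b": "cheap hotels and museums but no flights to york".split(),
}
query = ["cheap", "flights", "york"]
ranked = sorted(docs, key=lambda d: document_relevance(docs[d], query), reverse=True)
print(ranked)  # ['a', 'b']
```

In document "a" the three query terms fall in one span of width 4, so each term receives 3/4 and the document scores 2.25. In document "b" only "flights" and "york" form a span (width 3), so the document scores lower and is listed second.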
Abstract
A relevance system determines the relevance of a query term to a document based on spans within the document that contain the query term. The relevance system aggregates the relevance of the query terms into an overall relevance for the document. For each query term, the relevance system calculates a span relevance for each span that contains that query term. The relevance system then aggregates the span relevances for a query term into a query term relevance for that document. The relevance system may aggregate the query term relevances into a document relevance.
13 Claims
7. A method in a computer system with a processor and memory for calculating relevance of a document to a query of query terms, the method comprising:
receiving from a user the query;
for each of a plurality of documents, identifying spans of query terms within the document, wherein a span includes no repeated occurrences of a query term, wherein at least one span has at least three query terms, wherein each span includes multiple query terms and has a span width, wherein a distance between adjacent query terms within a span is less than a threshold distance, and wherein the distance between two query terms is based on the number of non-query terms between the two query terms;
calculating a relevance contribution of each query term based on the identified spans that contain that query term, the relevance contributing being based on the number of query terms in the identified span and the inverse of the width of the identified span; and
determining relevance of the document to the query based on the calculated relevance contributions without using term frequency by replacing a term frequency with the query term relevance in an equation for calculating relevance of a document to a query; and
displaying an indication of the documents in an order based on the relevance of the documents to the received query.
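Claim 7 replaces term frequency with the query term relevance in a relevance equation, but the claim does not name that equation. A minimal sketch, assuming a BM25-style formula stands in for it: the function name `bm25_like_score`, its parameters, the constants `k1` and `b`, and the sample statistics are all hypothetical and chosen only for illustration.

```python
import math


def bm25_like_score(term_relevance, doc_freq, num_docs, doc_len, avg_doc_len,
                    k1=1.2, b=0.75):
    """Score one document with a BM25-style formula (an assumed choice).

    term_relevance maps each query term to its span-based query term
    relevance, which takes the place usually held by term frequency.
    """
    # Standard BM25 length normalisation term.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    score = 0.0
    for term, rc in term_relevance.items():
        df = doc_freq.get(term, 0)
        idf = math.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)
        # rc sits where tf would normally appear.
        score += idf * rc * (k1 + 1) / (rc + norm)
    return score


# Hypothetical collection statistics and per-document query term relevances.
doc_freq = {"cheap": 10, "flights": 5}
tight = bm25_like_score({"cheap": 0.75, "flights": 0.75}, doc_freq, 100, 8, 8)
loose = bm25_like_score({"cheap": 0.2, "flights": 0.2}, doc_freq, 100, 8, 8)
print(tight > loose)  # True
```

Because the substituted quantity grows with the number of query terms in a span and shrinks with span width, a document whose query terms sit close together scores higher than one with the same terms scattered, even if raw term counts are equal.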
11. A computer-readable storage medium containing instructions for controlling a computer system to identify spans of query terms of a query within a document for use in determining relevance of documents to the query, by a method comprising:
receiving from a user a query of query terms;
for each of a plurality of documents, identifying query terms within the document;
identifying sequences of query terms wherein a sequence includes no repeated occurrences of a query term, wherein at least one sequence includes at least three query terms, wherein a distance between adjacent query terms within a sequence is less than a threshold distance, and wherein the distance between two query terms is based on the number of non-query terms between the two query terms wherein a sequence is a span of query terms; and
for each of the identified query terms of the document, for each span that contains the query term, increasing relevance of the document to the received query by a span relevance of the query term that is based on the number of query terms within the span and an inverse of the span width, wherein the relevance of the document is based on replacing a term frequency with a query term relevance derived from the span relevance in an equation for calculating relevance of a document to a query; and
displaying an indication of documents in an order based on the relevance of the documents to the received query.
Specification