Field weighting in text searching
Abstract
A field-weighted search combines statistical information for each term across document fields in a suitably weighted fashion. Both field-specific term frequencies and field and document lengths are considered to obtain a field-weighted document weight for each query term. Each field-weighted document weight can then be combined in order to generate a field-weighted document score that is responsive to the overall query.
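As a rough illustration of the combination described in the abstract, the sketch below sums field-specific term frequencies and field lengths under per-field weights, turns each query term's weighted frequency into a term weight, and adds the term weights into a document score. The function names, the whitespace tokenizer, the BM25-style saturation formula, and the parameters `k1`, `b`, and `avg_len` are assumptions made for this example; the abstract does not name a specific weighting function.

```python
from collections import Counter


def field_weighted_stats(fields, weights):
    """Combine per-field term counts and lengths using field weights.

    `fields` maps a field name to its text; `weights` maps a field name
    to a numeric weight (hypothetical names, default weight 1.0).
    """
    weighted_tf = Counter()
    weighted_len = 0.0
    for name, text in fields.items():
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        w = weights.get(name, 1.0)
        for tok in tokens:
            weighted_tf[tok] += w      # field-specific tf scaled by the field weight
        weighted_len += w * len(tokens)  # field length scaled the same way
    return weighted_tf, weighted_len


def term_weight(tf, doc_len, avg_len, k1=1.2, b=0.75):
    """BM25-style saturation of a weighted term frequency.

    The formula and its parameters are assumptions for illustration;
    the inverse-document-frequency factor is omitted for brevity.
    """
    if tf == 0:
        return 0.0
    norm = k1 * ((1 - b) + b * doc_len / avg_len)
    return tf * (k1 + 1) / (tf + norm)


def field_weighted_score(query, fields, weights, avg_len):
    """Sum the field-weighted term weights over all query terms."""
    tf, doc_len = field_weighted_stats(fields, weights)
    return sum(term_weight(tf[t], doc_len, avg_len) for t in query.lower().split())


doc = {"title": "field weighting",
       "body": "search engines rank documents by relevance"}
boosted = field_weighted_score("weighting", doc, {"title": 3, "body": 1}, avg_len=10)
flat = field_weighted_score("weighting", doc, {"title": 1, "body": 1}, avg_len=10)
print(boosted > flat)
```

With these toy inputs, a term that appears only in the title contributes more when the title carries a higher weight, even though the weighted document length grows as well.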
21 Claims
1. A computer-implemented method of determining a field-weighted score for a document having multiple fields relative to a query having a plurality of query terms, the computer-implemented method comprising:
determining fields of the document, wherein each field includes a contextual section of the document based on the document structure;
determining a field weight for each of the determined fields, wherein the field weight corresponds to a number of times for replicating the content of each of the determined fields;
replicating the content of each of the determined fields the number of times indicated by the field weight for each of the determined fields, wherein the replicated content of each field is concatenated into a field set for each of the determined fields;
combining each concatenated field set for each field of the document to generate a virtual document including each concatenated field set for each field of the document;
indexing the virtual document to produce virtual document statistics; and
causing a processor of a computing device to compute the field-weighted score from the virtual document index based on the query.
8. A computer-readable storage medium having computer executable instructions for determining a field-weighted score for a document having multiple fields relative to a query having a plurality of query terms, the instructions comprising:
determining fields of the document, wherein each field includes a contextual section of the document based on the document structure;
determining a field weight for each of the determined fields, wherein the field weight corresponds to a number of times for replicating the content of each of the determined fields;
replicating the content of each of the determined fields the number of times indicated by the field weight for each of the determined fields, wherein the replicated content of each field is concatenated into a field set for each of the determined fields;
combining each concatenated field set for each field of the document to generate a virtual document including each concatenated field set for each field of the document;
indexing the virtual document to produce virtual document statistics; and
computing the field-weighted score from the virtual document index based on the query.
15. A system comprising:
a processor; and
a memory having computer-executable instructions stored thereon, wherein the computer-executable instructions are configured for:
determining fields of the document, wherein each field includes a contextual section of the document based on the document structure;
determining a field weight for each of the determined fields, wherein the field weight corresponds to a number of times for replicating the content of each of the determined fields;
replicating the content of each of the determined fields the number of times indicated by the field weight for each of the determined fields, wherein the replicated content of each field is concatenated into a field set for each of the determined fields;
combining each concatenated field set for each field of the document to generate a virtual document including each concatenated field set for each field of the document;
indexing the virtual document to produce virtual document statistics; and
computing the field-weighted score from the virtual document index based on the query.
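The steps recited in the independent claims (replicate each field by its weight, concatenate into field sets, combine into a virtual document, index, score) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the whitespace tokenizer, the dictionary-based "index", and the length-normalized scoring rule are all assumptions chosen to keep the example short.

```python
from collections import Counter


def build_virtual_document(fields, field_weights):
    """Replicate each field's content by its integer weight and concatenate.

    `fields` maps a field name to its text; `field_weights` maps a field
    name to the number of copies (hypothetical names, default 1 copy).
    """
    field_sets = []
    for name, content in fields.items():
        copies = field_weights.get(name, 1)
        # Field set: the field content repeated `copies` times.
        field_sets.append(" ".join([content] * copies))
    # Virtual document: all non-empty field sets joined together.
    return " ".join(fs for fs in field_sets if fs)


def index_virtual_document(virtual_doc):
    """Produce simple virtual document statistics: term counts and length."""
    tokens = virtual_doc.lower().split()  # naive tokenizer (assumption)
    return {"term_freq": Counter(tokens), "length": len(tokens)}


def field_weighted_score(index, query):
    """Toy scoring rule (assumption): length-normalized sum of term counts."""
    if index["length"] == 0:
        return 0.0
    hits = sum(index["term_freq"][t] for t in query.lower().split())
    return hits / index["length"]


fields = {"title": "field weighting", "body": "text search"}
weights = {"title": 2, "body": 1}

virtual = build_virtual_document(fields, weights)
index = index_virtual_document(virtual)
print(virtual)
print(field_weighted_score(index, "field search"))
```

Because replication happens before indexing, the scoring function itself needs no knowledge of fields: a term in a field with weight 2 simply appears twice in the virtual document statistics.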
Specification