Information retrieval system for archiving multiple document versions
Abstract
An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Index data for multiple versions or instances of documents is also maintained. Each document instance is associated with a date range and relevance data derived from the document for the date range.
20 Claims
1. A method, performed by at least one computer system, of retrieving documents in response to a search query that includes a phrase and a first date, the method comprising:
selecting, from a corpus of documents, a plurality of documents that are relevant to the query;
determining, using a processor of the at least one computer system, for at least one of the plurality of documents, a date range for the document, the date range representing a period for which no change in the document has been detected;
calculating, using a processor of the at least one computer system, a weighted relevance score for each of the plurality of documents, the weighted relevance score for a particular document being a relevance score for the particular document adjusted by a difference between the first date and the date range for the particular document when the particular document has a date range; and
ranking, using a processor of the at least one computer system, the plurality of documents according to their weighted relevance scores.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
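The steps of claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation: the claim does not say how the difference between a date and a date range is measured or how the score is adjusted by it. The sketch assumes the difference is the number of days from the query date to the nearest end of the range (zero when the date falls inside it) and that the adjustment divides the score by a linear penalty. The names Doc, days_outside, weighted_score, rank and decay_per_day are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Doc:
    doc_id: str
    relevance: float  # base relevance score for the query
    # (start, end) of the period with no detected change, if known
    date_range: Optional[Tuple[date, date]] = None


def days_outside(query_date: date, date_range: Tuple[date, date]) -> int:
    """Days from query_date to the nearest end of the range; 0 if inside it."""
    start, end = date_range
    if query_date < start:
        return (start - query_date).days
    if query_date > end:
        return (query_date - end).days
    return 0


def weighted_score(doc: Doc, query_date: date, decay_per_day: float = 0.01) -> float:
    """Adjust the base score by the date difference (assumed penalty form)."""
    if doc.date_range is None:
        # The claim only adjusts documents that have a date range.
        return doc.relevance
    gap = days_outside(query_date, doc.date_range)
    return doc.relevance / (1.0 + decay_per_day * gap)


def rank(docs: List[Doc], query_date: date) -> List[Doc]:
    """Order documents by weighted relevance score, highest first."""
    return sorted(docs, key=lambda d: weighted_score(d, query_date), reverse=True)
```

Under these assumptions, a document whose unchanged period contains the query date keeps its full score, while a more relevant document whose period lies far from that date can fall below it.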
10. A method, performed by at least one computer system, of retrieving documents in response to a search query that includes a phrase and a first date, the method comprising:
accessing an index for a corpus of documents, the corpus comprising documents having multiple instances, each instance of a particular document being acquired at a different date and a new instance being acquired when a change from a previous instance is detected, wherein each instance has an associated date range;
selecting, using the index, a plurality of documents that are relevant to the phrase;
for each document instance of the plurality of documents:
  determining, using a processor of the at least one computer system, a difference between the first date and the date range for the instance, and
  down-weighting a relevance score for the document instance in proportion to the difference; and
ranking the document instances according to their respective weighted relevance score.
Dependent claims: 11, 12, 13, 14, 15
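Claim 10 adds an index that holds several dated instances of each document. The following Python sketch shows one way such an index could be built and queried; it is an illustration under stated assumptions, not the method of the patent. It assumes that a change is detected by comparing raw content between crawls, that relevance is a toy phrase count, and that down-weighting reduces the score linearly with the gap in days. Instance, record_crawl, date_gap, rank_instances and penalty_per_day are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass
class Instance:
    url: str
    content: str
    start: date  # first crawl at which this content was seen
    end: date    # latest crawl at which it was still unchanged


Index = Dict[str, List[Instance]]


def record_crawl(index: Index, url: str, content: str, crawl_date: date) -> None:
    """Extend the current instance's date range, or add a new instance on change."""
    instances = index.setdefault(url, [])
    if instances and instances[-1].content == content:
        instances[-1].end = crawl_date  # no change detected
    else:
        instances.append(Instance(url, content, crawl_date, crawl_date))


def date_gap(query_date: date, inst: Instance) -> int:
    """Days between the query date and the instance's date range (0 if inside)."""
    if query_date < inst.start:
        return (inst.start - query_date).days
    if query_date > inst.end:
        return (query_date - inst.end).days
    return 0


def rank_instances(index: Index, phrase: str, query_date: date,
                   penalty_per_day: float = 0.001) -> List[Tuple[float, Instance]]:
    """Score every matching instance, down-weight by date gap, sort descending."""
    scored = []
    for instances in index.values():
        for inst in instances:
            base = inst.content.lower().count(phrase.lower())  # toy relevance
            if base == 0:
                continue  # instance is not relevant to the phrase
            factor = max(0.0, 1.0 - penalty_per_day * date_gap(query_date, inst))
            scored.append((base * factor, inst))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

Because each instance is scored separately, an older version of a page can outrank its newer version when the query date falls inside the older version's date range.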
16. A computer system comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the computer system to perform operations comprising:
  selecting, from a corpus of documents, a plurality of documents that are relevant to a search query received by the computer system, the query including a first date,
  determining for at least one of the selected documents a date range for the document, the date range representing a period for which no change in the document has been detected,
  calculating a respective weighted relevance score for each of the plurality of documents, the weighted relevance score for the at least one selected document being based on the relevance score for the at least one selected document and a difference between the first date and the date range for the at least one selected document, and
  ranking the plurality of documents according to their respective weighted relevance scores.
Dependent claims: 17, 18, 19, 20
Specification