Information retrieval system for archiving multiple document versions
Abstract
An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Index data for multiple versions or instances of documents is also maintained. Each document instance is associated with a date range and relevance data derived from the document for the date range.
23 Claims
1. A method, performed by at least one computer system, of retrieving documents in response to a search query submitted on a first date that includes one or more phrases, the method comprising:
selecting from a corpus of documents a plurality of documents that include at least one of the one or more phrases and that are relevant to the query based on a relevance score for the selected documents;
determining, using a processor of the at least one computer system, for at least one of the selected documents, a date range for the document, the date range representing a period for which no change in the document has been detected;
calculating, using a processor of the at least one computer system, a weighted relevance score for each of the selected documents, the weighted relevance score for a particular document being based on the relevance score for the particular document and a difference between the first date and the date range for the particular document; and
ranking, using a processor of the at least one computer system, the selected documents according to their weighted relevance scores.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
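By way of illustration only, the weighting and ranking steps of claim 1 might be sketched as follows. The claim does not say how the difference between the query date and the date range is measured or how it is combined with the relevance score. The distance-to-nearest-edge measure, the exponential decay, and every name below (`ScoredDocument`, `days_outside_range`, `weighted_relevance`, `half_life_days`) are assumptions made for this sketch, not features recited in the claim.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ScoredDocument:
    doc_id: str
    relevance: float  # base relevance score for the query
    valid_from: date  # start of the period with no detected change
    valid_to: date    # end of that period


def days_outside_range(query_date: date, doc: ScoredDocument) -> int:
    """Days from the query date to the nearest edge of the date range.

    Assumption: a query date inside the range counts as a difference of 0.
    """
    if query_date < doc.valid_from:
        return (doc.valid_from - query_date).days
    if query_date > doc.valid_to:
        return (query_date - doc.valid_to).days
    return 0


def weighted_relevance(query_date: date, doc: ScoredDocument,
                       half_life_days: float = 365.0) -> float:
    """Assumed weighting: halve the score per half_life_days of difference."""
    gap = days_outside_range(query_date, doc)
    return doc.relevance * 0.5 ** (gap / half_life_days)


def rank_by_weighted_relevance(query_date: date, docs: list) -> list:
    """Order documents by weighted relevance score, highest first."""
    return sorted(docs, key=lambda d: weighted_relevance(query_date, d),
                  reverse=True)
```

Under these assumptions, a document whose unchanged period ended one half-life before the query date keeps half of its base score, while a document whose date range contains the query date keeps its full score.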

9. A method, performed by at least one computer system, of retrieving documents in response to a search query that includes one or more phrases, the method comprising:
selecting from a corpus of documents a plurality of documents that include at least one of the one or more phrases and that are relevant to the query based on relevance scores for the selected documents;
determining, using a processor of the at least one computer system, for each selected document a date range with which the document is associated, the date range representing a period for which no change in the document has been detected;
grouping the documents into a plurality of groups using the date range associated with each document; and
within each group, ranking the documents according to relevance scores determined based on relevance data valid for the date range associated with each document.
Dependent claims: 10
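The grouping and in-group ranking steps of claim 9 might be sketched as below. The claim leaves the grouping criterion open. Bucketing by the calendar year in which each date range starts is an assumption chosen only to keep the example small, and `Hit` and `group_and_rank` are hypothetical names.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple


class Hit(NamedTuple):
    doc_id: str
    relevance: float  # score from relevance data valid for the date range
    valid_from: date  # start of the period with no detected change
    valid_to: date    # end of that period


def group_and_rank(hits):
    """Group hits by date range, then rank each group by relevance score.

    Assumption: the group key is the year in which the date range starts.
    """
    groups = defaultdict(list)
    for hit in hits:
        groups[hit.valid_from.year].append(hit)
    # Groups are returned in chronological order, best hit first in each.
    return {
        year: sorted(members, key=lambda h: h.relevance, reverse=True)
        for year, members in sorted(groups.items())
    }
```

Any other partition of the date ranges (for example, by month or by overlap with a user-selected interval) could be substituted for the year key without changing the ranking step.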

11. A method, performed by at least one computer system, of retrieving documents in response to a search query that includes one or more phrases, the method comprising:
accessing an index for a corpus of documents, the corpus comprising documents having multiple instances, each instance of a particular document being acquired at a different date and a new instance being acquired when a change from a previous instance is detected, wherein each instance has an associated date range;
selecting from the index a plurality of documents that are relevant to the search query;
determining, using a processor of the at least one computer system, for each instance of a selected document, a relevance score based on relevance data valid for the instance of the selected document and its associated date range;
selecting, for each document, the document instance with the highest relevance score; and
ranking the selected document instances according to their relevance scores.
Dependent claims: 12, 13, 14, 15, 16
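The last two steps of claim 11 (keeping the highest-scoring instance of each document, then ranking those instances) might be sketched as follows. The sketch assumes that a relevance score has already been computed for every instance. `Instance` and `best_instances` are hypothetical names, and keeping the first instance seen when two scores tie is an arbitrary choice that the claim does not address.

```python
from datetime import date
from typing import NamedTuple


class Instance(NamedTuple):
    doc_id: str
    valid_from: date  # start of the date range for this instance
    valid_to: date    # end of the date range for this instance
    relevance: float  # score from relevance data valid for this instance


def best_instances(instances):
    """Keep the top-scoring instance per document, then rank those instances."""
    best = {}
    for inst in instances:
        current = best.get(inst.doc_id)
        # Strict comparison: on a tie, the instance seen first is kept.
        if current is None or inst.relevance > current.relevance:
            best[inst.doc_id] = inst
    return sorted(best.values(), key=lambda i: i.relevance, reverse=True)
```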

17. A computer system comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the computer system to perform operations comprising:
selecting from a corpus of documents a plurality of documents that include at least one phrase of a search query received by the computer system and that are relevant to the search query, the selection being based on a relevance score for the selected documents;
determining for at least one of the selected documents a date range for the document, the date range representing a period for which no change in the document has been detected;
calculating a weighted relevance score for each of the selected documents, the weighted relevance score for a particular document being based on the relevance score for the particular document and a difference between a date the query was submitted and the date range for the particular document; and
ranking the selected documents according to their weighted relevance scores.
Dependent claims: 18, 19, 20

21. A computer system comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the computer system to perform operations comprising:
selecting from a corpus of documents a plurality of documents that include at least one phrase of a search query and that are relevant to the query based on relevance scores for the selected documents;
determining for each selected document a date range with which the document is associated, the date range representing a period for which no change in the document has been detected;
grouping the documents into a plurality of groups using the date range associated with each document; and
within each group, ranking the documents according to relevance scores determined based on relevance data valid for the date range associated with each document.

22. A computer system comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the computer system to perform operations comprising:
accessing an index for a corpus of documents, the corpus comprising documents having multiple instances, each instance of a particular document being acquired at a different date and a new instance being acquired when a change from a previous instance is detected, wherein each instance has an associated date range;
selecting from the index a plurality of documents that are relevant to a search query;
determining for each instance of a selected document, a relevance score based on relevance data valid for the instance of the selected document and its associated date range;
selecting for each document the document instance with the highest relevance score; and
ranking the selected document instances according to their relevance scores.

23. A computer-readable storage device having embodied thereon instructions that, when executed by one or more processors, cause at least one computer system to perform operations comprising:
selecting, from a corpus of documents, a plurality of documents that are responsive to a search query received by the computer system, the selection being based on a relevance score for the selected documents;
determining for at least one of the selected documents a date range for the document, the date range representing a period for which no change in the document has been detected;
calculating a weighted relevance score for each of the selected documents, the weighted relevance score for a particular document being based on the relevance score for the particular document and a difference between a date the query was submitted and the date range for the particular document; and
ranking the selected documents according to their weighted relevance scores.

Specification