SYSTEM AND METHOD OF DATA CACHING FOR COMPLIANCE STORAGE SYSTEMS WITH KEYWORD QUERY BASED ACCESS
First Claim
1. A method of data caching for compliance and storage systems that provide keyword search query based access to documents, the method comprising:
computing a value for each data document based on a document information-retrieval (IR) relevancy metric for user keyword queries and a recency, frequency of each query;
adapting said values to changing query frequencies and popularities; and
selecting and evicting documents from a cache based on said values according to a knapsack solution.
Abstract
A method of data caching for compliance and storage systems that provide keyword search query based access to documents computes a value for each data document based on a document information-retrieval relevancy metric for user keyword queries and on the recency and frequency of each query. The values are adapted to changing query frequencies and popularities. Documents can then be selected for, and evicted from, a cache based on the values according to a knapsack solution. A weight is computed for each query such that recent, more frequent queries get a higher weight. An information-retrieval metric is used for measuring the relevancy of a document for a query. The value of a document is the weighted sum, over all queries, of the information-retrieval metric times the query weight.
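The valuation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the exponential decay in `query_weights`, the term-overlap score in `relevancy`, and all function and parameter names are assumptions, since the abstract names neither a specific weighting function nor a specific IR metric.

```python
from collections import Counter

def query_weights(history, decay=0.5):
    """history: list of query strings, oldest first.

    Assumed weighting: each occurrence contributes decay**age, so a query
    that is both recent and frequent accumulates a higher weight.
    """
    weights = Counter()
    n = len(history)
    for pos, query in enumerate(history):
        age = n - 1 - pos          # 0 for the most recent query
        weights[query] += decay ** age
    return dict(weights)

def relevancy(doc_text, query):
    """Placeholder IR metric: fraction of query terms found in the document."""
    terms = query.lower().split()
    words = set(doc_text.lower().split())
    if not terms:
        return 0.0
    return sum(t in words for t in terms) / len(terms)

def document_value(doc_text, weights):
    """Weighted sum of relevancy times query weight over all queries."""
    return sum(w * relevancy(doc_text, q) for q, w in weights.items())

history = ["audit report", "tax filing", "audit report"]
weights = query_weights(history)
value_a = document_value("quarterly audit report", weights)
value_b = document_value("tax memo", weights)
```

Because the values are recomputed from the current history, they shift as query frequencies and popularities change, which is one simple way to read the adaptation step.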
28 Citations
20 Claims
1. A method of data caching for compliance and storage systems that provide keyword search query based access to documents, the method comprising:
computing a value for each data document based on a document information-retrieval (IR) relevancy metric for user keyword queries and a recency, frequency of each query;
adapting said values to changing query frequencies and popularities; and
selecting and evicting documents from a cache based on said values according to a knapsack solution.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
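The selection and eviction step of claim 1 refers to "a knapsack solution" without naming a particular one. As a hedged sketch, the code below uses a standard 0/1 knapsack dynamic program over integer document sizes; the function name `select_for_cache`, the `(value, size)` tuple layout, and the sample numbers are illustrative assumptions.

```python
def select_for_cache(docs, capacity):
    """docs: {doc_id: (value, size)} with non-negative integer sizes.
    capacity: integer cache capacity in the same units as size.

    Returns the set of document ids to keep; all other cached documents
    are eviction candidates.
    """
    # best[c] = (total value, chosen ids) achievable within capacity c
    best = [(0.0, frozenset())] * (capacity + 1)
    for doc_id, (value, size) in docs.items():
        # Walk capacities downward so each document is chosen at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + value
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] | {doc_id})
    return set(best[capacity][1])

docs = {"a": (6.0, 4), "b": (5.0, 3), "c": (4.0, 2)}
keep = select_for_cache(docs, capacity=5)
evict = set(docs) - keep
```

With these sample numbers, keeping "b" and "c" (total value 9.0, total size 5) beats keeping "a" alone (value 6.0), so "a" is the eviction candidate. A greedy value-per-size heuristic would be a cheaper alternative when the cache holds many documents.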
9. A document search system, comprising:
a keyword based interface to search documents from a storage device;
a cache for staging documents that are read and that are expected to be needed again from said storage device;
a query history first-in first-out (FIFO) queue for maintaining a query history of recent queries from a user, wherein, each query is assigned a weight based on its position in the queue;
a device for computing a value for each data document based on a document information-retrieval (IR) relevancy metric for user keyword queries and a recency, frequency of each query;
a mechanism for adapting said values to changing query frequencies and popularities; and
a mechanism selecting and evicting documents from the cache based on said values according to a knapsack solution.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
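The query history FIFO queue of claim 9 assigns each query a weight based on its position in the queue. One way to picture this, assuming a linear position weight normalized to sum to 1 (the claim does not state the weighting function), is the sketch below. The class name `QueryHistory` and its methods are hypothetical.

```python
from collections import deque

class QueryHistory:
    """Bounded FIFO of recent queries with position-based weights."""

    def __init__(self, maxlen=4):
        # A deque with maxlen drops the oldest entry once it is full.
        self.queue = deque(maxlen=maxlen)

    def record(self, query):
        self.queue.append(query)

    def weights(self):
        """Assumed scheme: the oldest entry has position 1, the newest has
        position len(queue); weights are normalized to sum to 1, and a
        query that appears several times accumulates its positions."""
        n = len(self.queue)
        total = n * (n + 1) / 2
        out = {}
        for pos, query in enumerate(self.queue, start=1):
            out[query] = out.get(query, 0.0) + pos / total
        return out

history = QueryHistory(maxlen=3)
for q in ["a", "b", "a", "c"]:
    history.record(q)
w = history.weights()
```

After the fourth query the first "a" has fallen out of the queue, so the remaining entries are "b", "a", "c" in that order and the newest query carries the largest weight.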
17. A method for servicing a new query from a user, comprising:
updating a query history FIFO queue, query weights, and document values;
checking to see if a document that satisfies a query is already in a cache;
if yes, a document is returned to said user;
otherwise, a new document is fetched from storage to be placed in said cache and provided to said user.
Dependent claims: 18, 19, 20
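The steps of claim 17 can be sketched as a single function. This is only an illustration under stated assumptions: `matches` stands in for whatever test decides that a document satisfies a query, the position-based weights are an assumed scheme, the `cache`, `storage`, and `values` dictionaries are hypothetical in-memory stand-ins, and capacity-driven eviction is left out.

```python
from collections import deque

def matches(doc_text, query):
    """Assumed test: every query term appears in the document."""
    words = set(doc_text.lower().split())
    return all(term in words for term in query.lower().split())

def service_query(query, cache, storage, history, values):
    # Update the FIFO query history.
    history.append(query)
    # Update query weights: 1-based queue position, repeats accumulate.
    weights = {}
    for pos, q in enumerate(history, start=1):
        weights[q] = weights.get(q, 0) + pos

    def refresh(doc_id, text):
        # Document value: sum of weights of the queries it satisfies.
        values[doc_id] = sum(w for q, w in weights.items() if matches(text, q))

    # Update values of the documents currently cached.
    for doc_id, text in cache.items():
        refresh(doc_id, text)
    # Check whether a satisfying document is already in the cache.
    for doc_id, text in cache.items():
        if matches(text, query):
            return doc_id, text, "hit"
    # Otherwise fetch from storage, place it in the cache, and return it.
    for doc_id, text in storage.items():
        if matches(text, query):
            cache[doc_id] = text
            refresh(doc_id, text)
            return doc_id, text, "miss"
    return None, None, "not found"

storage = {"d1": "audit report 2007", "d2": "tax filing"}
cache, values = {}, {}
history = deque(maxlen=3)
first = service_query("audit report", cache, storage, history, values)
second = service_query("audit report", cache, storage, history, values)
```

The first call misses and stages "d1" in the cache; the second call finds it there and returns it without touching storage.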
Specification