Mechanism for determining read-ahead length in a storage system
First Claim
Patent Images
1. A method for managing an amount of read-ahead data, comprising:
- monitoring client read operations, a client read operation requesting data blocks representing a read stream;
populating a data structure with read stream sizes and read stream counts based upon the client read operations;
evaluating the data structure to select a first read stream having a first read stream size;
evaluating the data structure to identify a set of read streams, where read streams within the set of read streams have read stream sizes equal to or greater than the first read stream size;
generating a stream metric corresponding to the read stream sizes and a count of the read streams within the set of read streams;
determining an expected size of a target read stream based upon the stream metric, the determining comprising;
determining a probability that the first read stream corresponds to the expected size; and
adjusting the expected size, comprising;
responsive to the probability exceeding a first threshold, decreasing the expected size; and
responsive to the probability exceeding a second threshold, increasing the expected size; and
reading ahead a number of data blocks corresponding to a difference between the first read stream size and the expected size after the adjusting.
1 Assignment
0 Petitions
Accused Products
Abstract
A storage system tracks statistical behavior of client read requests directed to a storage device to form prediction about data that the client will require next. The storage system collects the size of read sequences for various streams into a data structure, which summarizes past behavior of read requests. This data structure reports the number of streams in each equivalence class of stream sizes that is tracked. The data structure is then used to determine expected size of a selected read stream. The data structure is also used to improve predictions about an expected size computed by a known technique.
-
Citations
18 Claims
-
1. A method for managing an amount of read-ahead data, comprising:
-
monitoring client read operations, a client read operation requesting data blocks representing a read stream; populating a data structure with read stream sizes and read stream counts based upon the client read operations; evaluating the data structure to select a first read stream having a first read stream size; evaluating the data structure to identify a set of read streams, where read streams within the set of read streams have read stream sizes equal to or greater than the first read stream size; generating a stream metric corresponding to the read stream sizes and a count of the read streams within the set of read streams; determining an expected size of a target read stream based upon the stream metric, the determining comprising; determining a probability that the first read stream corresponds to the expected size; and adjusting the expected size, comprising; responsive to the probability exceeding a first threshold, decreasing the expected size; and responsive to the probability exceeding a second threshold, increasing the expected size; and reading ahead a number of data blocks corresponding to a difference between the first read stream size and the expected size after the adjusting. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. A system for managing an amount of read-ahead data, comprising:
-
one or more processors; and memory comprising instructions that when executed by at least one of the one or more processors implement at least some of; a read streams monitoring engine configured to; monitor client read operations, a client read operation requesting data blocks representing a read stream; and populate a data structure with read stream sizes and read stream counts based upon the client read operations; a read-ahead length computing engine configured to; evaluate the data structure to select a first read stream having a first read stream size; evaluate the data structure to identify a set of read streams, where read streams within the set of read streams have read stream sizes equal to or greater than the first read stream size; generate a stream metric corresponding to the read stream sizes and a count of the read streams within the set of read streams; and determine an expected size of a target read stream based upon the stream metric, the determining comprising; determining a probability that the first read stream corresponds to the expected size; and adjusting the expected size, comprising; responsive to the probability exceeding a first threshold, decreasing the expected size; and responsive to the probability exceeding a second threshold, increasing the expected size; and a read-ahead engine configured to; read ahead a number of data blocks corresponding to a difference between the first read stream size and the expected size after the adjusting. - View Dependent Claims (11, 12)
-
-
13. A computer-readable medium comprising instructions which when executed at least in part via a processor perform a method for managing an amount of read-ahead data, comprising:
-
monitoring client read operations, a client read operation requesting data blocks representing a read stream; populating a data structure with read stream sizes and read stream counts based upon the client read operations; evaluating the data structure to select a first read stream having a first read stream size; evaluating the data structure to identify a set of read streams, where read streams within the set of read streams have read stream sizes equal to or greater than the first read stream size; generating a stream metric corresponding to the read stream sizes and a count of the read streams within the set of read streams; determining an expected size of a target read stream based upon the stream metric, the determining comprising; determining a probability that the first read stream corresponds to the expected size; and adjusting the expected size, comprising; responsive to the probability exceeding a first threshold, decreasing the expected size; and responsive to the probability exceeding a second threshold, increasing the expected size; and reading ahead a number of data blocks corresponding to a difference between the first read stream size and the expected size after the adjusting. - View Dependent Claims (14, 15, 16, 17)
-
-
18. A method for managing an amount of read-ahead data, comprising:
-
monitoring client read operations, a client read operation requesting data blocks representing a read stream; populating a data structure with read stream sizes and read stream counts based upon the client read operations; evaluating the data structure to select a first read stream having a first read stream size; evaluating the data structure to identify a set of read streams, where read streams within the set of read streams have read stream sizes equal to or greater than the first read stream size; generating a stream metric corresponding to the read stream sizes and a count of the read streams within the set of read streams; determining an expected size of a target read stream based upon the stream metric, the determining comprising; determining a probability that the first read stream corresponds to the expected size; and adjusting the expected size, comprising; responsive to the probability exceeding a first threshold indicative of the first read stream size being less than the expected size, decreasing the expected size; and responsive to the probability exceeding a second threshold indicative of the first read stream size being greater than the expected size, increasing the expected size; and reading ahead a number of data blocks corresponding to a difference between the first read stream size and the expected size after the adjusting.
-
Specification