Buffering data in a hierarchical data storage environment
First Claim
1. A system for buffering first requested data in a hierarchical data storage system including memory, a primary storage device, and a secondary storage device, the secondary storage device including a first logical data unit and a second logical unit, the system comprising:
- a pool of data buffers allocated in memory including a first data buffer;
a pool of buffer headers allocated in memory including a first buffer header, each buffer header being associated with one of the data buffers and including a search key field;
a request processing module receiving a first no recall data request associated with a first file-based identifier to the first requested data;
a retrieval module retrieving the first requested data from the first logical data unit of the secondary storage device;
a selection module selecting the first buffer header being associated with the first data buffer based on a relative availability status of each data buffer and loading the first file-based identifier in the search key field of the first buffer header;
a buffer management module manipulating the buffer headers based on values in the search key fields;
a linked data structure organizing the buffer headers based on a value in the search key field of each buffer header, the linked data structure comprising hash queue structures organized by a hashing algorithm hashing on the value in the search key field of each buffer header, and a hash queue header linked to each hash queue structure including a lock field to coordinate access to buffer headers in each hash queue structure.
2 Assignments
0 Petitions
Accused Products
Abstract
A system, a method, and program products for buffering data from a file in a hierarchical data storage system allocates data buffers and buffer management structures'"'"' in memory to optimize performance of no recall requests. Buffer management structures, such as buffer headers and hash queue headers, are used to optimize performance of insert, search, and data buffer reuse operations. Buffer headers are managed in a least-recently-used queue in accordance with a relative availability status. Buffer headers are also organized in hash queue structures in accordance with file-based identifiers to facilitate searching for requested data in the buffers. Data buffers can be used to buffer different data blocks within the same file and can be recycled to buffer data from other data blocks and other files from the secondary storage device. Data in a data block may be reread by the requesting process or by other processes as long as the requested data remains valid. Lock fields are used to coordinate multi-thread and multi-user accesses.
123 Citations
22 Claims
-
1. A system for buffering first requested data in a hierarchical data storage system including memory, a primary storage device, and a secondary storage device, the secondary storage device including a first logical data unit and a second logical unit, the system comprising:
-
a pool of data buffers allocated in memory including a first data buffer;
a pool of buffer headers allocated in memory including a first buffer header, each buffer header being associated with one of the data buffers and including a search key field;
a request processing module receiving a first no recall data request associated with a first file-based identifier to the first requested data;
a retrieval module retrieving the first requested data from the first logical data unit of the secondary storage device;
a selection module selecting the first buffer header being associated with the first data buffer based on a relative availability status of each data buffer and loading the first file-based identifier in the search key field of the first buffer header;
a buffer management module manipulating the buffer headers based on values in the search key fields;
a linked data structure organizing the buffer headers based on a value in the search key field of each buffer header, the linked data structure comprising hash queue structures organized by a hashing algorithm hashing on the value in the search key field of each buffer header, and a hash queue header linked to each hash queue structure including a lock field to coordinate access to buffer headers in each hash queue structure. - View Dependent Claims (2, 3, 10, 11, 12, 15)
a least-recently-used queue organizing the buffer headers based on the relative availability status the data buffer associated with each buffer header.
-
-
3. The system of claim 1 wherein each buffer header includes a lock field coordinating access to the associated data buffer.
-
10. The method of claim 1 further comprising:
-
receiving a second no recall data request associated with the first file-based identifier to second requested data located in the first logical data unit;
locating the first data buffer based on the first file-based identifier; and
returning the second requested data from the first data buffer, responsive to the second no recall data request.
-
-
11. The method of claim 10 wherein the second requested data is located in the first file of the secondary storage device.
-
12. The method of claim 10 wherein the second requested data is located in a second file of The secondary storage device.
-
15. The method of claim 1 wherein the selecting operation comprises:
-
allocating buffer headers in the memory;
storing in each buffer header a pointer to one of the data buffers;
organizing the buffer headers in a least-recently-used queue, wherein the buffer headers are ordered according the relative availability status of each data buffer, and selecting the first data buffer positioned at an end of the least-recently-used queue.
-
-
4. A method of buffering first requested data from a first file in a hierarchical data storage system including memory, a primary storage device, and a secondary storage device, the secondary storage device including a first logical data unit having a logical unit size, the method comprising:
-
allocating data buffers in the memory including a first data buffer;
receiving a first no recall data request associated with a first file based identifier to the first requested data, if the first requested data is not stored on the primary storage device;
retrieving the first requested data from the first logical data unit of the secondary storage device, responsive to the fist no recall data request;
selecting the first data buffer based on a relative availability status of the first data buffer;
storing the first logical data unit in the first data buffer, associating the first data buffer with a first buffer header, storing the first file-based identifier into the first buffer header to associate the first data buffer with the first file-based identifier; and
inserting the fist buffer header into one of a plurality of hash queues based on the first file-based identifier to organize the first data buffer among the data buffers based on the first file-based identifier. - View Dependent Claims (5, 6, 7, 8, 9, 13, 14, 16)
sizing each data buffer to equal substantially the logical unit size of the first logical data unit of the secondary storage device; and
aligning each data buffer in memory to maximize a data transfer rate from the secondary storage device.
-
-
6. The method of claim 4 wherein the relative availability status of the first data buffer is “
- least-recently-used”
, prior to the storing operation.
- least-recently-used”
-
7. The method of claim 4 further comprising:
changing the relative availability status of the first data buffer to “
most-recently-used”
when the first logical data unit is stored therein.
-
8. The method of claim 4 wherein the operation of inserting the first buffer header comprises:
-
providing a hash table data structure associated with a plurality of hash queue headers, each of the hash queue headers configured to indicate a linked data structure being one of the hash queues;
hashing on the first file-based identifier to select one of the hash queue headers; and
inserting the first buffer header in the linked data structure indicated by the selected hash queue header, based on the first file-based identifier.
-
-
9. The method of claim 4 further comprising:
inserting the first buffer header in a least-recently-used list, responsive to the operation of storing the first logical data unit in the first data buffer.
-
13. The method of claim 4 further comprising:
-
receiving a second no recall data request associated with a second file-based identifier to second requested data, if the second requested data is not stored on the primary storage device;
retrieving the second requested data from a second logical data unit of the secondary storage device;
storing the second logical data unit in a second data buffer; and
organizing the second data buffer relative to the first data buffer in accordance with the second file-based identifier.
-
-
14. The method of claim 4 further comprising:
-
receiving a second no recall data request associated with a second file-based identifier to second requested data, if the second requested data is not stored on the primary storage device;
retrieving the second requested data from the second logical data unit of the secondary storage device, responsive to the second no recall data request;
re-selecting the first data buffer based on the relative availability status;
storing the second logical data unit in the first data buffer;
associating the first data buffer with the second file-based identifier; and
re-organizing the first data buffer among the data buffers based on the second file-based identifier.
-
-
16. The method of claim 4 further comprising:
-
determining whether the first requested data in the fist data buffer is stale; and
refreshing the first requested data stored in the first data buffer from the secondary storage device, if the first requested data in the first data buffer is stale.
-
-
17. A computer-readable medium having computer-executable instructions for performing a computer process comprising:
-
allocating data buffers in the memory including a first data buffer;
receiving a first no recall data request associated with a first file-based identifier to the first requested data, if the first requested data is not stored on the primary storage device;
retrieving the first requested data from the first logical data unit of the secondary storage device, responsive to the first no recall data request;
selecting the first data buffer based on a relative availability status of the first data buffer;
storing the first logical data unit in the first data buffer;
associating the first data buffer with a first buffer header;
storing the first file-based identifier into the first buffer header to associate the first data buffer with the first file-based identifier; and
inserting the first buffer header into one of a plurality of hash queues based on the first file-based identifier to organize the first data buffer among the data buffers based on the first file-based identifier. - View Dependent Claims (18, 19)
receiving a second no recall data request associated with a second file-based identifier to second requested data, if the second requested data is not stored on the primary storage device;
retrieving the second requested data from the second logical data unit of the secondary storage device, responsive to the second no recall data request;
re-selecting the first data buffer based on the relative availability status;
storing the second logical data unit in the first data buffer, associating the first data buffer with the second file-based identifier; and
re-organizing the first data buffer among the data buffers based on the second file-based identifier.
-
-
19. The computer-readable medium of claim 17 wherein the computer process further comprises:
-
determining whether the first requested data in the first data buffer is stale; and
refreshing the first requested data stored in the first data buffer from the secondary storage device, if the first requested data in the first data buffer is stale.
-
-
20. A computer program storage medium readable by a computing system and encoding a computer program for executing a computer process buffering requested data from a file in a hierarchical data storage system including memory, a primary storage device, and a secondary storage device, the secondary storage device including a logical data unit having a logical unit size, the computer program comprising instructions for:
-
retrieving the requested data from the logical data unit of the secondary storage device, responsive to a no recall data request associated with a file-based identifier to the requested data;
selecting the data buffer based on a relative availability status of the data buffer;
storing the logical data unit in a data buffer allocated in memory;
associating the data buffer with a buffer header;
storing the file-based identifier into the buffer header to associate the data buffer with the file-based identifier; and
inserting the buffer header into one of a plurality of hash queues based on the file-based identifier to organize the data buffer among a plurality of data buffers based on the file-based identifier.
-
-
21. A computer data signal embodied in a carrier wave by a computing system and encoding a computer program for executing a computer process for buffering requested data from a file in a hierarchical data storage system including memory, a primary storage device, and a secondary storage device, the secondary storage device including a logical data unit having a logical unit size, the computer program comprising instructions for:
-
retrieving the requested data from the logical data unit of the secondary storage device, responsive to a no recall data request associated with a file-based identifier to the requested data;
storing the requested data from the logical data unit in a selected data buffer allocated in memory;
associating the selected data buffer with a buffer header;
storing the file-based identifier into the buffer header to associate the selected data buffer with the file-based identifier;
inserting the buffer header into one of a plurality of hash queues based on the first file-based identifier to organize the selected data buffer among one or more data buffers based on the file-based identifier; and
providing the requested data to service the no recall request. - View Dependent Claims (22)
determining whether the requested data in the selected data buffer is stale;
servicing the requested data from the secondary storage device; and
invalidating the selected data buffer if the requested data in the selected data buffer is stale.
-
Specification