Partial recall of deduplicated files
First Claim
1. In a computing environment, a method performed at least in part on at least one processor, comprising, changing a state of a file of a storage volume from a fully deduplicated state to a partially deduplicated state, including reading from a chunk store one or more chunks of the file's data, committing at least part of the one or more chunks to a stable storage as one or more recalled ranges of data of that file, and maintaining information in association with the file that tracks which range or ranges of data have been recalled, and which range or ranges of data reside in the chunk store.
Abstract
The subject disclosure is directed towards changing a file from a fully deduplicated state to a partially deduplicated state in which some of the file data is deduplicated in a chunk store, and some is recalled into the file, that is, in the file's storage volume. A partial recall mechanism such as in a file system filter tracks (e.g., via a bitmap in a file reparse point) whether file data is maintained in the chunk store or has been recalled to the file. Data is recalled from the chunk store as needed, and committed (e.g., flushed) to the file. Also described is efficiently returning the file to a fully deduplicated state by using the tracking information to determine which parts of the file are already deduplicated into the chunk store so as to avoid their further deduplication processing.
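The tracking idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `RecallBitmap`, the fixed 64 KiB tracking granularity, and the in-memory list standing in for a bitmap stored in a reparse point are all assumptions made for illustration. The sketch shows how one bit per range is enough to tell the file's state and to list only the recalled ranges, which are the only parts that would need deduplication processing again.

```python
RANGE_SIZE = 64 * 1024  # hypothetical tracking granularity, in bytes


class RecallBitmap:
    """One flag per fixed-size range: True means the range was recalled to the file."""

    def __init__(self, file_size):
        self.file_size = file_size
        count = (file_size + RANGE_SIZE - 1) // RANGE_SIZE
        self.bits = [False] * count  # all data starts in the chunk store

    def mark_recalled(self, offset, length):
        """Record that [offset, offset + length) has been committed to the file."""
        if length <= 0:
            return
        first = offset // RANGE_SIZE
        last = (offset + length - 1) // RANGE_SIZE
        for i in range(first, last + 1):
            self.bits[i] = True

    def state(self):
        if not any(self.bits):
            return "fully deduplicated"
        if all(self.bits):
            return "fully recalled"
        return "partially deduplicated"

    def recalled_ranges(self):
        """Return (offset, length) runs that are recalled.

        Only these runs would need to be deduplicated again; everything
        else is already in the chunk store and can be skipped.
        """
        runs, start = [], None
        for i, bit in enumerate(self.bits + [False]):  # sentinel closes a final run
            if bit and start is None:
                start = i
            elif not bit and start is not None:
                begin = start * RANGE_SIZE
                end = min(i * RANGE_SIZE, self.file_size)
                runs.append((begin, end - begin))
                start = None
        return runs


bitmap = RecallBitmap(file_size=300_000)
bitmap.mark_recalled(70_000, 10)
print(bitmap.state())            # partially deduplicated
print(bitmap.recalled_ranges())  # [(65536, 65536)]
```

A coarser granularity makes the bitmap smaller but recalls more data per access; the value used here is arbitrary.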
20 Claims
- 1. In a computing environment, a method performed at least in part on at least one processor, comprising, changing a state of a file of a storage volume from a fully deduplicated state to a partially deduplicated state, including reading from a chunk store one or more chunks of the file's data, committing at least part of the one or more chunks to a stable storage as one or more recalled ranges of data of that file, and maintaining information in association with the file that tracks which range or ranges of data have been recalled, and which range or ranges of data reside in the chunk store.
- 14. In a computing environment, a system comprising at least one processor, a memory communicatively coupled to the at least one processor and including components, comprising, a partial recall mechanism configured to access associated tracking data that indicates which range or ranges of a file have been recalled to the file, and which ranges reside on a chunk store and are referenced by the file, the partial recall mechanism further configured to access one or more chunks of file data in the chunk store, commit one or more ranges corresponding to at least part of the one or more chunks to the file as one or more recalled ranges, and update the tracking data to indicate when a range has been committed as a recalled range.
- 19. One or more computer-readable storage media having computer-executable instructions, which when executed perform steps, comprising:
receiving a request to access a range of file data of a file at a given offset, in which at least some of the requested file data resides in a deduplication chunk store;
reading from the chunk store, and writing to storage, until at least the range of file data at the given offset is obtained;
committing at least part of the file data to the file as one or more partially recalled ranges;
updating tracking information that indicates that the one or more partially recalled ranges have been committed to the file; and
forwarding the request towards a file system.
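The steps of claim 19 can be sketched as follows. This is an illustrative model only, under several assumptions that do not come from the claims: chunks have a fixed, tiny size, the chunk store is a dictionary, the file's own storage is a byte array, and the names `DedupFile`, `handle_read`, and `file_system_read` are hypothetical. The point it shows is the ordering: data is written and committed to the file first, the tracking information is updated second, and only then is the request forwarded.

```python
CHUNK_SIZE = 4  # hypothetical fixed chunk size, kept tiny for the example


class DedupFile:
    def __init__(self, chunk_ids, chunk_store):
        self.chunk_ids = chunk_ids      # references into the chunk store
        self.chunk_store = chunk_store  # dict: chunk id -> bytes of CHUNK_SIZE
        # The file's own storage; zero-filled until ranges are recalled.
        self.local = bytearray(len(chunk_ids) * CHUNK_SIZE)
        # Tracking information: True once a range is committed to the file.
        self.recalled = [False] * len(chunk_ids)


def file_system_read(f, offset, length):
    """Stand-in for the underlying file system serving the forwarded request."""
    return bytes(f.local[offset:offset + length])


def handle_read(f, offset, length, forward=file_system_read):
    """Recall whatever part of the requested range is still in the chunk store."""
    first = offset // CHUNK_SIZE
    last = (offset + length - 1) // CHUNK_SIZE
    for i in range(first, last + 1):
        if f.recalled[i]:
            continue  # already in the file, nothing to read from the chunk store
        data = f.chunk_store[f.chunk_ids[i]]                    # read from the chunk store
        f.local[i * CHUNK_SIZE:(i + 1) * CHUNK_SIZE] = data     # write and commit to the file
        f.recalled[i] = True                                    # update tracking after the commit
    return forward(f, offset, length)                           # forward towards the file system


store = {"a": b"ABCD", "b": b"EFGH", "c": b"IJKL"}
f = DedupFile(["a", "b", "c"], store)
print(handle_read(f, 5, 2))  # b'FG'
print(f.recalled)            # [False, True, False]
```

Updating the tracking flag only after the data is committed means that an interruption between the two steps leaves the range marked as still residing in the chunk store, so it would simply be recalled again rather than read from uncommitted storage.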
Specification