Log Structured Content Addressable Deduplicating Storage
Abstract
A log structured content addressable deduplicated data storage system may be used to store deduplicated data. Data to be stored is partitioned into data segments. Each unique data segment is associated with a label. The storage system maintains a transaction log. Mutating storage operations are initiated by storing transaction records in the transaction log. Additional transaction records are stored in the log when storage operations are completed. Upon restarting an embodiment of the data storage system, the transaction records from the transaction logs are replayed to recreate the state of the data storage system. The data storage system updates file system metadata with transaction information while a storage operation associated with the file is being processed. This transaction information serves as atomically updated transaction commit points, allowing fully internally consistent snapshots of deduplicated volumes to be taken at any time.
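The restart behavior described in the abstract can be illustrated with a minimal sketch. The record layout (`TxnRecord` with `txn_id`, `kind`, `label`, `location`) and the `replay` function are hypothetical names chosen for illustration, not taken from the patent. The sketch also assumes that an operation with no completion record is simply left unapplied on replay, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TxnRecord:
    """Hypothetical transaction record; field names are illustrative only."""
    txn_id: int
    kind: str                       # "begin" or "complete"
    label: Optional[str] = None     # storage label (carried on "begin" here)
    location: Optional[int] = None  # segment location (carried on "begin" here)


def replay(log):
    """Rebuild a label -> location map from an ordered transaction log.

    Assumption: only operations with both a "begin" and a "complete"
    record are applied; anything still pending is reported separately.
    """
    pending = {}   # txn_id -> begin record awaiting completion
    state = {}     # label -> storage location
    for rec in log:
        if rec.kind == "begin":
            pending[rec.txn_id] = rec
        elif rec.kind == "complete":
            begin = pending.pop(rec.txn_id, None)
            if begin is not None:
                state[begin.label] = begin.location
    return state, sorted(pending)


log = [
    TxnRecord(1, "begin", label="a", location=0),
    TxnRecord(1, "complete"),
    TxnRecord(2, "begin", label="b", location=10),  # never completed
]
state, incomplete = replay(log)
```

In this example, transaction 1 is recovered into `state`, while transaction 2 appears in `incomplete` because its completion record was never written.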
37 Claims
1-12. (canceled)
13. A method of storing data in a data storage system, the method comprising:
identifying a storage label and a data segment associated with a storage operation;
generating a first transaction record including an identifier associated with the storage operation;
storing the first transaction record in a transaction log data structure;
storing the data segment at a storage location in a data segment storage data structure;
generating label metadata including the storage label and a reference to the storage location;
storing the label metadata in a label cache;
generating a second transaction record including the identifier, wherein the second transaction is adapted to indicate that the storage operation is complete; and
storing the second transaction record in the transaction log data structure.
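The sequence of steps in claim 13 can be sketched as a short in-memory example. Everything here is an assumption made for illustration: the `SegmentStore` class, the tuple-shaped log records, the use of a SHA-256 digest as the storage label, and the append-only `bytearray` standing in for the data segment storage data structure. The claim itself does not prescribe any of these choices.

```python
import hashlib
import itertools


class SegmentStore:
    """Hypothetical in-memory stand-in for the claimed storage system."""

    def __init__(self):
        self.txn_log = []            # transaction log data structure
        self.segments = bytearray()  # data segment storage (append-only here)
        self.label_cache = {}        # storage label -> label metadata
        self._ids = itertools.count(1)

    def store(self, segment: bytes) -> str:
        # Identify the storage label (assumption: a content hash).
        label = hashlib.sha256(segment).hexdigest()

        # First transaction record: the operation has been initiated.
        txn_id = next(self._ids)
        self.txn_log.append(("begin", txn_id))

        # Store the segment and remember where it landed.
        location = len(self.segments)
        self.segments += segment

        # Label metadata: the label plus a reference to the location.
        self.label_cache[label] = {
            "label": label,
            "location": location,
            "length": len(segment),
        }

        # Second transaction record: the operation is complete.
        self.txn_log.append(("complete", txn_id))
        return label


store = SegmentStore()
label = store.store(b"hello")
```

After the call, the log holds a matching begin/complete pair for the same identifier, and the label cache maps the label to offset 0 in the segment store.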
24. A method of storing data in a data storage system, the method comprising:
identifying a storage label associated with a reference operation, wherein the storage label is associated with a previously stored data segment;
generating a first transaction record including an identifier associated with the storage operation;
storing the first transaction record in a transaction log data structure;
searching a label cache to locate label metadata matching the storage label;
in response to locating the label metadata matching the storage label in the label cache, changing a reference count included in the label metadata;
in response to not locating the label metadata matching the storage label in the label cache, searching at least one label metadata archive to locate the label metadata matching the storage label;
in response to locating the label metadata matching the storage label in the label metadata archive, changing the reference count included in the label metadata; and
generating a second transaction record including the identifier, wherein the second transaction is adapted to indicate that the reference operation is complete; and
storing the second transaction record in the transaction log data structure.
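The cache-then-archive lookup in claim 24 can be sketched in the same style. The `LabelIndex` class, the dictionary-based cache and archives, the `refcount` key, and the `delta` parameter are all hypothetical names introduced for illustration. Raising `KeyError` when the label is found nowhere is also an assumption, since the claim does not address that case.

```python
import itertools


class LabelIndex:
    """Hypothetical label cache backed by one or more metadata archives."""

    def __init__(self, label_cache, archives):
        self.label_cache = label_cache  # dict: label -> metadata dict
        self.archives = archives        # list of dicts with the same shape
        self.txn_log = []               # transaction log data structure
        self._ids = itertools.count(1)

    def _find(self, label):
        # Search the label cache first, then each archive in order.
        meta = self.label_cache.get(label)
        if meta is not None:
            return meta
        for archive in self.archives:
            meta = archive.get(label)
            if meta is not None:
                return meta
        return None

    def reference(self, label, delta=1):
        # First transaction record: the reference operation has started.
        txn_id = next(self._ids)
        self.txn_log.append(("begin", txn_id))

        meta = self._find(label)
        if meta is None:
            # Assumption: unknown labels are an error; the begin record
            # is left without a matching completion record.
            raise KeyError(label)

        # Change the reference count held in the label metadata.
        meta["refcount"] += delta

        # Second transaction record: the reference operation is complete.
        self.txn_log.append(("complete", txn_id))
        return meta["refcount"]


index = LabelIndex(
    label_cache={"a": {"refcount": 1}},
    archives=[{"b": {"refcount": 2}}],
)
count_a = index.reference("a")       # found in the label cache
count_b = index.reference("b", -1)   # found in an archive
```

Here label `a` is resolved from the cache and incremented, while label `b` falls through to the archive and is decremented; both operations leave a begin/complete pair in the log.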
Specification