Highly scalable and distributed data de-duplication
Abstract
This disclosure relates to systems and methods for both maintaining referential integrity within a data storage system and freeing unused storage in the system, without the need to maintain reference counts for the blocks of storage used to represent and store the data.
18 Claims
1. A method comprising:
maintaining, in a data storage system, a plurality of blocks of data, the storage system representing a plurality of sets of digital data, by associating each of said sets of digital data with at least one of said plurality of blocks;
maintaining a first timestamp corresponding to each of the plurality of blocks, the first timestamp indicating a last time when a block was verified to have been associated with at least one of said sets of digital data;
maintaining a second timestamp corresponding to each of the sets of digital data, the second timestamp indicating a time when an association between a set of digital data and at least one of said plurality of blocks was verified;
providing an indication that a given block that is not associated with any of the sets of digital data is in the process of being removed from the storage system, wherein the first timestamp associated with the block indicates an earlier time than each of the second timestamps;
deleting the given block of data from the storage system; and
providing an indication that the block has been removed from the storage system.
Dependent claims: 2, 3, 4, 5, 6
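The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names (`Block`, `DataSet`, `Store`), the `"removing"` and `"removed"` labels, the in-memory dictionaries, and the explicit `now` arguments are all assumptions introduced here for illustration. The sketch relies on one reading of the claim: verifying a set refreshes the timestamp of every block it uses, so a block whose timestamp is older than every set's timestamp cannot be referenced by any set, and no reference counts are needed.

```python
from dataclasses import dataclass


@dataclass
class Block:
    data: bytes
    verified_at: float  # "first timestamp": last time a set was verified to use this block
    state: str = "live"  # hypothetical status label, not taken from the claims


@dataclass
class DataSet:
    block_ids: list
    verified_at: float  # "second timestamp": when this set's associations were verified


class Store:
    def __init__(self):
        self.blocks = {}  # block id -> Block
        self.sets = {}    # set id -> DataSet
        self.log = []     # indications emitted while removing blocks

    def put_block(self, block_id, data, now):
        self.blocks[block_id] = Block(data, now)

    def put_set(self, set_id, block_ids, now):
        self.sets[set_id] = DataSet(list(block_ids), now)
        self.verify_set(set_id, now)

    def verify_set(self, set_id, now):
        """Re-verify a set: refresh the timestamps of its blocks, then its own."""
        ds = self.sets[set_id]
        for bid in ds.block_ids:
            self.blocks[bid].verified_at = now
        ds.verified_at = now

    def collect(self):
        """Delete blocks whose first timestamp precedes every second timestamp."""
        if not self.sets:
            return []  # conservative choice in this sketch: nothing to compare against
        oldest_set = min(ds.verified_at for ds in self.sets.values())
        removed = []
        for bid, blk in list(self.blocks.items()):
            if blk.verified_at < oldest_set:
                blk.state = "removing"
                self.log.append(("removing", bid))  # removal in progress
                del self.blocks[bid]
                self.log.append(("removed", bid))   # removal complete
                removed.append(bid)
        return removed
```

For example, after storing blocks `"a"` and `"b"` at time 1.0 and a set that uses only `"a"` at time 2.0, calling `collect()` would delete `"b"` and leave `"a"` in place, because only `"b"` carries a timestamp older than every set's timestamp.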
7. A system comprising:
a memory configured to store data; and
a processor configured to:
maintain, in the memory, a plurality of blocks of data, the memory representing a plurality of sets of digital data, by associating each of said sets of digital data with at least one of said plurality of blocks;
maintain a first timestamp corresponding to each of the plurality of blocks, the first timestamp indicating a time when a block was verified to have been associated with at least one of said sets of digital data;
maintain a second timestamp corresponding to each of the sets of digital data, the second timestamp indicating a time when an association between a set of digital data and at least one of said plurality of blocks was verified;
provide an indication that a given block that is not associated with any of the sets of digital data is in the process of being removed from a storage system, wherein the first timestamp associated with the block indicates an earlier time than each of the second timestamps;
delete the given block of data from the storage system; and
provide an indication that the block has been removed from the storage system.
Dependent claims: 8, 9, 10, 11, 12
13. A non-transitory computer readable storage medium storing computer readable instructions thereon, the computer readable instructions when executed by a processor of a computing device cause the processor to perform a method comprising:
maintaining, in a data storage system, a plurality of blocks of data, the storage system representing a plurality of sets of digital data, by associating each of said sets of digital data with at least one of said plurality of blocks;
maintaining a first timestamp corresponding to each of the plurality of blocks, the first timestamp indicating a time when a block was verified to have been associated with at least one of said sets of digital data;
maintaining a second timestamp corresponding to each of the sets of digital data, the second timestamp indicating a time when an association between a set of digital data and at least one of said plurality of blocks of data was verified;
providing an indication that a given block that is not associated with any of the sets of digital data is in the process of being removed from the storage system, wherein the first timestamp associated with the block indicates an earlier time than each of the second timestamps;
deleting the given block of data from the storage system; and
providing an indication that the block has been removed from the storage system.
Dependent claims: 14, 15, 16, 17, 18
Specification