Block-level single instancing
First Claim
1. A computing system for reclaiming storage space on one or more storage devices having native file systems, wherein the storage space is utilized by one or more logical containers to store deduplicated blocks of data, wherein locations of the deduplicated blocks of data in the logical containers are not tracked by the native file systems of the storage devices, the computing system comprising:
one or more storage devices storing on physical media:
one or more logical containers that include multiple deduplicated blocks of data that correspond to data objects; and
one or more data structures that indicate whether the blocks of data are referred to;
one or more databases storing information indicating whether the blocks of data are referred to; and
a secondary storage computing device programmed to:
receive an indication to remove a first set of blocks of data from a first logical container;
for each of the blocks of data in the first set:
determine, from the databases, whether the block of data is referred to; and
if the block of data is not referred to, update the data structures to indicate that the block of data is not referred to;
determine from the data structures that a threshold number of contiguous blocks of data in the first logical container that are not referred to has been reached; and
make available for storage portions of the one or more physical media corresponding to the contiguous blocks of data in the first logical container,
wherein the data structures and the databases are not part of the native file systems of the storage devices.
Abstract
Described in detail herein are systems and methods for single instancing blocks of data in a data storage system. For example, the data storage system may include multiple computing devices (e.g., client computing devices) that store primary data. The data storage system may also include a secondary storage computing device, a single instance database, and one or more storage devices that store copies of the primary data (e.g., secondary copies, tertiary copies, etc.). The secondary storage computing device receives blocks of data from the computing devices and accesses the single instance database to determine whether the blocks of data are unique (meaning that no instances of the blocks of data are stored on the storage devices). If a block of data is unique, the single instance database stores it on a storage device. If not, the secondary storage computing device can avoid storing the block of data on the storage devices.
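The uniqueness check described in the abstract can be illustrated with a minimal sketch. It assumes that blocks are identified by a SHA-256 digest and that the single instance database behaves like an in-memory index; neither detail is stated in the abstract, and the class and method names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


class SingleInstanceStore:
    """Toy stand-in for a single instance database plus one storage device."""

    def __init__(self) -> None:
        # Assumption: digest -> position of the stored block.
        self.index: Dict[str, int] = {}
        # Assumption: a list stands in for the storage device.
        self.blocks: List[bytes] = []

    def ingest(self, block: bytes) -> Tuple[int, bool]:
        """Return (position, stored); stored is False when the block is a duplicate."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.index:
            # Not unique: an instance already exists, so nothing new is written.
            return self.index[digest], False
        # Unique: store the block and record where it lives.
        self.blocks.append(block)
        self.index[digest] = len(self.blocks) - 1
        return self.index[digest], True


store = SingleInstanceStore()
results = [store.ingest(b) for b in (b"alpha", b"beta", b"alpha")]
```

In this sketch the third block repeats the first, so only two blocks end up on the stand-in storage device and the duplicate resolves to the position of the existing instance.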
20 Claims
1. A computing system for reclaiming storage space on one or more storage devices having native file systems, wherein the storage space is utilized by one or more logical containers to store deduplicated blocks of data, wherein locations of the deduplicated blocks of data in the logical containers are not tracked by the native file systems of the storage devices, the computing system comprising:
one or more storage devices storing on physical media:
one or more logical containers that include multiple deduplicated blocks of data that correspond to data objects; and
one or more data structures that indicate whether the blocks of data are referred to;
one or more databases storing information indicating whether the blocks of data are referred to; and
a secondary storage computing device programmed to:
receive an indication to remove a first set of blocks of data from a first logical container;
for each of the blocks of data in the first set:
determine, from the databases, whether the block of data is referred to; and
if the block of data is not referred to, update the data structures to indicate that the block of data is not referred to;
determine from the data structures that a threshold number of contiguous blocks of data in the first logical container that are not referred to has been reached; and
make available for storage portions of the one or more physical media corresponding to the contiguous blocks of data in the first logical container,
wherein the data structures and the databases are not part of the native file systems of the storage devices.
Dependent claims: 2, 3, 4, 5, 6, 7
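The sequence recited in claim 1 can be sketched as follows, purely as an illustration. The sketch assumes the databases can be modeled as a dictionary of reference counts and the data structures as a per-container list of boolean flags; the function names, the sample data, and the threshold of 3 are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def mark_unreferenced(block_ids: Iterable[int],
                      ref_db: Dict[int, int],
                      flags: List[bool]) -> None:
    """For each block to remove, consult the database and flag it if unreferenced."""
    for block_id in block_ids:
        if ref_db.get(block_id, 0) == 0:
            flags[block_id] = True


def reclaimable_runs(flags: List[bool], threshold: int) -> List[Tuple[int, int]]:
    """Return (start, length) for every run of flagged blocks that reaches the threshold."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, free in enumerate(flags):
        if free and start is None:
            start = i                      # a run of unreferenced blocks begins
        elif not free and start is not None:
            if i - start >= threshold:
                runs.append((start, i - start))
            start = None                   # the run ended at a referenced block
    if start is not None and len(flags) - start >= threshold:
        runs.append((start, len(flags) - start))
    return runs


# Hypothetical container of six blocks; block 4 is still referred to twice.
ref_db = {0: 1, 1: 0, 2: 0, 3: 0, 4: 2, 5: 0}
flags = [False] * 6
mark_unreferenced([1, 2, 3, 4, 5], ref_db, flags)
runs = reclaimable_runs(flags, threshold=3)
```

Here blocks 1 through 3 form a run that reaches the threshold, while block 5 is flagged but stays in place because its run is too short. Keeping the flags outside the file system mirrors the claim's final clause, although the real data structures are not specified in this excerpt.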
8. A method of reclaiming storage space on one or more storage devices, wherein the storage space is utilized by one or more logical containers to store deduplicated blocks of data, and wherein the method is performed by a computing system having a processor and memory, the method comprising:
receiving an indication to remove a first data object, wherein the first data object is stored as multiple first blocks of data in at least a first logical container;
accessing, by the computing system, a first data structure that indicates whether the first data object is referred to;
determining that the first data object is not referred to;
determining from a second data structure that a first number of multiple contiguous second blocks of data in the first logical container that are not referred to has been reached; and
after determining that the first number has been reached, specifying as available for storage a portion of the first logical container corresponding to the multiple contiguous second blocks of data.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
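The method of claim 8 works at the level of a data object rather than a set of blocks, and the difference can be sketched as below. This is an illustration under stated assumptions: the first data structure is modeled as a dictionary of per-object reference counts, the second as a list of per-block flags on the container, and blocks have a fixed size of 4096 bytes. All names and values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

BLOCK_SIZE = 4096  # assumption: fixed-size blocks


@dataclass
class Container:
    num_blocks: int
    # Stand-in for the second data structure: True means "not referred to".
    free: List[bool] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.free:
            self.free = [False] * self.num_blocks


def remove_object(name: str,
                  object_refs: Dict[str, int],
                  object_blocks: Dict[str, List[int]],
                  container: Container,
                  min_run: int) -> List[Tuple[int, int]]:
    """Return (byte_offset, byte_length) ranges to specify as available."""
    if object_refs.get(name, 0) > 0:
        return []                          # the object is still referred to
    for block in object_blocks[name]:
        container.free[block] = True
    ranges: List[Tuple[int, int]] = []
    run_start = None
    # Iterate one step past the end so that a trailing run is closed as well.
    for i in range(container.num_blocks + 1):
        is_free = i < container.num_blocks and container.free[i]
        if is_free and run_start is None:
            run_start = i
        elif not is_free and run_start is not None:
            if i - run_start >= min_run:
                ranges.append((run_start * BLOCK_SIZE, (i - run_start) * BLOCK_SIZE))
            run_start = None
    return ranges


object_refs = {"a.doc": 0, "b.doc": 1}
object_blocks = {"a.doc": [2, 3], "b.doc": [0, 1]}
container = Container(num_blocks=6)
container.free[4] = True                   # flagged by an earlier removal
kept = remove_object("b.doc", object_refs, object_blocks, container, min_run=3)
freed = remove_object("a.doc", object_refs, object_blocks, container, min_run=3)
```

The run that reaches the minimum length combines the removed object's blocks (2 and 3) with a block flagged earlier (4). That is one way to read the claim's distinction between the "first blocks" of the object and the contiguous "second blocks" that are actually made available.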
17. A computing system for reclaiming storage space on one or more means for storing, wherein the storage space is utilized by one or more logical containers to store deduplicated blocks of data, the computing system comprising:
means for storing:
one or more logical containers that include multiple deduplicated blocks of data that correspond to data objects;
one or more first data structures that indicate whether the blocks of data are referred to by other data objects; and
one or more second data structures that indicate whether the blocks of data are referred to by other data objects;
means for receiving an indication to remove a first set of blocks of data from a first logical container;
means for determining whether a block of data is referred to, for each of the blocks of data in the first set;
means for updating the second data structures to indicate that a block of data is not referred to, for each of the blocks of data in the first set that is not referred to;
means for determining from the second data structures that a first number of contiguous blocks of data in the first logical container that are not referred to has been reached; and
means for specifying as available for storage portions of the one or more physical media corresponding to the contiguous blocks of data in the first logical container.
Dependent claims: 18, 19, 20
Specification