High availability distributed deduplicated storage system
Abstract
A high availability distributed, deduplicated storage system according to certain embodiments is arranged to include multiple deduplication database media agents. The deduplication database media agents store signatures of data blocks stored in secondary storage. In addition, the deduplication database media agents are configured as failover deduplication database media agents in the event that one of the deduplication database media agents becomes unavailable.
12 Claims
1. A method of performing a storage operation in a distributed, deduplicated storage system, comprising:
- receiving at a first secondary storage computing device of a plurality of secondary storage computing devices a request from a client computing device to backup a file comprising a plurality of data blocks and stored in primary storage,
  wherein a first deduplication database computing device of a plurality of deduplication database computing devices communicatively coupled to the first secondary storage computing device is configured to store a first subset of signature blocks based at least in part on a data block distribution policy and is designated as a failover deduplication database computing device for a second deduplication database computing device of the plurality of deduplication database computing devices that is configured to store, based at least in part on the data block distribution policy, a second subset of the signature blocks that is different from and does not overlap with the first subset of the signature blocks,
  wherein the plurality of deduplication database computing devices store the signature blocks corresponding to data blocks stored in secondary storage, wherein the data blocks stored in the secondary storage correspond to data blocks stored in primary storage, at least one signature block of the signature blocks comprising a signature of at least one data block of the plurality of data blocks, location information of the at least one data block in the secondary storage, and a reference count value indicative of a quantity of one or more references in the secondary storage to the at least one data block;
- in response to the request and using one or more processors, calculating a signature of a particular data block of the plurality of data blocks using a signature function;
- identifying the second deduplication database computing device as the deduplication database computing device assigned to store the signature of the particular data block;
- determining that the second deduplication database computing device is unavailable; and
- querying the first deduplication database computing device for the signature of the particular data block,
  the method further comprising at least one of:
  - based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block does not reside in the first deduplication database computing device, store the signature in a failover index, cause at least one storage device of the secondary storage to store a copy of the particular data block, and request the first deduplication database computing device to store the signature of the particular data block and a location of the copy of the particular data block, or
  - based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block resides in the first deduplication database computing device, cause at least one storage device of the secondary storage to store a reference to a copy of the particular data block that is stored in the secondary storage.
Dependent claims: 2, 3, 4, 5, 6
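The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the in-memory dictionaries, the choice of SHA-256 as the signature function, and the modulo-based distribution policy are all assumptions made for illustration, and the sketch does not handle the case where the failover device is also unavailable.

```python
import hashlib


class DedupDatabase:
    """Hypothetical in-memory stand-in for one deduplication database computing device."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.entries = {}  # signature -> location of the block copy in secondary storage

    def lookup(self, signature):
        return self.entries.get(signature)

    def store(self, signature, location):
        self.entries[signature] = location


class SecondaryStorageDevice:
    """Hypothetical secondary storage computing device that handles backup requests."""

    def __init__(self, databases, failover_for):
        self.databases = databases        # list of DedupDatabase objects
        self.failover_for = failover_for  # assigned device index -> failover device index
        self.failover_index = set()       # signatures recorded while the assigned device was down
        self.blocks = []                  # stand-in for block copies in secondary storage
        self.references = []              # stand-in for references to existing copies

    def assigned_index(self, signature):
        # Assumed distribution policy: signature as an integer, modulo the device count.
        return int.from_bytes(signature, "big") % len(self.databases)

    def backup_block(self, block):
        # Calculate the signature (SHA-256 is an assumed signature function).
        signature = hashlib.sha256(block).digest()
        # Identify the assigned device and fall back to its failover if it is unavailable.
        idx = self.assigned_index(signature)
        db = self.databases[idx]
        used_failover = not db.available
        if used_failover:
            db = self.databases[self.failover_for[idx]]
        location = db.lookup(signature)
        if location is None:
            # Signature not found: store a copy and ask the queried device to record it.
            self.blocks.append(block)
            location = len(self.blocks) - 1
            db.store(signature, location)
            if used_failover:
                self.failover_index.add(signature)
            return "stored"
        # Signature found: store only a reference to the existing copy.
        self.references.append(location)
        return "referenced"
```

In this sketch the failover index simply remembers which signatures were written through the failover device, which would let a later process reconcile them with the assigned device once it is available again; that reconciliation step is an assumption and is not recited in the claim.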
7. A distributed deduplicated storage system, comprising:
- a plurality of deduplication database computing devices configured to store signature blocks corresponding to a plurality of data blocks stored in one or more storage devices of secondary storage, at least one signature block of the signature blocks comprising a signature of at least one data block of the plurality of data blocks, location information of the at least one data block in the one or more storage devices, and a reference count value indicative of a quantity of one or more references in the secondary storage to the at least one data block,
  wherein a first deduplication database computing device of the plurality of deduplication database computing devices is configured to store a first subset of the signature blocks based at least in part on a data block distribution policy and is designated as a failover deduplication database computing device for a second deduplication database computing device of the plurality of deduplication database computing devices that is configured to store, based at least in part on the data block distribution policy, a second subset of the signature blocks that is different from and does not overlap with the first subset of the signature blocks; and
- a plurality of secondary storage computing devices communicatively coupled to the plurality of deduplication database computing devices, each of the plurality of secondary storage computing devices comprising one or more processors and storage, wherein at least one secondary storage computing device of the plurality of secondary storage computing devices further comprises a failover index and is configured to:
  - receive a request to backup a file comprising a plurality of data blocks and stored in primary storage,
  - calculate a signature of a particular data block of the plurality of data blocks using a signature function,
  - identify the second deduplication database computing device as the deduplication database computing device assigned to store the signature of the particular data block,
  - determine that the second deduplication database computing device is unavailable, and
  - query the first deduplication database computing device for the signature of the particular data block,
  wherein the at least one secondary storage computing device is further configured to at least one of:
  - based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block does not reside in the first deduplication database computing device, store the signature in the failover index, cause at least one storage device of the one or more storage devices of the secondary storage to store a copy of the particular data block, and request the first deduplication database computing device to store the signature of the particular data block and a location of the copy of the particular data block, or
  - based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block resides in the first deduplication database computing device, cause at least one storage device of the one or more storage devices of the secondary storage to store a reference to a copy of the particular data block that is stored in the secondary storage.
Dependent claims: 8, 9, 10, 11, 12
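The signature block recited in claim 7 (a signature, location information, and a reference count) and the non-overlapping subsets held by each deduplication database device can be pictured with a small Python sketch. The `SignatureBlock` dataclass, the string-valued `location` field, and the modulo-based `assigned_device` policy are hypothetical names and choices introduced here; the claim does not specify a particular layout or policy.

```python
from dataclasses import dataclass
import hashlib


@dataclass
class SignatureBlock:
    signature: bytes          # signature of one data block
    location: str             # where the block copy lives in secondary storage (assumed format)
    reference_count: int = 1  # number of references in secondary storage to that copy


def assigned_device(signature: bytes, device_count: int) -> int:
    """Assumed distribution policy: signature as an integer, modulo the device count."""
    return int.from_bytes(signature, "big") % device_count


def partition(blocks, device_count):
    """Split signature blocks into one subset per device.

    Each signature maps to exactly one device, so the subsets cannot overlap.
    """
    subsets = [dict() for _ in range(device_count)]
    for sb in blocks:
        subsets[assigned_device(sb.signature, device_count)][sb.signature] = sb
    return subsets


def add_reference(subset, signature):
    """Record one more reference to an already stored data block."""
    subset[signature].reference_count += 1


# Usage: ten hypothetical blocks distributed over two devices.
blocks = [
    SignatureBlock(hashlib.sha256(f"block-{i}".encode()).digest(), f"volume-0/offset-{i}")
    for i in range(10)
]
first_subset, second_subset = partition(blocks, 2)
```

Because a deterministic function of the signature picks the device, any secondary storage computing device can work out which deduplication database device to query without consulting a central directory; that property is a consequence of the assumed policy rather than something the claim requires.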
Specification