Client-side repository in a networked deduplicated storage system
First Claim
1. A method of restoring deduplicated data from secondary storage to a primary storage device, the method comprising:
receiving at a storage manager that comprises at least computer memory, a request to restore to the primary storage device one or more files;
identifying a plurality of data blocks to be restored that correspond to the one or more files, wherein each of the one or more files comprises more than one data block, wherein the plurality of data blocks to be restored are stored in secondary storage and a first copy of deduplication signatures corresponding to the plurality of data blocks to be restored is stored in secondary storage, the first copy of the deduplication signatures is stored in association with one or more media agents, and wherein a portion of the plurality of data blocks are stored in a client-side repository, and a second copy of deduplication signatures corresponding to the portion of the plurality of data blocks is stored in the client-side repository remote from the secondary storage and local to the primary storage device;
determining a most recent backup time of each of the plurality of data blocks to be restored, wherein the most recent backup time indicates a most recent time at which a particular block of data was part of a backup operation;
based on the determined most recent backup time, removing one or more deduplication signatures from the set of deduplication signatures to form a revised set of deduplication signatures;
forming a plurality of bundles of deduplication signatures from the revised set of deduplication signatures;
transmitting the plurality of bundles of deduplication signatures to the client-side repository;
receiving an indication from the client-side repository as to which data blocks corresponding to the revised set of deduplication signatures are stored in the client-side repository, wherein the determination as to which data blocks are stored in the client-side repository is based on a comparison of the revised set of deduplication signatures with the second copy of deduplication signatures stored in the client-side repository; and
accessing data blocks not stored in the client-side repository from the secondary storage based on the first copy of deduplication signatures stored in association with the one or more media agents and transmitting the data blocks not stored in the client-side repository from the secondary storage to the primary storage device, wherein data blocks that are stored in the client-side repository are transmitted from the client-side repository to the primary storage device.
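The signature-pruning and bundling steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `BlockRecord`, `revise_signatures`, and `bundle_signatures` are hypothetical, and the sketch assumes that a signature is removed when its block was last backed up before a chosen cutoff. The claim itself only says that removal is based on the most recent backup time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class BlockRecord:
    """Hypothetical record pairing a block's signature with its backup time."""
    signature: str         # deduplication signature, e.g. a hex digest
    last_backup: datetime  # most recent backup operation that included the block


def revise_signatures(records: Iterable[BlockRecord], cutoff: datetime) -> List[str]:
    """Form the revised set of signatures.

    Assumed policy: drop signatures whose block was last backed up before
    `cutoff`, since such blocks are presumed unlikely to be in the repository.
    """
    return [r.signature for r in records if r.last_backup >= cutoff]


def bundle_signatures(signatures: List[str], bundle_size: int) -> List[List[str]]:
    """Split the revised set into bundles of at most `bundle_size` signatures."""
    if bundle_size <= 0:
        raise ValueError("bundle_size must be positive")
    return [
        signatures[i:i + bundle_size]
        for i in range(0, len(signatures), bundle_size)
    ]


if __name__ == "__main__":
    records = [
        BlockRecord("s1", datetime(2024, 1, 10)),
        BlockRecord("s2", datetime(2023, 6, 1)),   # older than the cutoff
        BlockRecord("s3", datetime(2024, 2, 1)),
    ]
    revised = revise_signatures(records, cutoff=datetime(2024, 1, 1))
    print(bundle_signatures(revised, bundle_size=2))
```

Each bundle would then be sent to the client-side repository as one message, so the number of round trips depends on the bundle size rather than on the number of blocks.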
Abstract
A storage system according to certain embodiments includes a client-side repository (CSR). The CSR may communicate with a client at a higher data transfer rate than the rate used for communication between the client and secondary storage. During copy operations, for instance, some or all of the data being backed up or otherwise copied to secondary storage is stored in the CSR. During restore operations, copies of the data stored in the CSR are accessed from the CSR instead of from secondary storage, improving performance. Remaining data blocks not stored in the CSR can be restored from secondary storage.
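The performance argument can be made concrete with a small worked estimate in Python. The function `restore_seconds` and all of the figures below are hypothetical and do not come from the patent. The sketch also assumes the two transfers run one after the other, so their times simply add.

```python
def restore_seconds(total_bytes: int, csr_bytes: int,
                    csr_rate: float, secondary_rate: float) -> float:
    """Estimate restore time when `csr_bytes` of the data come from the CSR.

    Rates are in bytes per second. Assumes sequential transfers, so the
    CSR portion and the secondary-storage portion are summed.
    """
    if not 0 <= csr_bytes <= total_bytes:
        raise ValueError("csr_bytes must be between 0 and total_bytes")
    remaining = total_bytes - csr_bytes
    return csr_bytes / csr_rate + remaining / secondary_rate


if __name__ == "__main__":
    # Illustrative numbers only: 1000 units of data, CSR link 10x faster.
    with_csr = restore_seconds(1000, 800, csr_rate=100, secondary_rate=10)
    without_csr = restore_seconds(1000, 0, csr_rate=100, secondary_rate=10)
    print(with_csr, without_csr)  # 28.0 100.0
```

Under these assumed figures, serving 80% of the data from the CSR cuts the estimated restore time from 100 to 28 time units. The remaining time is dominated by the blocks still fetched from secondary storage.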
8 Claims
5. A storage system comprising:
secondary storage storing, according to a deduplication scheme, a plurality of data blocks received from a client system, and a first copy of deduplication signatures corresponding to the plurality of data blocks, the first copy of deduplication signatures is stored in association with one or more media agents, wherein each file in the client system comprises more than one data block, and wherein at least a portion of the plurality of data blocks, and a second copy of deduplication signatures corresponding to the portion of the plurality of data blocks, is stored in a client-side repository remote from the secondary storage;
a storage manager that comprises at least computer memory configured to perform at least one restore operation in which data is restored to the client system, the storage manager configured to:
receive a request to restore the plurality of data blocks to the client system;
determine a most recent backup time of one or more of the plurality of data blocks, wherein the most recent backup time indicates a most recent time at which the one or more of the plurality of data blocks was part of a backup operation;
determine that the most recent backup time for a set of one or more data blocks of the plurality of data blocks that does not meet a threshold time;
remove one or more deduplication signatures corresponding to the set of the one or more data blocks from a set of deduplication signatures to form a revised set of deduplication signatures;
form a plurality of bundles of data block queries from the revised set of deduplication signatures;
transmit the plurality of bundles of data block queries to the client-side repository;
receive an indication from the client-side repository as to which data blocks corresponding to the revised set of deduplication signatures are stored in the client-side repository based on a comparison of the revised set of deduplication signatures with the second copy of deduplication signatures stored in the client-side repository; and
access the secondary storage to restore data blocks not stored in the client-side repository from the secondary storage to the client system, based on the first copy of the deduplication signatures stored in association with the one or more media agents and transmit the data blocks not stored in the client-side repository from the secondary storage to the primary storage device, and wherein data blocks that are stored in the client-side repository are restored from the client-side repository to the client system.
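The query-and-response exchange between the storage manager and the client-side repository can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed system: `ClientSideRepository`, `answer_queries`, and `plan_restore` are hypothetical names, and an in-memory dictionary stands in for the repository's second copy of the signatures. The sketch also assumes that any block whose signature was removed from the revised set, and so was never queried, is restored from secondary storage.

```python
from typing import Dict, Iterable, List, Set, Tuple


class ClientSideRepository:
    """Hypothetical CSR holding data blocks keyed by deduplication signature."""

    def __init__(self, blocks: Dict[str, bytes]) -> None:
        # The keys play the role of the second copy of the signatures.
        self._blocks = dict(blocks)

    def answer_queries(self, bundle: Iterable[str]) -> Set[str]:
        """Compare one bundle of queries with the local signature copy."""
        return {sig for sig in bundle if sig in self._blocks}


def plan_restore(all_signatures: List[str],
                 bundles: List[List[str]],
                 csr: ClientSideRepository) -> Tuple[List[str], List[str]]:
    """Split the blocks to restore by source: (from CSR, from secondary storage)."""
    held: Set[str] = set()
    for bundle in bundles:
        held |= csr.answer_queries(bundle)  # indication returned by the CSR
    from_csr = [s for s in all_signatures if s in held]
    # Unqueried or unmatched signatures fall back to secondary storage.
    from_secondary = [s for s in all_signatures if s not in held]
    return from_csr, from_secondary


if __name__ == "__main__":
    csr = ClientSideRepository({"a": b"block-a", "c": b"block-c"})
    # "d" was pruned from the revised set, so it appears in no bundle.
    print(plan_restore(["a", "b", "c", "d"], [["a", "b"], ["c"]], csr))
```

Keeping the comparison on the repository side means only signatures and a short indication cross the network before any block data is moved.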
Specification