COLLABORATIVE BACKUP IN A NETWORKED STORAGE SYSTEM
First Claim
1. A method of generating a secondary copy data set for a client computing device by collaboratively sourcing data to be used in the secondary copy data set from at least one other client computing device, the method comprising:
for each respective client computing device of a plurality of client computing devices:
monitoring storage of a plurality of files formed by data blocks generated by one or more software applications running on the respective client computing device, wherein the files are stored in a data store associated with the respective client computing device;
maintaining, by a signature repository agent executing on one or more processors, a global mapping indicating which data blocks are stored in the data stores associated with each of the plurality of client computing devices, wherein separate copies of at least some of the data blocks reside in the data stores of multiple ones of the plurality of client computing devices;
in response to instructions to create a secondary copy in secondary storage of at least a subset of the plurality of files stored in the data store of a first client computing device of the plurality of client computing devices, querying, by the signature repository agent, the global mapping to identify at least a first group of data blocks in the subset of the plurality of files that are stored in the data store associated with a second client computing device of the plurality of client computing devices;
retrieving the first group of data blocks from the data store associated with the second client computing device; and
retrieving at least some of the remaining data blocks in the first portion from the data store associated with the first client computing device.
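The query and retrieval steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each data block is identified by a SHA-256 digest and that the global mapping is an in-memory dictionary from signature to client identifiers. The names `GlobalMapping`, `record`, `clients_holding`, and `plan_sources` are hypothetical and do not appear in the claims.

```python
import hashlib
from collections import defaultdict


def block_signature(block: bytes) -> str:
    """Assumed signature scheme: SHA-256 digest of the block contents."""
    return hashlib.sha256(block).hexdigest()


class GlobalMapping:
    """Hypothetical global mapping: block signature -> clients holding a copy."""

    def __init__(self):
        self._locations = defaultdict(set)

    def record(self, client_id: str, block: bytes) -> str:
        """Note that client_id stores this block; return the block's signature."""
        sig = block_signature(block)
        self._locations[sig].add(client_id)
        return sig

    def clients_holding(self, sig: str) -> set:
        """Return the clients whose data stores contain the block."""
        return set(self._locations.get(sig, ()))


def plan_sources(mapping, signatures, second_client):
    """Split the blocks to be backed up into two groups: those that can be
    sourced from second_client, and the remainder, which is read from the
    first client's own data store."""
    from_second, from_first = [], []
    for sig in signatures:
        if second_client in mapping.clients_holding(sig):
            from_second.append(sig)
        else:
            from_first.append(sig)
    return from_second, from_first


if __name__ == "__main__":
    mapping = GlobalMapping()
    shared = mapping.record("client-1", b"shared block")
    mapping.record("client-2", b"shared block")
    unique = mapping.record("client-1", b"unique block")
    print(plan_sources(mapping, [shared, unique], "client-2"))
```

In this sketch every block available on the second client is sourced from it. A real system would presumably weigh load, bandwidth, or availability before choosing a source, which the claim language leaves open.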
Abstract
A storage system according to certain embodiments includes a client-side signature repository that includes information representative of a set of data blocks stored in primary storage. During copy or backup operations, the system can use the client-side signature repository to identify data blocks located in primary storage that are new or that have changed. The system can also use the client-side signature repository to identify multiple locations within primary storage where different instances of the data blocks are located. Accordingly, during a copy or backup operation of one client computing device, the system can source a data block that is to be copied to secondary storage from another client computing device that includes a second instance of the data block.
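As a rough illustration of how a client-side signature repository might identify new or changed data blocks, the following Python sketch keeps a set of signatures already seen and reports only blocks whose signature is absent. The fixed block size, the use of SHA-256, and the names `ClientSignatureRepository`, `split_blocks`, and `new_or_changed` are assumptions made for this example and are not taken from the patent.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny, for illustration only


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Cut data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ClientSignatureRepository:
    """Hypothetical repository of signatures for blocks already backed up."""

    def __init__(self):
        self._known = set()

    def new_or_changed(self, data: bytes):
        """Return (block_index, block) pairs whose signature has not been
        seen before, and remember those signatures for later calls."""
        fresh = []
        for index, block in enumerate(split_blocks(data)):
            sig = hashlib.sha256(block).hexdigest()
            if sig not in self._known:
                fresh.append((index, block))
                self._known.add(sig)
        return fresh


if __name__ == "__main__":
    repo = ClientSignatureRepository()
    print(repo.new_or_changed(b"aaaabbbb"))  # first backup: both blocks are new
    print(repo.new_or_changed(b"aaaacccc"))  # later backup: only the changed block
```

Only the blocks returned by `new_or_changed` would need to be copied to secondary storage. Unchanged blocks are skipped because their signatures are already recorded.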
24 Claims
12. A storage system for generating a secondary copy data set for a client computing device by collaboratively sourcing data to be used in the secondary copy data set from at least one other client computing device, the system comprising:
a global mapping stored in one or more storage devices and indicating which data blocks are stored in data stores associated with a plurality of client computing devices, wherein the data blocks are generated by one or more software applications running on the plurality of client computing devices;
a signature repository agent executing on a computing device and configured to:
maintain the global mapping indicating which data blocks are stored in the data stores associated with each of the plurality of client computing devices, wherein separate copies of at least some of the data blocks reside in the data stores of multiple ones of the plurality of client computing devices;
in response to instructions to create a secondary copy in secondary storage of at least a subset of data blocks stored in the data store associated with the first client computing device, query the global mapping to identify at least a first group of data blocks in the subset of data blocks that are stored in the data store associated with a second client computing device of the plurality of client computing devices;
wherein the first group of data blocks is retrieved from the data store associated with the second client computing device; and
wherein at least some of the remaining data blocks in the subset of data blocks are retrieved from the data store associated with the first client computing device.
23. A computer-readable, non-transitory storage medium having one or more computer-executable modules for generating a secondary copy data set for a client computing device by collaboratively sourcing data to be used in the secondary copy data set from at least one other client computing device, the one or more computer-executable modules comprising:
a first module in communication with a plurality of client computing devices and configured to:
maintain a global mapping indicating which data blocks are stored in data stores associated with a plurality of client computing devices, wherein the data blocks are generated by one or more software applications running on the plurality of client computing devices, wherein separate copies of at least some of the data blocks reside in the data stores of multiple ones of the plurality of client computing devices;
in response to instructions to create a secondary copy in secondary storage of at least a subset of data blocks stored in the data store associated with the first client computing device, query the global mapping to identify at least a first group of data blocks in the subset of data blocks that are stored in the data store associated with the second client computing device;
wherein the first group of data blocks is retrieved from the data store associated with the second client computing device; and
wherein at least some of the remaining data blocks in the subset of data blocks are retrieved from the data store associated with the first client computing device.
Specification