INTELLIGENT DATA SOURCING IN A NETWORKED STORAGE SYSTEM
Abstract
A storage system according to certain embodiments includes a repository of client-side data block signature information representative of a set of data blocks stored in a primary storage subsystem. In some cases, the system sources data blocks for secondary copy and restore operations from the primary storage subsystem instead of the secondary storage subsystem. Where multiple primary storage components (e.g., multiple client computing devices) contain copies of a data block involved in a secondary copy or restore operation, the system can decide which client to source the data block from based on sourcing criteria.
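For illustration only, the choice among several clients holding the same data block might be sketched as follows. The abstract does not say what the sourcing criteria are; the two metrics used here (`cpu_load`, `link_mbps`), the `max_load` threshold, and the `None` return that signals a fallback to another source are all hypothetical assumptions, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A client that holds a copy of the wanted data block."""
    client_id: str
    cpu_load: float   # 0.0 (idle) to 1.0 (saturated); hypothetical criterion
    link_mbps: float  # bandwidth to the requester; hypothetical criterion


def choose_source(candidates, max_load=0.8):
    """Return the client to read the block from, or None if no client qualifies."""
    # Skip clients that are too busy to serve the block (assumed rule).
    eligible = [c for c in candidates if c.cpu_load <= max_load]
    if not eligible:
        return None
    # Prefer the fastest link; break ties in favor of the lighter load.
    best = max(eligible, key=lambda c: (c.link_mbps, -c.cpu_load))
    return best.client_id


holders = [
    Candidate("client-a", cpu_load=0.95, link_mbps=1000.0),
    Candidate("client-b", cpu_load=0.30, link_mbps=100.0),
    Candidate("client-c", cpu_load=0.10, link_mbps=100.0),
]
print(choose_source(holders))  # client-c
```

In this sketch `client-a` is excluded for exceeding the load threshold, and `client-c` wins the bandwidth tie with `client-b` because it is less loaded.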
20 Claims
1. A method of sourcing data from storage associated with a pool of computing devices during a data storage operation associated with one of the computing devices in the pool, the method comprising:
obtaining signatures corresponding to data units that form a data set associated with a data storage operation, the data set corresponding to a version of primary data of a first computing device in a pool of a plurality of computing devices, each respective computing device in the pool storing primary data generated by one or more software applications executing on the respective computing device, the primary data stored in at least one storage device associated with the respective computing device;
populating, by one or more processors, a shared signature repository that includes:
signatures corresponding to at least each unique data unit stored in the storage devices of the computing devices in the pool; and
for each signature included in the signature repository, an indication as to one or more of the computing devices whose at least one storage device includes a copy of the data unit corresponding to the signature;
comparing the obtained signatures with the signature repository to identify one or more of the computing devices in the pool whose respective at least one storage devices include copies of data units in the data set;
consulting, by one or more processors, a priority policy; and
based on the priority policy, and for at least some data units in the backup set, deciding to access copies of the at least some data units from one or more computing devices in the pool other than the first computing device.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
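The steps recited in claim 1 can be pictured with a minimal sketch, offered under several assumptions that the claim itself does not state: data units are fixed-size byte chunks, signatures are SHA-256 digests, the shared signature repository is an in-memory mapping from signature to the set of device names holding that unit, and the priority policy is a simple ordered list of device names. All function and variable names are hypothetical.

```python
import hashlib
from collections import defaultdict

UNIT_SIZE = 4  # bytes per data unit; tiny on purpose, an assumption for the demo


def signatures(data: bytes, unit_size: int = UNIT_SIZE) -> list[str]:
    """Obtain one signature per data unit (assumed: SHA-256 of fixed-size chunks)."""
    return [hashlib.sha256(data[i:i + unit_size]).hexdigest()
            for i in range(0, len(data), unit_size)]


def populate_repository(primary_data: dict[str, bytes]) -> dict[str, set[str]]:
    """Map each unique signature to the devices whose storage holds that unit."""
    repo = defaultdict(set)
    for device, data in primary_data.items():
        for sig in signatures(data):
            repo[sig].add(device)
    return dict(repo)


def plan_sources(data_set_sigs, repo, priority, first_device):
    """Compare signatures with the repository, then apply the priority policy."""
    plan = {}
    for sig in data_set_sigs:
        holders = repo.get(sig, set())
        # Rank the holders in the order given by the (assumed) priority list.
        ranked = [d for d in priority if d in holders]
        plan[sig] = ranked[0] if ranked else first_device
    return plan


pool = {
    "dev1": b"AAAABBBBCCCC",
    "dev2": b"BBBBCCCCDDDD",
    "dev3": b"CCCCEEEE",
}
repo = populate_repository(pool)
sigs = signatures(pool["dev1"])
plan = plan_sources(sigs, repo, priority=["dev3", "dev2", "dev1"], first_device="dev1")
sources = [plan[s] for s in sigs]
print(sources)  # ['dev1', 'dev2', 'dev3']
```

Here `dev1` is the first computing device. Its unit `AAAA` exists only locally, while `BBBB` and `CCCC` are sourced from `dev2` and `dev3`, which is the "other than the first computing device" outcome the final step describes.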
11. A storage system for sourcing data from storage associated with a pool of computing devices during a data storage operation associated with one of the computing devices in the pool, the storage system comprising:
a global signature repository including:
signatures corresponding to at least each unique data unit stored in at least one storage device associated with each of a plurality of computing devices in a pool; and
for each signature included in the signature repository, an indication as to one or more of the plurality of computing devices whose at least one storage device includes a copy of the data unit corresponding to the signature; and
a repository agent executing in one or more processors and configured to:
obtain signatures corresponding to data units that form a data set associated with a data storage operation, the data set corresponding to a version of primary data of a first computing device in the pool, each respective computing device in the pool storing primary data generated by one or more software applications executing on the respective computing device and stored in at least one storage device associated with the respective computing device;
compare the obtained signatures with the signature repository to identify one or more of the computing devices in the pool whose respective at least one storage devices include copies of data units in the data set;
consult a priority policy; and
based on the priority policy and for at least some data units in the backup set, decide to access copies of the at least some data units from one or more computing devices in the pool other than the first computing device.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
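The two components named in claim 11 could be separated along the following lines. This is a rough sketch, not the patented design: the class names, the `register`/`lookup`/`decide` methods, and the policy shape (a mapping from device name to rank, lower being preferred) are hypothetical, and signatures are treated as opaque strings. The sketch also assumes the agent always prefers another device when one holds the unit, which is stricter than the claim's "at least some data units".

```python
from dataclasses import dataclass, field


@dataclass
class GlobalSignatureRepository:
    """Signature -> set of devices whose storage holds the matching data unit."""
    holders: dict[str, set[str]] = field(default_factory=dict)

    def register(self, signature: str, device: str) -> None:
        self.holders.setdefault(signature, set()).add(device)

    def lookup(self, signature: str) -> set[str]:
        return self.holders.get(signature, set())


@dataclass
class RepositoryAgent:
    repository: GlobalSignatureRepository
    priority: dict[str, int]  # lower rank = preferred; hypothetical policy shape

    def decide(self, data_set_signatures, first_device):
        """Choose a source device for every signature in the data set."""
        decisions = {}
        for sig in data_set_signatures:
            # Devices other than the first device that hold this unit.
            others = self.repository.lookup(sig) - {first_device}
            if others:
                # Unranked devices sort last; the name breaks ties deterministically.
                decisions[sig] = min(
                    others,
                    key=lambda d: (self.priority.get(d, float("inf")), d),
                )
            else:
                decisions[sig] = first_device
        return decisions


repo = GlobalSignatureRepository()
for sig, dev in [("s1", "laptop"), ("s1", "server"), ("s2", "laptop"),
                 ("s3", "laptop"), ("s3", "server"), ("s3", "desktop")]:
    repo.register(sig, dev)

agent = RepositoryAgent(repo, priority={"server": 0, "desktop": 1})
decisions = agent.decide(["s1", "s2", "s3"], first_device="laptop")
print(decisions)  # {'s1': 'server', 's2': 'laptop', 's3': 'server'}
```

Keeping the repository as plain data and putting the comparison and policy logic in the agent mirrors the claim's split between a stored structure and a component "executing in one or more processors".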
Specification