Methods and apparatus for data lifecycle analysis
First Claim
1. A computer implemented method comprising:
providing a history of data lifecycles for a plurality of data sets protected via a backup store having one or more storage devices, wherein each data lifecycle represents a data path across one or more of the storage devices in time during which a data set has been copied or moved according to one of a first set of policies for backup maintained by the backup store, the data path including information recording the one or more storage devices and time the data set was copied or moved at each of the one or more storage devices;
determining, by a similarity analysis module, similarity measures among the data lifecycles, wherein the similarity measures indicate whether data path of the data lifecycles accessing the data sets are similar, including determining whether any of data paths of the lifecycles leads to a common target storage device;
consolidating, by a policy updating module, the first set of policies into a second set of policies based on the similarity measures of the data lifecycles, wherein a number of the second set of policies is smaller than a number of the first set of policies, wherein at least two or more of the first set of policies are combined into one single policy, wherein the two or more of the first set of policies are similar according to the similarity measures; and
in response to a client request to protect a particular data set, configuring the storage devices to back up the particular data set according to one of the second set of policies that is associated with the particular data set.
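As an illustration of the first two claim elements, the following is a minimal sketch of how a recorded data path and a similarity measure could be represented. The names Hop, Lifecycle, and similarity, the device names, and the choice of a Jaccard overlap gated on a common target device are all assumptions made for this example; the claim only requires that the measure indicate whether data paths are similar, including whether they lead to a common target storage device.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Hop:
    """One step of a data path (hypothetical record layout)."""
    device: str        # storage device the data set was copied or moved to
    hours: float       # time of the copy or move, in hours since first backup


@dataclass(frozen=True)
class Lifecycle:
    """History of one data set under one backup policy (hypothetical)."""
    data_set: str
    policy: str
    path: Tuple[Hop, ...]   # assumed non-empty and ordered by time

    @property
    def target(self) -> str:
        # The last device on the path is treated as the target device.
        return self.path[-1].device


def similarity(a: Lifecycle, b: Lifecycle) -> float:
    """Assumed measure: 0.0 unless both paths end at the same target,
    otherwise the Jaccard overlap of the devices the two paths visit."""
    if a.target != b.target:
        return 0.0
    devices_a = {hop.device for hop in a.path}
    devices_b = {hop.device for hop in b.path}
    return len(devices_a & devices_b) / len(devices_a | devices_b)


# Illustrative histories; device and policy names are invented.
mail = Lifecycle("mail", "P1", (Hop("disk-a", 0), Hop("dedup-1", 24), Hop("tape-x", 168)))
wiki = Lifecycle("wiki", "P2", (Hop("disk-b", 0), Hop("dedup-1", 24), Hop("tape-x", 168)))
logs = Lifecycle("logs", "P3", (Hop("disk-a", 0), Hop("cloud-1", 48)))

print(similarity(mail, wiki))  # shares target tape-x and two of four devices
print(similarity(mail, logs))  # different targets
```

This sketch ignores the recorded times when scoring; a fuller measure could also compare how long each data set stays on each device.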
Abstract
Methods and apparatuses to determine similarity among data lifecycles of data sets protected via a backup store having one or more storage devices are described. Each data set may be associated with one data lifecycle indicating a schedule to store one or more copies of the data set in the storage devices. The backup store can have one or more policies. Each data lifecycle may be specified in one of the policies. Two or more of the policies may be consolidated into one single policy specifying an updated data lifecycle. In one embodiment, the updated data lifecycle and the data lifecycles of the two or more policies may be similar according to the similarity determined. A particular one of the data sets may be associated with one of the data lifecycles of the two or more policies. The storage devices may be configured to back up the particular data set according to the updated data lifecycle of the one single policy.
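The consolidation described in the abstract can be sketched as a grouping problem. The example below is one possible reading, not the patented implementation: each policy is reduced to an ordered list of device names, two policies are treated as similar when they share a target device and their device sets overlap by at least a threshold, and similar policies are grouped with a small union-find. The function names, the threshold of 0.5, the "+"-joined policy names, and the choice to reuse the first member's lifecycle as the updated lifecycle are all assumptions.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def jaccard(a: List[str], b: List[str]) -> float:
    """Overlap of the devices visited by two lifecycles (paths assumed non-empty)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


def consolidate(
    policies: Dict[str, List[str]], threshold: float = 0.5
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Merge similar policies; return (second set of policies, old-to-new mapping)."""
    parent = {name: name for name in policies}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(policies, 2):
        same_target = policies[p][-1] == policies[q][-1]
        if same_target and jaccard(policies[p], policies[q]) >= threshold:
            parent[find(q)] = find(p)

    groups: Dict[str, List[str]] = {}
    for name in policies:
        groups.setdefault(find(name), []).append(name)

    merged: Dict[str, List[str]] = {}
    mapping: Dict[str, str] = {}
    for members in groups.values():
        new_name = "+".join(members)
        # Assumption: the updated lifecycle simply reuses the first member's path.
        merged[new_name] = list(policies[members[0]])
        for member in members:
            mapping[member] = new_name
    return merged, mapping


# Illustrative first set of policies; names and devices are invented.
first_set = {
    "P1": ["disk-a", "dedup-1", "tape-x"],
    "P2": ["disk-b", "dedup-1", "tape-x"],
    "P3": ["disk-a", "cloud-1"],
}
second_set, mapping = consolidate(first_set)
print(second_set)
print(mapping)
```

A later request to protect a data set formerly covered by P2 would then look up mapping["P2"] and configure the storage devices according to the merged policy.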
21 Claims
1. A computer implemented method comprising:
providing a history of data lifecycles for a plurality of data sets protected via a backup store having one or more storage devices, wherein each data lifecycle represents a data path across one or more of the storage devices in time during which a data set has been copied or moved according to one of a first set of policies for backup maintained by the backup store, the data path including information recording the one or more storage devices and time the data set was copied or moved at each of the one or more storage devices;
determining, by a similarity analysis module, similarity measures among the data lifecycles, wherein the similarity measures indicate whether data path of the data lifecycles accessing the data sets are similar, including determining whether any of data paths of the lifecycles leads to a common target storage device;
consolidating, by a policy updating module, the first set of policies into a second set of policies based on the similarity measures of the data lifecycles, wherein a number of the second set of policies is smaller than a number of the first set of policies, wherein at least two or more of the first set of policies are combined into one single policy, wherein the two or more of the first set of policies are similar according to the similarity measures; and
in response to a client request to protect a particular data set, configuring the storage devices to back up the particular data set according to one of the second set of policies that is associated with the particular data set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A non-transient machine readable storage medium having instructions therein, which when executed by a machine, cause the machine to perform operations, the operations comprising:
providing a history of data lifecycles for a plurality of data sets protected via a backup store having one or more storage devices, wherein each data lifecycle represents a data path across one or more of the storage devices in time during which a data set has been copied or moved according to one of a first set of policies for backup maintained by the backup store, the data path including information recording the one or more storage devices and time the data set was copied or moved at each of the one or more storage devices;
determining similarity measures among the data lifecycles, wherein the similarity measures indicate whether data paths of the data lifecycles accessing the data sets are similar, including determining whether any of data paths of the lifecycles leads to a common target storage device;
consolidating the first set of policies into a second set of policies based on the similarity measures, wherein a number of the second set of policies is smaller than a number of the first set of policies, wherein at least two or more of the policies are combined into one single policy, wherein the two or more of the first set of policies are similar according to the similarity measures; and
in response to a client request to protect a particular data set, configuring the storage devices to back up the particular data set according to one of the second set of policies that is associated with the particular data set.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A computer system comprising:
a memory storing executable instructions;
a network interface coupled to one or more storage devices;
a processor coupled to the memory and the network interface to execute the instructions from the memory, the processor being configured to
provide a history of data lifecycles for a plurality of data sets protected via a backup store having one or more storage devices, wherein each data lifecycle represents a data path across one or more of the storage devices in time during which a data set has been copied or moved according to one of a first set of policies for backup maintained by the backup store, the data path including information recording the one or more storage devices and time the data set was copied or moved at each of the one or more storage devices;
determine similarity measures among the data lifecycles, wherein the similarity measures indicate whether data paths of the data lifecycles accessing the data sets are similar, including determining whether any of data paths of the lifecycle leads to a common target storage device;
consolidate the first set of policies into a second set of policies based on the similarity measures, wherein a number of the second set of policies is smaller than a number of the first set of policies, wherein at least two or more of the policies are combined into one single policy, wherein the two or more of the first set of policies are similar according to the similarity measures, and
in response to a client request to protect a particular data set, configure the storage devices to back up the particular data set according to one of the second set of policies that is associated with the particular data set.
Dependent claims: 18, 19, 20, 21
Specification