Dynamic placement of replica data
First Claim
1. A system that facilitates allocation of replicas among a set of storage nodes in a hybrid backup environment, the hybrid backup environment including one or more storage nodes located in a cloud storage location of a cloud backup environment, one or more storage nodes located in a peer-to-peer backup environment and one or more peers of the peer-to-peer backup environment, the system comprising:
- a processor coupled to a memory that retains computer-executable instructions, wherein the processor executes;
(A) a replication component that;
identifies properties of a portion of data,
evaluates both the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment to identify characteristics of the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment, wherein the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment are disjoint sets; and
wherein the one or more peers of the peer-to-peer backup environment consist of one or more devices and the cloud storage location is accessible to the one or more devices via a network, the cloud storage interacting with the one or more peers via the network, and
generates a replica requirement for the portion of data based at least in part on an analysis of (1) the identified properties of the portion of data and (2) the identified characteristics of both the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets;
wherein the identified properties include the size of the portion of data and at least one of;
compressibility of the portion of data and reparability of the portion of data; and
(B) a placement component that;
generates a placement policy based, at least in part on (1) the identified characteristics of the one or more storage nodes located in the cloud storage location of the cloud backup environment and the identified characteristics of the one or more storage nodes located in the peer-to-peer backup environment, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets, the identified characteristics including one or more of;
availability of a storage node, available storage capacity of the storage node, cost of storage corresponding to the storage node, cost of data transfer to or from the storage node, and network locality of the storage node relative to an origin node, and (2) user preferences, comprising (i) a weighting of;
each one of the identified characteristics of the one or more storage nodes located in the cloud storage location of the cloud backup environment and the identified characteristics of the one or more storage nodes located in the peer-to-peer backup environment used in generating the placement policy, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets, and (ii) an identification of a preferred storage node; and
distributes one or more replicas of the portion of data among both the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment of the hybrid backup environment, based on the replica requirement, the placement policy, the identified characteristics of the one or more storage nodes located in the cloud storage location of the cloud backup environment, and the identified characteristics of the one or more storage nodes located in the peer-to-peer backup environment, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets,
(C) an observation component that monitors the one or more storage nodes located in the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment to identify changes in the characteristics thereof.
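The placement component in element (B) can be pictured as a weighted scoring of node characteristics. The following is a minimal sketch only, not the claimed implementation: the names `StorageNode`, `score`, and `place_replicas`, the normalization of every characteristic to the range 0..1, the fixed bonus for a preferred node, and the rule of reserving one cloud node and one peer node when two or more replicas are required are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageNode:
    """Hypothetical record of the characteristics named in the claim (all normalized to 0..1)."""
    name: str
    kind: str             # "cloud" or "peer"; the two sets are disjoint
    availability: float   # higher is better
    free_capacity: float  # higher is better
    storage_cost: float   # lower is better
    transfer_cost: float  # lower is better
    locality: float       # closeness to the origin node, higher is better


def score(node, weights, preferred=None, bonus=0.5):
    """Weighted sum of characteristics; costs count against the node."""
    total = (weights["availability"] * node.availability
             + weights["capacity"] * node.free_capacity
             - weights["storage_cost"] * node.storage_cost
             - weights["transfer_cost"] * node.transfer_cost
             + weights["locality"] * node.locality)
    if node.name == preferred:
        total += bonus  # assumed way of honoring the user's preferred node
    return total


def place_replicas(nodes, replica_count, weights, preferred=None):
    """Pick nodes for the replicas, best score first."""
    ranked = sorted(nodes, key=lambda n: score(n, weights, preferred), reverse=True)
    chosen = []
    if replica_count >= 2:
        # Assumption: spread across both environments when more than one replica is needed.
        for kind in ("cloud", "peer"):
            best = next((n for n in ranked if n.kind == kind), None)
            if best is not None:
                chosen.append(best)
    for node in ranked:
        if len(chosen) >= replica_count:
            break
        if node not in chosen:
            chosen.append(node)
    return [n.name for n in chosen[:replica_count]]


nodes = [
    StorageNode("cloud-a", "cloud", 0.99, 0.9, 0.8, 0.6, 0.2),
    StorageNode("peer-1", "peer", 0.7, 0.5, 0.1, 0.1, 0.9),
    StorageNode("peer-2", "peer", 0.6, 0.3, 0.1, 0.1, 0.8),
]
weights = {"availability": 1.0, "capacity": 1.0, "storage_cost": 1.0,
           "transfer_cost": 1.0, "locality": 1.0}
print(place_replicas(nodes, 2, weights))
```

Changing the weights, or naming a preferred node, shifts which nodes rank highest, which is the role the claim gives to user preferences.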
Abstract
The claimed subject matter relates to systems and/or methodologies that facilitate distributed storage of data. A distributed file system can be implemented on storage nodes such that the system places multiple copies of data (e.g., replicas) on a variety of disparate storage nodes to guarantee availability of the data and minimize loss of the data. Storage nodes are dynamically evaluated to identify respective characteristics. In one example, the characteristics can include availability of a storage node, capacity of a storage node, data storage cost associated with a storage node, data transfer costs associated with a storage node, locality of a storage node, network topology, or user preferences associated with a storage node. The characteristics can be employed to generate optimal placements decisions.
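As an illustration of how node availability could feed into the number of copies kept, the sketch below estimates how many replicas are needed to reach a target availability. It assumes node failures are independent, so that a replica set is unavailable only when every node holding a copy is unavailable; this model, the function name `replicas_needed`, and the greedy use of the most available nodes first are assumptions for illustration and are not taken from the patent.

```python
def replicas_needed(availabilities, target):
    """Return the smallest number of nodes (most available first) whose
    combined availability meets the target, or None if it cannot be met.

    Assumes independent failures: P(all copies down) = product of (1 - a_i).
    """
    all_down = 1.0
    for count, a in enumerate(sorted(availabilities, reverse=True), start=1):
        all_down *= (1.0 - a)
        if 1.0 - all_down >= target:
            return count
    return None


# Three nodes that are each reachable 90% of the time.
print(replicas_needed([0.9, 0.9, 0.9], target=0.985))
```

Under this model, a single highly available cloud node may satisfy a target that would otherwise take several less reliable peers.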
11 Claims
11. A method for replicating data across one or more storage locations in a hybrid backup environment, the hybrid backup environment including one or more storage nodes located in a cloud storage location of a cloud backup environment, one or more storage nodes located in a peer-to-peer backup environment and one or more peers of the peer-to-peer backup environment, the method comprising:
- employing a processor executing computer-executable instructions stored on a computer-readable storage device that retains computer-executable instructions, to implement the following acts;
assigning characteristics to both the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets;
wherein the one or more peers of the peer-to-peer backup environment consist of one or more devices and the cloud storage location is accessible to the one or more devices via a network;
identifying properties of a portion of data, wherein the identified properties include the size of the portion of data and at least one of;
compressibility of the portion of data and reparability of the portion of data,
specifying a replica requirement for the portion of data based at least in part on (1) an analysis of the identified properties of the portion of data and (2) the assigned characteristics of both the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets;
generating a placement policy based, at least in part on (1) each one or the assigned characteristics of the one or more storage nodes located in the cloud storage location of the cloud backup environment and the identified characteristics of the one or more storage nodes located in the peer-to-peer backup environment, the one or more storage nodes located in the cloud storage location of the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets, and (2) a user preference, the assigned characteristics including one or more of;
availability of a storage node, available storage capacity of the storage node, cost of storage corresponding to the storage node, cost of data transfer to or from the storage node, and network locality of the storage node relative to an origin node, and
wherein the user preference comprises (i) a weighting of each one of the assigned characteristics of both the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment used in generating the placement policy, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets, and (ii) an identification of a preferred storage node; and
placing one or more replicas of the portion of data on both the one of the one or more storage nodes located in the cloud storage location of the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment of the hybrid backup environment, based on the replica requirement, the placement policy, the assigned characteristics of the one or more storage nodes located in the cloud storage location of the cloud backup environment, and the assigned characteristics of the one or more storage nodes located in the peer-to-peer backup environment, the one or more storage nodes located in the cloud storage location and the one or more storage nodes located in the peer-to-peer backup environment being disjoint sets, wherein the cloud storage interacts with the one or more peers of the peer-to-peer backup environment, via the network, and
monitoring the one or more storage nodes located in the cloud backup environment and the one or more storage nodes located in the peer-to-peer backup environment to detect changes in the characteristics thereof and assigning the changed characteristics thereto.
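The final act of the method, monitoring nodes and assigning changed characteristics, can be sketched as a comparison between the characteristics currently assigned to each node and a fresh observation. This is an illustrative sketch under stated assumptions: the function names `detect_changes` and `reassign`, the dictionary layout, and the tolerance below which a difference is ignored are hypothetical and do not come from the claim.

```python
def detect_changes(assigned, observed, tolerance=0.05):
    """Return, per node, the observed characteristics that differ from the
    assigned ones by more than the tolerance (or that are newly seen)."""
    changes = {}
    for node, current in observed.items():
        previous = assigned.get(node, {})
        diff = {key: value for key, value in current.items()
                if key not in previous or abs(value - previous[key]) > tolerance}
        if diff:
            changes[node] = diff
    return changes


def reassign(assigned, changes):
    """Return a new mapping with the changed characteristics applied;
    the input mapping is left untouched."""
    updated = {node: dict(chars) for node, chars in assigned.items()}
    for node, diff in changes.items():
        updated.setdefault(node, {}).update(diff)
    return updated


assigned = {
    "cloud-a": {"availability": 0.99, "free_capacity": 0.80},
    "peer-1": {"availability": 0.70, "free_capacity": 0.50},
}
observed = {
    "cloud-a": {"availability": 0.99, "free_capacity": 0.78},
    "peer-1": {"availability": 0.40, "free_capacity": 0.50},
}
changes = detect_changes(assigned, observed)
updated = reassign(assigned, changes)
print(changes)
```

In a fuller system the updated characteristics would be fed back into the replica requirement and placement policy, so that replicas can be moved when a node degrades.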
Specification