Method and apparatus for automated migration of data among storage centers
First Claim
1. An apparatus for controlling a storage of data among a plurality of regional storage centers operatively coupled through a network in a global storage system, the apparatus comprising:
memory; and
at least one processor connected with the memory, the at least one processor being configured:
(i) to define at least one dataset comprising at least a subset of the data stored in the global storage system;
(ii) to define at least one ruleset for determining where to store the at least one dataset;
(iii) to track at least one attribute relating to at least one data requesting entity operating in the global storage system, the at least one data requesting entity requesting the at least one dataset; and
(iv) to determine, as a function of said at least one ruleset and said at least one attribute relating to the at least one data requesting entity, information regarding a location for storing the at least one dataset among the plurality of regional storage centers having available resources that reduces at least one of total distance traversed by the at least one dataset in serving the at least one data requesting entity and latency of delivery of the at least one dataset to the at least one data requesting entity;
wherein the at least one processor is further configured to determine whether a migration of the at least one dataset is worthwhile, and, when it is determined that migration is not worthwhile, continue to obtain information regarding the demand for the at least one dataset;
wherein determining whether a migration of the at least one dataset is worthwhile comprises determining a new location, $\hat{L}$, for the dataset, among the plurality of regional storage centers having available resources, which satisfies an expression
$a_1 \cdot d(l_1, \hat{L}) + \dots + a_n \cdot d(l_n, \hat{L}) + b \cdot d(L', \hat{L}) \le a_1 \cdot d(l_1, L) + \dots + a_n \cdot d(l_n, L) + b \cdot d(L', L)$
for every other possible location $L$ of the plurality of regional storage centers, where $d(l_j, l_k)$ is a network distance function indicative of a distance between any two locations $l_j$ and $l_k$, $a_1, \dots, a_n$ represent amounts of data of respective data transfers, $l_1, \dots, l_n$ represent locations from where usage of the at least one dataset occurs, $b$ represents a size of the at least one dataset, and $L'$ represents a location of a given one of the plurality of regional storage centers in which the at least one dataset resides prior to migration of the dataset.
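The selection rule in this claim can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the function names (`make_distance`, `migration_cost`, `best_location`), the symmetric lookup-table distance, and the toy center names and numbers are all hypothetical. The cost of a candidate location is the sum of each transfer amount times its distance to the candidate, plus the dataset size times the distance from the current location.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Sequence, Tuple

Distance = Callable[[str, str], float]


def make_distance(table: Dict[FrozenSet[str], float]) -> Distance:
    """Build a symmetric distance function from a lookup table (assumed form of d)."""
    def distance(a: str, b: str) -> float:
        if a == b:
            return 0.0  # assumption: a location is at distance zero from itself
        return table[frozenset((a, b))]
    return distance


def migration_cost(candidate: str,
                   usage: Sequence[Tuple[str, float]],
                   dataset_size: float,
                   current: str,
                   distance: Distance) -> float:
    """Serving cost (amount times distance per transfer) plus one-off moving cost."""
    serving = sum(amount * distance(origin, candidate) for origin, amount in usage)
    moving = dataset_size * distance(current, candidate)
    return serving + moving


def best_location(candidates: Iterable[str],
                  usage: Sequence[Tuple[str, float]],
                  dataset_size: float,
                  current: str,
                  distance: Distance) -> str:
    """Return a candidate whose cost is no greater than that of any other candidate."""
    return min(candidates,
               key=lambda c: migration_cost(c, usage, dataset_size, current, distance))


# Toy data (hypothetical): three centers and pairwise network distances.
d = make_distance({
    frozenset(("us", "eu")): 7.0,
    frozenset(("us", "asia")): 10.0,
    frozenset(("eu", "asia")): 9.0,
})
centers = ["us", "eu", "asia"]
usage = [("eu", 50.0), ("eu", 30.0), ("asia", 10.0)]  # (usage location, amount of data)

chosen = best_location(centers, usage, dataset_size=20.0, current="us", distance=d)
stay = best_location(centers, [("eu", 5.0)], dataset_size=1000.0, current="us", distance=d)
```

With these toy numbers, heavy usage from "eu" outweighs the cost of moving a small dataset, so `chosen` is "eu"; with a large dataset and light usage, `stay` remains "us". Reading an unchanged result as "migration is not worthwhile" is one possible interpretation, not something the claim text spells out.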
Abstract
A method for controlling the storage of data among multiple regional storage centers coupled through a network in a global storage system is provided. The method includes steps of: defining at least one dataset comprising at least a subset of the data stored in the global storage system; defining at least one ruleset for determining where to store the dataset; obtaining information regarding a demand for the dataset through one or more data requesting entities operating in the global storage system; and determining, as a function of the ruleset, information regarding a location for storing the dataset among regional storage centers having available resources that reduces the total distance traversed by the dataset in serving at least a given one of the data requesting entities and/or reduces the latency of delivery of the dataset to the given one of the data requesting entities.
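The placement step described in the abstract can be sketched as follows. This is a minimal sketch under assumptions the source does not fix: a ruleset is modeled as a predicate over centers, "available resources" as free capacity, and demand as a mapping from requester location to amount of data requested. The `Center` class, `eligible_centers`, and `choose_center` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Center:
    name: str
    free_capacity: float  # hypothetical unit, e.g. GB


Ruleset = Callable[[Center], bool]      # assumption: a ruleset is a predicate
Distance = Callable[[str, str], float]  # assumption: network distance between locations


def eligible_centers(centers: List[Center], dataset_size: float,
                     ruleset: Ruleset) -> List[Center]:
    """Keep centers that have room for the dataset and are permitted by the ruleset."""
    return [c for c in centers if c.free_capacity >= dataset_size and ruleset(c)]


def choose_center(centers: List[Center], dataset_size: float, ruleset: Ruleset,
                  demand: Dict[str, float], distance: Distance) -> Optional[Center]:
    """Pick the eligible center with the smallest demand-weighted total distance."""
    candidates = eligible_centers(centers, dataset_size, ruleset)
    if not candidates:
        return None  # nothing satisfies both the ruleset and the capacity constraint
    return min(candidates,
               key=lambda c: sum(amount * distance(origin, c.name)
                                 for origin, amount in demand.items()))


# Toy data (hypothetical): distances between three regions.
_table = {("us", "eu"): 7.0, ("us", "asia"): 10.0, ("eu", "asia"): 9.0}


def toy_distance(a: str, b: str) -> float:
    if a == b:
        return 0.0
    return _table.get((a, b), _table.get((b, a)))


all_centers = [Center("us", 100.0), Center("eu", 5.0), Center("asia", 100.0)]
demand = {"eu": 50.0}

# "eu" is closest to the demand but lacks capacity, so the next-best center is chosen.
unrestricted = choose_center(all_centers, 10.0, lambda c: True, demand, toy_distance)
# A rule excluding "us" leaves "asia" as the only eligible center.
restricted = choose_center(all_centers, 10.0, lambda c: c.name != "us", demand, toy_distance)
```

Filtering before minimizing keeps the rule and capacity checks separate from the distance computation, so either can change without touching the other.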
10 Claims
1. An apparatus for controlling a storage of data among a plurality of regional storage centers operatively coupled through a network in a global storage system, the apparatus comprising:
memory; and
at least one processor connected with the memory, the at least one processor being configured:
(i) to define at least one dataset comprising at least a subset of the data stored in the global storage system;
(ii) to define at least one ruleset for determining where to store the at least one dataset;
(iii) to track at least one attribute relating to at least one data requesting entity operating in the global storage system, the at least one data requesting entity requesting the at least one dataset; and
(iv) to determine, as a function of said at least one ruleset and said at least one attribute relating to the at least one data requesting entity, information regarding a location for storing the at least one dataset among the plurality of regional storage centers having available resources that reduces at least one of total distance traversed by the at least one dataset in serving the at least one data requesting entity and latency of delivery of the at least one dataset to the at least one data requesting entity;
wherein the at least one processor is further configured to determine whether a migration of the at least one dataset is worthwhile, and, when it is determined that migration is not worthwhile, continue to obtain information regarding the demand for the at least one dataset;
wherein determining whether a migration of the at least one dataset is worthwhile comprises determining a new location, $\hat{L}$, for the dataset, among the plurality of regional storage centers having available resources, which satisfies an expression
$a_1 \cdot d(l_1, \hat{L}) + \dots + a_n \cdot d(l_n, \hat{L}) + b \cdot d(L', \hat{L}) \le a_1 \cdot d(l_1, L) + \dots + a_n \cdot d(l_n, L) + b \cdot d(L', L)$
for every other possible location $L$ of the plurality of regional storage centers, where $d(l_j, l_k)$ is a network distance function indicative of a distance between any two locations $l_j$ and $l_k$, $a_1, \dots, a_n$ represent amounts of data of respective data transfers, $l_1, \dots, l_n$ represent locations from where usage of the at least one dataset occurs, $b$ represents a size of the at least one dataset, and $L'$ represents a location of a given one of the plurality of regional storage centers in which the at least one dataset resides prior to migration of the dataset.
Dependent claims: 2, 3, 4, 5
6. A non-transitory machine-accessible storage medium for controlling a storing of data among a plurality of regional storage centers operatively coupled through a network in a global storage system, the storage medium comprising:
computer readable program code configured to define at least one dataset comprising at least a subset of the data stored in the global storage system;
computer readable program code configured to define at least one ruleset for determining where to store the at least one dataset;
computer readable program code configured to obtain information regarding a demand for the at least one dataset through one or more data requesting entities operating in the global storage system; and
computer readable program code configured to determine, as a function of the at least one ruleset, information regarding a location for storing the at least one dataset among a plurality of regional storage centers having available resources that reduces at least one of (i) total distance traversed by the at least one dataset in serving at least one of the one or more data requesting entities and (ii) latency of delivery of the at least one dataset to the at least one of the one or more data requesting entities;
wherein the storage medium further comprises computer readable program code configured to determine whether a migration of the at least one dataset is worthwhile, and, when it is determined that migration is not worthwhile, to continue to obtain information regarding the demand for the at least one dataset;
wherein determining whether a migration of the at least one dataset is worthwhile comprises determining a new location, $\hat{L}$, for the dataset, among the plurality of regional storage centers having available resources, which satisfies an expression
$a_1 \cdot d(l_1, \hat{L}) + \dots + a_n \cdot d(l_n, \hat{L}) + b \cdot d(L', \hat{L}) \le a_1 \cdot d(l_1, L) + \dots + a_n \cdot d(l_n, L) + b \cdot d(L', L)$
for every other possible location $L$ of the plurality of regional storage centers, where $d(l_j, l_k)$ is a network distance function indicative of a distance between any two locations $l_j$ and $l_k$, $a_1, \dots, a_n$ represent amounts of data of respective data transfers, $l_1, \dots, l_n$ represent locations from where usage of the at least one dataset occurs, $b$ represents a size of the at least one dataset, and $L'$ represents a location of a given one of the plurality of regional storage centers in which the at least one dataset resides prior to migration of the dataset.
Dependent claims: 7, 8, 9, 10
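Because the inequality recited in claims 1 and 6 must hold against every other possible location, it can be restated compactly as a minimization over the candidate centers. The restatement below is offered only as a reading aid. The set symbol $\mathcal{C}$ (regional storage centers having available resources), the cost symbol $C(L)$, and the property $d(L', L') = 0$ are assumptions introduced here and are not recited in the claims.

```latex
% Cost of holding the dataset at candidate location L:
% serving cost for the n observed transfers plus the one-off cost of moving b units from L'.
C(L) = \sum_{j=1}^{n} a_j \, d(l_j, L) + b \, d(L', L)

% The claimed new location is a minimizer of this cost over the candidate set:
\hat{L} \in \operatorname*{arg\,min}_{L \in \mathcal{C}} C(L)

% Assuming d(L', L') = 0, staying put costs only the serving term:
C(L') = \sum_{j=1}^{n} a_j \, d(l_j, L')
```

Under these assumptions, one way to read the test is that moving pays off only when some center lowers the serving term by more than the moving term $b \, d(L', L)$ adds, and that $\hat{L} = L'$ corresponds to leaving the dataset where it is.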
Specification