Cluster storage collection based data management

US 7,346,734 B2
Filed: 05/25/2005
Issued: 03/18/2008
Est. Priority Date: 05/25/2005
Status: Expired due to Fees

Abstract

Cluster storage collection-based data management is described. In one aspect, and in a distributed system for storing data across a network to multiple data storage nodes, a bounded bandwidth available for data repair in the distributed system is determined. A specific number of stripes are then created on each data storage node of the multiple data storage nodes. The stripes are for placement and replication of data objects across respective ones of the data storage nodes. The specific number of stripes created on each data storage node is a function of the determined bounded data repair bandwidth.

20 Claims

1. In a distributed system for storing data across a network to multiple data storage nodes, a method comprising:
- determining a bounded bandwidth available for data repair in the distributed system;
  
  creating a specific number of stripes on each data storage node of the multiple data storage nodes, the stripes for placement and replication of data objects across respective ones of the data storage nodes, the specific number of stripes on each data storage node being a function of the bounded bandwidth; and
  
  wherein the creating further comprises:
  
  (a) calculating a target chunk size based on a replication degree such that data reliability of storing data across the multiple data storage nodes using a random data placement scheme is optimized, the calculating comprising analyzing data reliability of the distributed system in view of the bounded bandwidth by estimating a mean time to data loss (MTTDL) for an object (MTTDL_obj) of multiple objects as a function of a harmonic sum of MTTDL in each state i of multiple states i, the distributed system comprising the multiple objects, each state i representing a state of the distributed system when i data storage nodes fail and lost replicas on the i data storage nodes have not been repaired; and
  
  (b) allocating disk storage space on each node as a function of the target chunk size.
  - 2. The method of claim 1, wherein the bounded bandwidth is a function of bandwidth of a network and bandwidth of a data storage node of the data storage nodes.
  - 3. The method of claim 1, wherein the target chunk size is based on bandwidth of a computer-readable medium for storing data on a data storage node of the data storage nodes, storage capacity of the computer-readable medium, and network bandwidth.
  - 4. The method of claim 1, wherein the analyzing further comprises estimating a mean time to repair (MTTR) all failed replicas in a state i (MTTR(i)) of the multiple states i, the MTTR(i) being a function of an amount of data to be repaired in the state i and available bandwidth for repair in the state i.
  - 5. The method of claim 4, wherein the amount of data to be repaired in the state i is based on an amount of data in a last failed computing device of the i data storage nodes, and an amount of un-repaired data left on the last failed computing device from a previous state i.
  - 6. The method of claim 1, wherein the method further comprises randomly placing a single chunk of collected data objects on a randomly selected data storage node, the single chunk having been generated to correspond to the target chunk size, the single chunk being a unit for placement and repair.
  - 7. The method of claim 1, wherein the method further comprises:
    - grouping two or more data objects that are smaller than a target chunk size into a unit for placement and repair, and wherein the target chunk size is further based on bounded network bandwidth and brick bandwidth such that reliability of storing data across the multiple data storage nodes using a random data placement scheme is optimized; and
      
      randomly placing the unit for placement and repair on a randomly selected data storage node.
  - 8. The method of claim 1, further comprising:
    - collecting multiple data objects from data storage requests;
      
      determining that collected ones of the multiple data objects meet a collective target chunk size criteria;
      
      responsive to the determining, grouping the multiple data objects into a single chunk;
      
      randomly selecting a data storage node and a corresponding stripe of the stripes for data placement; and
      
      storing the single chunk of collected data objects onto the stripe using a replication data storage scheme.
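
The collection and placement steps recited in claims 6 through 8 can be illustrated with a minimal Python sketch. Everything named here (`Chunk`, `ChunkCollector`, `place_chunk`, the `stripes_per_node` mapping) is hypothetical and not taken from the patent. The sketch also assumes that replicas go to distinct nodes and that the target is met once the buffered bytes reach the target chunk size; the claims do not spell out either detail.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A unit for placement and repair: one or more grouped data objects."""
    objects: list = field(default_factory=list)

    @property
    def size(self) -> int:
        # Total number of bytes across all grouped objects.
        return sum(len(o) for o in self.objects)


class ChunkCollector:
    """Buffers incoming objects until they reach the target chunk size."""

    def __init__(self, target_chunk_size: int):
        self.target_chunk_size = target_chunk_size
        self._pending = Chunk()

    def add(self, obj: bytes):
        """Add one object; return a full Chunk once the target is met, else None."""
        self._pending.objects.append(obj)
        if self._pending.size >= self.target_chunk_size:
            full, self._pending = self._pending, Chunk()
            return full
        return None


def place_chunk(stripes_per_node, replication_degree, rng=random):
    """Pick distinct nodes at random (an assumption) and a random stripe on each.

    stripes_per_node maps a node name to the number of stripes on that node.
    Returns a list of (node, stripe_index) pairs, one per replica.
    """
    nodes = rng.sample(sorted(stripes_per_node), replication_degree)
    return [(node, rng.randrange(stripes_per_node[node])) for node in nodes]
```

A caller would feed each stored object to `ChunkCollector.add` and, whenever a chunk comes back, call `place_chunk` to choose where its replicas go.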

9. In a distributed system for storing data across a network to multiple data storage nodes, a computing device comprising:
- a processor coupled to a memory, the memory comprising computer-program instructions executable by the processor for performing operations including:
  
  determining a bounded bandwidth available for data repair in the distributed system;
  
  creating a specific number of stripes on each data storage node of the multiple data storage nodes, the stripes for placement and replication of data objects across respective ones of the data storage nodes, the specific number of stripes on each data storage node being a function of the bounded bandwidth; and
  
  wherein the creating further comprises:
  
  (a) calculating a target chunk size based on a replication degree such that data reliability of storing data across the multiple data storage nodes using a random data placement scheme is optimized, the calculating comprising analyzing data reliability of the distributed system in view of the bounded bandwidth by estimating a mean time to data loss (MTTDL) for an object (MTTDL_obj) of multiple objects as a function of a harmonic sum of MTTDL in each state i of multiple states i, the distributed system comprising the multiple objects, each state i representing a state of the distributed system when i data storage nodes fail and lost replicas on the i data storage nodes have not been repaired; and
  
  (b) allocating disk storage space on each node as a function of the target chunk size.
  - 10. The computing device of claim 9, wherein the bounded bandwidth is a function of bandwidth of a network and bandwidth of a data storage node of the data storage nodes.
  - 11. The computing device of claim 9, wherein the target chunk size is based on bandwidth of a computer-readable medium for storing data on a data storage node of the data storage nodes, storage capacity of the computer-readable medium, and network bandwidth.
  - 12. The computing device of claim 9, wherein the analyzing further comprises estimating a mean time to repair (MTTR) all failed replicas in a state i (MTTR(i)) of the multiple states i, the MTTR(i) being a function of an amount of data to be repaired in the state i and available bandwidth for repair in the state i.
  - 13. The computing device of claim 12, wherein the amount of data to be repaired in the state i is based on an amount of data in a last failed computing device of the i data storage nodes, and an amount of un-repaired data left on the last failed computing device from a previous state i.
  - 14. The computing device of claim 9, wherein the computer-program instructions further comprise instructions for randomly placing a single chunk of collected data objects on a randomly selected data storage node, the single chunk having been generated to correspond to the target chunk size, the single chunk being a unit for placement and repair.
  - 15. The computing device of claim 9, wherein the computer-program instructions further comprise instructions for:
    - grouping two or more data objects that are smaller than a target chunk size into a unit for placement and repair, and wherein the target chunk size is further based on bounded network bandwidth and brick bandwidth such that reliability of storing data across the multiple data storage nodes using a random data placement scheme is optimized; and
      
      randomly placing the unit for placement and repair on a randomly selected data storage node.
  - 16. The computing device of claim 9, wherein the computer-program instructions further comprise instructions for:
    - collecting multiple data objects from data storage requests;
      
      determining that collected ones of the multiple data objects meet a collective target chunk size criteria;
      
      responsive to the determining, grouping the multiple data objects into a single chunk;
      
      randomly selecting a data storage node and a corresponding stripe of the stripes for data placement; and
      
      storing the single chunk of collected data objects onto the stripe using a replication data storage scheme.

17. In a distributed system for storing data across a network to multiple data storage nodes, one or more computer-readable media having encoded thereon computer-program instructions executable by a processor for performing operations comprising:
- determining a bounded bandwidth available for data repair in the distributed system;
  
  creating a specific number of stripes on each data storage node of the multiple data storage nodes, the stripes for placement and replication of data objects across respective ones of the data storage nodes, the specific number of stripes on each data storage node being a function of the bounded bandwidth; and
  
  wherein the creating further comprises:
  
  (a) calculating a target chunk size based on a replication degree such that data reliability of storing data across the multiple data storage nodes using a random data placement scheme is optimized, the calculating comprising analyzing data reliability of the distributed system in view of the bounded bandwidth by estimating a mean time to data loss (MTTDL) for an object (MTTDL_obj) of multiple objects as a function of a harmonic sum of MTTDL in each state i of multiple states i, the distributed system comprising the multiple objects, each state i representing a state of the distributed system when i data storage nodes fail and lost replicas on the i data storage nodes have not been repaired; and
  
  (b) allocating disk storage space on each node as a function of the target chunk size.
  - 18. The one or more computer-readable media of claim 17, wherein the computer-program instructions further comprise instructions for randomly placing a single chunk of collected data objects on a randomly selected data storage node, the single chunk having been generated to correspond to the target chunk size, the single chunk being a unit for placement and repair.
  - 19. The one or more computer-readable media of claim 17, wherein the analyzing further comprises estimating a mean time to repair (MTTR) all failed replicas in a state i (MTTR(i)) of the multiple states i, the MTTR(i) being a function of an amount of data to be repaired in the state i and available bandwidth for repair in the state i.
  - 20. The one or more computer-readable media of claim 19, wherein the amount of data to be repaired in the state i is based on an amount of data in a last failed computing device of the i data storage nodes, and an amount of un-repaired data left on the last failed computing device from a previous state i.
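
The reliability quantities named in the claims (MTTR(i) in claims 4, 5, 12, 13, 19 and 20, and the harmonic sum of per-state MTTDL in the independent claims) can be sketched as follows. This is only one possible reading. The claims say the amount of data to repair is "based on" two quantities and that MTTDL_obj is "a function of" a harmonic sum, without giving formulas, so the simple sum, the division by bandwidth, and the reciprocal-of-reciprocals combination below are assumptions. All function names are hypothetical.

```python
def data_to_repair(data_on_last_failed: float, unrepaired_from_previous: float) -> float:
    """Assumed reading of claim 5: data on the last failed node plus the
    un-repaired data carried over from the previous state."""
    return data_on_last_failed + unrepaired_from_previous


def mttr(amount_to_repair: float, repair_bandwidth: float) -> float:
    """Assumed reading of claim 4: repair time for state i is the amount of
    data to repair divided by the bandwidth available for repair."""
    if repair_bandwidth <= 0:
        raise ValueError("repair bandwidth must be positive")
    return amount_to_repair / repair_bandwidth


def harmonic_combine(mttdl_per_state):
    """Assumed form of the harmonic sum: 1 / sum(1 / MTTDL_i) over states i."""
    return 1.0 / sum(1.0 / m for m in mttdl_per_state)
```

With data in bytes and bandwidth in bytes per second, `mttr` returns seconds; `harmonic_combine` returns a value in whatever time unit its inputs share.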

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Zhang, Zheng; Chen, Wei; Lian, Qiao
Primary Examiner(s): Sparks, Donald
Assistant Examiner(s): GU, SHAWN X
Application Number: US11/137,754
Publication Number: US 20060271547A1
Time in Patent Office: 1,028 Days
Field of Search: None
US Class Current: 711/114
CPC Class Codes:
G06F 11/1662 the resynchronized componen...
G06F 16/1844 Management specifically ada...
