Cluster storage collection based data management

  • US 7,346,734 B2
  • Filed: 05/25/2005
  • Issued: 03/18/2008
  • Est. Priority Date: 05/25/2005
  • Status: Expired due to Fees
First Claim

1. In a distributed system for storing data across a network to multiple data storage nodes, a method comprising:

  • determining a bounded bandwidth available for data repair in the distributed system;

    creating a specific number of stripes on each data storage node of the multiple data storage nodes, the stripes for placement and replication of data objects across respective ones of the data storage nodes, the specific number of stripes on each data storage node being a function of the bounded bandwidth; and

    wherein the creating further comprises:

    (a) calculating a target chunk size based on a replication degree such that data reliability of storing data across the multiple data storage nodes using a random data placement scheme is optimized, the calculating comprising analyzing data reliability of the distributed system in view of the bounded bandwidth by estimating a mean time to data loss (MTTDL) for an object (MTTDLobj) of multiple objects as a function of a harmonic sum of MTTDL in each state i of multiple states i, the distributed system comprising the multiple objects, each state i representing a state of the distributed system when i data storage nodes fail and lost replicas on the i data storage nodes have not been repaired; and

    (b) allocating disk storage space on each node as a function of the target chunk size.
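
Step (a) of the claim describes the per-object MTTDL as a function of a harmonic sum of the MTTDL in each state i. The claim does not give the formula, so the following is only an illustrative sketch under one assumed reading: the per-state values are combined as the reciprocal of the sum of their reciprocals. The function name `harmonic_combine` and the sample values are hypothetical and do not come from the patent.

```python
from typing import Sequence


def harmonic_combine(mttdl_per_state: Sequence[float]) -> float:
    """Combine per-state MTTDL values as 1 / sum(1 / MTTDL_i).

    Assumption: this is one plausible reading of "harmonic sum";
    the claim text itself does not spell out the formula.
    """
    if not mttdl_per_state:
        raise ValueError("need at least one state")
    if any(m <= 0 for m in mttdl_per_state):
        raise ValueError("MTTDL values must be positive")
    return 1.0 / sum(1.0 / m for m in mttdl_per_state)


# Made-up per-state MTTDL values (in hours) for states i = 0, 1, 2,
# where state i means i nodes have failed and are not yet repaired.
example_states = [1000.0, 500.0, 250.0]
print(harmonic_combine(example_states))
```

Under this reading the combined value is always smaller than the smallest per-state MTTDL, so the state most prone to data loss dominates the estimate.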
