
Data synchronization and consistency across distributed repositories

  • US 8,260,742 B2
  • Filed: 04/03/2009
  • Issued: 09/04/2012
  • Est. Priority Date: 04/03/2009
  • Status: Expired due to Fees
First Claim

1. A method comprising:

  • retrieving a first hash value of a first root node in a first tree structure associated with a first secondary repository, wherein the first secondary repository comprises a first machine-readable storage medium encoded with the first tree structure, wherein the first root node is associated with a first plurality of data units and comprises a first plurality of hash values, wherein the first plurality of hash values were computed from the first plurality of data units, wherein the first hash value is computed based on the first plurality of hash values and a second plurality of hash values of child nodes of the first root node;

    retrieving a second hash value of a second root node in a second tree structure associated with a primary repository, wherein the primary repository comprises a second machine-readable storage medium encoded with the second tree structure, wherein the second root node is associated with a second plurality of data units and comprises a third plurality of hash values computed from the second plurality of data units, wherein the second hash value is computed based on the third plurality of hash values and a fourth plurality of hash values of child nodes of the second root node;

    comparing the first hash value and the second hash value;

    determining that the first hash value is not equal to the second hash value;

    comparing the first plurality of hash values against respective ones of the third plurality of hash values;

    for each hash value of the first plurality of hash values that does not match a respective one of the third plurality of hash values, modifying a data unit of the first plurality of data units, which was used to compute the hash value of the first plurality of hash values, in the first secondary repository to be synchronized with a data unit of the second plurality of data units, which was used to compute the respective one of the third plurality of hash values, of the primary repository;

    retrieving a first of the second plurality of hash values of a first child node of the first root node in the first tree structure and a first of the fourth plurality of hash values of a second child node of the second root node in the second tree structure, wherein the first child node is associated with a third plurality of data units and the second plurality of hash values were computed from at least the third plurality of data units, and the second child node is associated with a fourth plurality of data units and the fourth plurality of hash values were computed from at least the fourth plurality of data units;

    comparing the first of the second plurality of hash values and the first of the fourth plurality of hash values;

    determining whether the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;

    comparing a fifth plurality of hash values computed from the third plurality of data units with respective ones of a sixth plurality of hash values computed from the fourth plurality of data units to detect inconsistencies responsive to determining that the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;

    synchronizing the third plurality of data units to be consistent with the fourth plurality of data units in accordance with inconsistencies detected from said comparing the fifth plurality of hash values with respective ones of the sixth plurality of hash values; and

    indicating that the primary repository and the first secondary repository are synchronized after determining others of the second plurality of hash values are equal to respective ones of the fourth plurality of hash values.
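
The steps recited in claim 1 resemble a top-down comparison of two hash trees, and can be sketched roughly as follows. This is only an illustrative reading, not the patented implementation: the names `Node`, `unit_hashes`, `node_hash`, and `sync` are hypothetical, SHA-256 is an assumed hash function (the claim does not name one), and both trees are assumed to have the same shape and the same number of data units per node.

```python
import hashlib
from dataclasses import dataclass, field


def h(data: bytes) -> str:
    """Hash helper; SHA-256 is an assumption, the claim names no algorithm."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Node:
    """Hypothetical tree node holding its own data units and child nodes."""
    units: list[bytes]
    children: list["Node"] = field(default_factory=list)

    def unit_hashes(self) -> list[str]:
        # One hash per data unit held directly by this node.
        return [h(u) for u in self.units]

    def node_hash(self) -> str:
        # Node hash covers this node's unit hashes and its children's hashes.
        parts = self.unit_hashes() + [c.node_hash() for c in self.children]
        return h("".join(parts).encode())


def sync(secondary: Node, primary: Node) -> int:
    """Make `secondary` match `primary`; return the number of units repaired.

    Assumes both trees have identical shape (an illustrative simplification).
    """
    # Equal node hashes mean the whole subtree already matches.
    if secondary.node_hash() == primary.node_hash():
        return 0

    repaired = 0
    # Compare per-unit hashes and overwrite only the units that differ.
    pairs = zip(secondary.unit_hashes(), primary.unit_hashes())
    for i, (s_hash, p_hash) in enumerate(pairs):
        if s_hash != p_hash:
            secondary.units[i] = primary.units[i]
            repaired += 1

    # Descend into child pairs; matching children return immediately.
    for s_child, p_child in zip(secondary.children, primary.children):
        repaired += sync(s_child, p_child)
    return repaired


# Example: one stale unit at the root and one in the first child.
primary = Node([b"a", b"b"], [Node([b"c", b"d"]), Node([b"e"])])
secondary = Node([b"a", b"X"], [Node([b"c", b"Y"]), Node([b"e"])])
repaired = sync(secondary, primary)
in_sync = secondary.node_hash() == primary.node_hash()
```

In this sketch the final equality check on the root hashes plays the role of the claim's "indicating" step. Recomputing every hash on each call keeps the example short; a practical system would presumably store the hashes in the nodes, as the claim describes, rather than rederive them.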
