Data synchronization and consistency across distributed repositories
First Claim
1. A method comprising:
- retrieving a first hash value of a first root node in a first tree structure associated with a first secondary repository, wherein the first secondary repository comprises a first machine-readable storage medium encoded with the first tree structure, wherein the first root node is associated with a first plurality of data units and comprises a first plurality of hash values, wherein the first plurality of hash values were computed from the first plurality of data units, wherein the first hash value is computed based on the first plurality of hash values and a second plurality of hash values of child nodes of the first root node;
retrieving a second hash value of a second root node in a second tree structure associated with a primary repository, wherein the primary repository comprises a second machine-readable storage medium encoded with the second tree structure, wherein the second root node is associated with a second plurality of data units and comprises a third plurality of hash values computed from the second plurality of data units, wherein the second hash value is computed based on the third plurality of hash values and a fourth plurality of hash values of child nodes of the second root node;
comparing the first hash value and the second hash value;
determining that the first hash value is not equal to the second hash value;
comparing the first plurality of hash values against respective ones of the third plurality of hash values;
for each hash value of the first plurality of hash values that does not match a respective one of the third plurality of hash values, modifying a data unit of the first plurality of data units, which was used to compute the hash value of the first plurality of hash values, in the first secondary repository to be synchronized with a data unit of the second plurality of data units, which was used to compute the respective one of the third plurality of hash values, of the primary repository;
retrieving a first of the second plurality of hash values of a first child node of the first root node in the first tree structure and a first of the fourth plurality of hash values of a second child node of the second root node in the second tree structure, wherein the first child node is associated with a third plurality of data units and the second plurality of hash values were computed from at least the third plurality of data units, and the second child node is associated with a fourth plurality of data units and the fourth plurality of hash values were computed from at least the fourth plurality of data units;
comparing the first of the second plurality of hash values and the first of the fourth plurality of hash values;
determining whether the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;
comparing a fifth plurality of hash values computed from the third plurality of data units with respective ones of a sixth plurality of hash values computed from the fourth plurality of data units to detect inconsistencies responsive to determining that the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;
synchronizing the third plurality of data units to be consistent with the fourth plurality of data units in accordance with inconsistencies detected from said comparing the fifth plurality of hash values with respective ones of the sixth plurality of hash values; and
indicating that the primary repository and the first secondary repository are synchronized after determining others of the second plurality of hash values are equal to respective ones of the fourth plurality of hash values.
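The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `Node`, `digest`, and `synchronize` are hypothetical, SHA-256 is an assumed hash function, hashes are recomputed on demand rather than retrieved from stored tree structures, and both trees are assumed to have the same shape and the same number of data units per node.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def digest(data: bytes) -> str:
    # SHA-256 is an assumption; the claim does not name a hash function.
    return hashlib.sha256(data).hexdigest()


@dataclass
class Node:
    """Hypothetical tree node: its own data units plus child nodes."""
    data_units: List[bytes] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def unit_hashes(self) -> List[str]:
        # One hash per data unit associated with this node.
        return [digest(u) for u in self.data_units]

    def node_hash(self) -> str:
        # Node hash is based on the node's unit hashes and its children's hashes.
        parts = self.unit_hashes() + [c.node_hash() for c in self.children]
        return digest("".join(parts).encode("ascii"))


def synchronize(secondary: Node, primary: Node) -> int:
    """Make `secondary` match `primary`; return how many data units were rewritten."""
    if secondary.node_hash() == primary.node_hash():
        return 0  # hashes match, so this subtree is already consistent
    changed = 0
    # Compare per-unit hashes and overwrite only the units that differ.
    pairs = zip(secondary.unit_hashes(), primary.unit_hashes())
    for i, (s_hash, p_hash) in enumerate(pairs):
        if s_hash != p_hash:
            secondary.data_units[i] = primary.data_units[i]
            changed += 1
    # Descend into child pairs; matching children return immediately.
    for s_child, p_child in zip(secondary.children, primary.children):
        changed += synchronize(s_child, p_child)
    return changed


primary = Node([b"a", b"b"], [Node([b"c", b"d"]), Node([b"e"])])
secondary = Node([b"a", b"X"], [Node([b"c", b"Y"]), Node([b"e"])])
rewritten = synchronize(secondary, primary)
in_sync = secondary.node_hash() == primary.node_hash()
```

In this sketch, a child whose hash already equals its counterpart is skipped without comparing its data units, which is what keeps the comparison cheap when only a small part of the repository has changed.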
Abstract
Data associated with the services in a service oriented architecture are stored in a primary repository and replicated across secondary repositories. Functionality can be implemented to efficiently synchronize data across the primary repository and the secondary repositories. Data synchronization can comprise calculating and comparing hash values of one or more nodes, based in part on concatenated hash values of child nodes and data that comprise the one or more nodes, of a tree structure representing data stored in the repositories.
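As a rough illustration of the hash calculation the abstract describes, the sketch below derives a node's hash from the concatenated hashes of its data units and of its child nodes. The function names `unit_hash` and `node_hash`, the choice of SHA-256, and the sample byte strings are assumptions made for illustration and do not come from the patent.

```python
import hashlib
from typing import List


def unit_hash(data_unit: bytes) -> str:
    # Hash of a single data unit (SHA-256 is an assumed choice).
    return hashlib.sha256(data_unit).hexdigest()


def node_hash(data_units: List[bytes], child_hashes: List[str]) -> str:
    # Concatenate the node's own data-unit hashes with its children's hashes,
    # then hash the concatenation to obtain the node's hash.
    concatenated = "".join([unit_hash(u) for u in data_units] + child_hashes)
    return hashlib.sha256(concatenated.encode("ascii")).hexdigest()


# A change in a leaf's data propagates up to the root hash.
leaf = node_hash([b"service-A endpoint"], [])
root = node_hash([b"registry metadata"], [leaf])

changed_leaf = node_hash([b"service-A endpoint v2"], [])
changed_root = node_hash([b"registry metadata"], [changed_leaf])
```

Because each node's hash covers its children's hashes, comparing two root hashes is enough to tell whether anything beneath them differs.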
17 Claims
11. A computer program product comprising a non-transitory computer readable storage medium having computer readable code embodied therewith, said computer readable code configured to:
retrieve a first hash value of a first root node in a first tree structure associated with a first secondary repository, wherein the first root node is associated with a first plurality of data units and comprises a first plurality of hash values, wherein the first plurality of hash values were computed from the first plurality of data units, wherein the first hash value is computed based on the first plurality of hash values and a second plurality of hash values of child nodes of the first root node;
retrieve a second hash value of a second root node in a second tree structure associated with a primary repository, wherein the second root node is associated with a second plurality of data units and comprises a third plurality of hash values computed from the second plurality of data units, wherein the second hash value is computed based on the third plurality of hash values and a fourth plurality of hash values of child nodes of the second root node;
compare the first hash value and the second hash value;
determine that the first hash value is not equal to the second hash value;
compare the first plurality of hash values against respective ones of the third plurality of hash values;
for each hash value of the first plurality of hash values that does not match a respective one of the third plurality of hash values, cause modification of a data unit of the first plurality of data units, which was used to compute the hash value of the first plurality of hash values, in the first secondary repository to be synchronized with a data unit of the second plurality of data units, which was used to compute the respective one of the third plurality of hash values, of the primary repository;
retrieve a first of the second plurality of hash values of a first child node of the first root node in the first tree structure and a first of the fourth plurality of hash values of a second child node of the second root node in the second tree structure, wherein the first child node is associated with a third plurality of data units and the second plurality of hash values were computed from at least the third plurality of data units, and the second child node is associated with a fourth plurality of data units and the fourth plurality of hash values were computed from at least the fourth plurality of data units;
compare the first of the second plurality of hash values and the first of the fourth plurality of hash values;
determine whether the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;
compare a fifth plurality of hash values computed from the third plurality of data units with respective ones of a sixth plurality of hash values computed from the fourth plurality of data units to detect inconsistencies responsive to the computer readable code determining that the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;
cause synchronization of the third plurality of data units to be consistent with the fourth plurality of data units in accordance with inconsistencies detected from the computer readable code comparing the fifth plurality of hash values with respective ones of the sixth plurality of hash values; and
indicate that the primary repository and the first secondary repository are synchronized after determining others of the second plurality of hash values are equal to respective ones of the fourth plurality of hash values.
16. An apparatus comprising:
a processor unit;
a memory unit coupled to the processor unit; and
a synchronization unit configured to
retrieve a first hash value of a first root node in a first tree structure associated with a secondary repository, wherein the first root node is associated with a first plurality of data units and comprises a first plurality of hash values, wherein the first plurality of hash values were computed from the first plurality of data units, wherein the first hash value is computed based on the first plurality of hash values and a second plurality of hash values of child nodes of the first root node;
retrieve a second hash value of a second root node in a second tree structure associated with a primary repository, wherein the second root node is associated with a second plurality of data units and comprises a third plurality of hash values computed from the second plurality of data units, wherein the second hash value is computed based on the third plurality of hash values and a fourth plurality of hash values of child nodes of the second root node;
compare the first hash value and the second hash value;
determine that the first hash value is not equal to the second hash value;
compare the first plurality of hash values against respective ones of the third plurality of hash values;
for each hash value of the first plurality of hash values that does not match a respective one of the third plurality of hash values, modify a data unit of the first plurality of data units, which was used to compute the hash value of the first plurality of hash values, in the first secondary repository to be synchronized with a data unit of the second plurality of data units, which was used to compute the respective one of the third plurality of hash values, of the primary repository;
retrieve a first of the second plurality of hash values of a first child node of the first root node in the first tree structure and a first of the fourth plurality of hash values of a second child node of the second root node in the second tree structure, wherein the first child node is associated with a third plurality of data units and the second plurality of hash values were computed from at least the third plurality of data units, and the second child node is associated with a fourth plurality of data units and the fourth plurality of hash values were computed from at least the fourth plurality of data units;
compare the first of the second plurality of hash values and the first of the fourth plurality of hash values;
determine whether the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;
compare a fifth plurality of hash values computed from the third plurality of data units with respective ones of a sixth plurality of hash values computed from the fourth plurality of data units to detect inconsistencies responsive to determining that the first of the second plurality of hash values and the first of the fourth plurality of hash values are not equal;
synchronize the third plurality of data units to be consistent with the fourth plurality of data units in accordance with inconsistencies detected from said comparing the fifth plurality of hash values with respective ones of the sixth plurality of hash values; and
indicate that the primary repository and the first secondary repository are synchronized after determining others of the second plurality of hash values are equal to respective ones of the fourth plurality of hash values.
Specification