Verifying data consistency
First Claim
1. A computer system for verifying data consistency between update-in-place data structures and append-only data structures containing change histories associated with the update-in-place data structures, comprising one or more computer devices each having one or more processors and one or more tangible storage devices;
and a program embodied on at least one of the one or more storage devices, the program having a plurality of program instructions for execution by the one or more processors, the program instructions comprising instructions for:
loading data, in parallel, from a first update-in-place data structure to a first set of hash buckets and from an append-only data structure to a second set of hash buckets in a processing platform;
performing, in parallel, a bucket-level comparison between the data in the first set of hash buckets and the data in the second set of hash buckets;
identifying transient differences between the first update-in-place data structure and the append-only data structure during the bucket level comparison, wherein the transient differences comprise differences caused by one or more in-flight transactions and one or more rollback transactions committed at the first update-in-place data structure after loading the data from the first update-in-place data structure to the first set of hash buckets in the processing platform;
generating an initial report based on the bucket level comparison, wherein the initial report comprises the identified transient differences between the first update-in-place data structure and the append-only data structure;
removing from the initial report the identified transient differences between the first update-in-place data structure and the append-only data structure, wherein the removing comprises a row-by-row re-fetch from the first update-in-place data structure in an isolation level higher than a cursor stable isolation level; and
generating a final report based on the initial report and removal of the identified transient differences, wherein the final report comprises persistent differences between the first update-in-place data structure and the append-only data structure and omits the identified transient differences removed from the initial report, wherein the final report is generated for live comparison of the first update-in-place data structure and the append-only data structure, and wherein the differences are inserted into a second update-in-place data structure that is associated with the first update-in-place data structure.
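As an illustration only, the removal of transient differences recited above can be sketched in Python. The sketch assumes the initial report maps each row key to a pair of rows (the row seen in the update-in-place structure at load time, and the row derived from the append-only structure). The names `remove_transient_differences` and `refetch_committed` are hypothetical and do not come from the claim; `refetch_committed` stands in for a per-row re-read at an isolation level above cursor stability, which this sketch does not model.

```python
from typing import Callable, Dict, Hashable, Optional, Tuple

Row = Optional[Tuple]   # None stands for "row not present"
Diff = Tuple[Row, Row]  # (row seen in the source, row derived from the history)


def remove_transient_differences(
    initial_report: Dict[Hashable, Diff],
    refetch_committed: Callable[[Hashable], Row],
) -> Dict[Hashable, Diff]:
    """Return only the differences that survive a row-by-row re-fetch."""
    final_report: Dict[Hashable, Diff] = {}
    for key, (_stale_source_row, history_row) in initial_report.items():
        # Hypothetical callable: re-reads one row from the update-in-place
        # structure. The isolation level itself is not modeled here.
        current_row = refetch_committed(key)
        if current_row != history_row:
            # Still different after the re-fetch: treat as persistent.
            final_report[key] = (current_row, history_row)
    return final_report


# Toy data: key 1 was caught mid-transaction, key 2 really differs.
committed_table = {1: ("alice", 100), 2: ("bob", 250)}
initial = {
    1: (("alice", 999), ("alice", 100)),
    2: (("bob", 250), ("bob", 200)),
}
final = remove_transient_differences(initial, committed_table.get)
```

In this toy run, key 1 drops out because the re-fetched row matches the history, while key 2 remains in the final report as a persistent difference.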
Abstract
A method for verifying data consistency between update-in-place data structures and append-only data structures containing change histories associated with the update-in-place data structures is provided. The method includes loading data from an update-in-place data structure to a first set of hash buckets in a processing platform, loading data from append-only data structures to a second set of hash buckets in the processing platform, performing a bucket-level comparison between the data in the first set of hash buckets and the data in the second set of hash buckets, and generating a report based on the bucket-level comparison.
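A minimal Python sketch of the bucket loading and bucket-level comparison follows, under stated assumptions. It assumes the append-only change history has already been reduced to the latest row per key, that both sides share one bucketing function (CRC-32 of the key, chosen here only because it is deterministic), and that a thread pool stands in for the processing platform. All function names and `NUM_BUCKETS` are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

NUM_BUCKETS = 4  # illustrative value, not taken from the source

Bucket = Dict[str, Tuple]


def load_into_buckets(rows: Dict[str, Tuple]) -> List[Bucket]:
    """Distribute rows over hash buckets by a deterministic hash of the key."""
    buckets: List[Bucket] = [{} for _ in range(NUM_BUCKETS)]
    for key, row in rows.items():
        index = zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS
        buckets[index][key] = row
    return buckets


def compare_bucket(pair: Tuple[Bucket, Bucket]) -> List[str]:
    """Return keys whose rows differ, or exist on one side only, in one bucket pair."""
    source, history = pair
    all_keys = source.keys() | history.keys()
    return [k for k in all_keys if source.get(k) != history.get(k)]


def compare(source_rows: Dict[str, Tuple], history_rows: Dict[str, Tuple]) -> List[str]:
    """Load both sides concurrently, then compare matching buckets concurrently."""
    with ThreadPoolExecutor() as pool:
        src_future = pool.submit(load_into_buckets, source_rows)
        hist_future = pool.submit(load_into_buckets, history_rows)
        src_buckets = src_future.result()
        hist_buckets = hist_future.result()
        per_bucket = pool.map(compare_bucket, zip(src_buckets, hist_buckets))
        return sorted(key for diffs in per_bucket for key in diffs)


source = {"a": (1,), "b": (2,), "c": (3,)}
history = {"a": (1,), "b": (20,), "d": (4,)}
report = compare(source, history)
```

Because the same key always lands in the same bucket index on both sides, each bucket pair can be compared independently of the others, which is what makes the per-bucket comparison easy to parallelize.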
Specification