Method and system for implementing parallel transformations of records
Abstract
An improved approach is described for implementing transformations of data records in high concurrency environments. Each transformation is performed in parallel at the source when the data record is first generated. According to one approach for data integrity validation, record generators compute an integrity checksum for a newly generated record before copying into a data unit in shared memory. Subsequent generators may aggregate integrity checksums for data records into checksums for data units incrementally. This approach achieves end-to-end protection of data records against corruption using an efficient method of maintaining verifiable data integrity. In another approach, compression and encryption data transformations may be performed by themselves, or in combination with an integrity checksum transformation.
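The source-side checksum and incremental aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `DataUnit` class, the `generate_record` function, the use of CRC-32 as the integrity checksum, the XOR fold as the aggregation step, and zlib compression as the companion transformation are all hypothetical choices made for this example.

```python
import threading
import zlib


class DataUnit:
    """Hypothetical stand-in for a data unit in shared memory."""

    def __init__(self):
        self.records = []      # transformed records, in arrival order
        self.aggregate = 0     # running checksum for the whole data unit
        self._lock = threading.Lock()

    def append(self, record: bytes, checksum: int) -> None:
        # Only the copy and the cheap XOR fold happen under the lock;
        # the checksum itself was computed earlier by the generator.
        with self._lock:
            self.records.append(record)
            self.aggregate ^= checksum

    def verify(self) -> bool:
        # Recompute every record checksum and compare with the aggregate.
        expected = 0
        for record in self.records:
            expected ^= zlib.crc32(record)
        return expected == self.aggregate


def generate_record(unit: DataUnit, payload: bytes) -> None:
    # Transformations run at the source, before the copy into the data unit.
    transformed = zlib.compress(payload)   # assumed compression step
    checksum = zlib.crc32(transformed)     # assumed integrity checksum
    unit.append(transformed, checksum)


unit = DataUnit()
generators = [
    threading.Thread(target=generate_record, args=(unit, f"record-{i}".encode()))
    for i in range(4)
]
for g in generators:
    g.start()
for g in generators:
    g.join()
```

XOR is used here only because it is order-independent, which lets generators fold in their checksums in any arrival order. It has a known weakness: two identical records cancel each other out, so a production design would likely choose a stronger combining function.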
42 Claims
1. A computer implemented method for transforming data records, comprising:
- using at least one processor that executes one or more processing entities and is configured or programmed for performing a process, the process comprising;
identifying a data unit;
generating a first record and a second record to place into the data unit;
generating multiple checksums in parallel for the first record and the second record, wherein after generating the first and second records but before respective placement of the first and second records in the data unit;
generating a first checksum from the first record, and a second checksum from the second record, in which the first and second checksums are generated in parallel for the first and second records to be stored within the data unit;
storing the first and second checksums in a holding data structure configured to hold a plurality of checksums;
persistently storing the first record and the second record in the data unit; and
combining the multiple checksums into a combined checksum, wherein the combined checksum corresponds to an aggregate checksum for the data unit that comprises at least the first checksum and the second checksum from the holding data structure, and wherein the first record and the second record are persistently stored in the data unit with the aggregate checksum for the data unit.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A system for transforming data records, comprising:
- at least one processor that is at least to;
identify a data unit;
generate a first record and a second record to place into the data unit;
generating multiple checksums in parallel for the first record and the second record, wherein after the first and second records are generated but before respective placement of the first and second records in the data unit;
generate a first checksum from the first record, and a second checksum from the second record, in which the first and second checksums are generated in parallel for the first and second records to be stored within the data unit;
store the first and second checksums in a holding data structure configured to hold a plurality of checksums;
persistently store the first record and the second record in the data unit; and
combining the multiple checksums into a combined checksum, wherein the combined checksum corresponds to an aggregate checksum for the data unit that comprises at least the first checksum and the second checksum from the holding data structure, and wherein the first record and the second record are persistently stored in the data unit with the aggregate checksum for the data unit.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
30. A computer program product that includes a non-transitory computer readable storage medium, the non-transitory computer readable storage medium comprising a plurality of computer instructions which, when executed by at least one processor, cause the at least one processor to perform a method for transforming data records, the method comprising:
- using the at least one processor that is configured or programmed to implement a process, the process comprising;
identifying a data unit;
generating a first record and a second record to place into the data unit;
generating multiple checksums in parallel for the first record and the second record, wherein after generating the first and second records but before respective placement of the first and second records in the data unit;
generating a first checksum from the first record, and a second checksum from the second record, in which the first and second checksums are generated in parallel for the first and second records to be stored within the data unit;
storing the first and second checksums in a holding data structure configured to hold a plurality of checksums;
persistently storing the first record and the second record in the data unit; and
combining the multiple checksums into a combined checksum, wherein the combined checksum corresponds to an aggregate checksum for the data unit that comprises at least the first checksum and the second checksum from the holding data structure, and wherein the first record and the second record are persistently stored in the data unit with the aggregate checksum for the data unit.
Dependent claims: 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42
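The sequence of steps recited in the independent claims can be sketched in a few lines. This is an illustrative reading under stated assumptions, not the claimed implementation: the function names `combine`, `write_data_unit`, and `verify_data_unit` are hypothetical, CRC-32 is an assumed checksum algorithm, a Python list stands in for the holding data structure, a dictionary stands in for the persistently stored data unit, and worker threads stand in for the processing entities.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def combine(checksums) -> int:
    # Assumed combining rule: feed each 32-bit record checksum, in record
    # order, through CRC-32 again to obtain one aggregate value.
    aggregate = 0
    for checksum in checksums:
        aggregate = zlib.crc32(checksum.to_bytes(4, "big"), aggregate)
    return aggregate


def write_data_unit(first_record: bytes, second_record: bytes) -> dict:
    records = [first_record, second_record]
    # Generate both checksums in parallel, after the records exist but
    # before they are placed in the data unit.
    with ThreadPoolExecutor(max_workers=2) as pool:
        holding = list(pool.map(zlib.crc32, records))  # holding data structure
    # Place the records in the data unit together with the aggregate
    # checksum built from the held per-record checksums.
    return {"records": records, "aggregate": combine(holding)}


def verify_data_unit(unit: dict) -> bool:
    # Recompute the per-record checksums and compare aggregates.
    recomputed = combine(zlib.crc32(record) for record in unit["records"])
    return recomputed == unit["aggregate"]


unit = write_data_unit(b"first record", b"second record")
```

Keeping the per-record checksums in a holding structure until the aggregate is built means the records themselves do not need to be read again when the data unit's checksum is computed.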
Specification