
Efficient identification of entire row uniqueness in relational databases

  • US 8,984,301 B2
  • Filed: 06/19/2008
  • Issued: 03/17/2015
  • Est. Priority Date: 06/19/2008
  • Status: Active Grant
First Claim

1. A method for efficiently identifying uniqueness of rows of a relational database, the method comprising:

  • a processor creating a cryptographic sum for each row of one or more rows of a target table of the relational database, wherein the cryptographic sum for a particular row of the one or more rows of the target table is calculated by summing the contents of a selected subset of columns from among all columns in that particular row of the target table and assigning a unique checksum value based on the summed contents of the selected subset of columns in that particular row;

    receiving an incoming record;

    selecting a next row of a plurality of rows of the incoming record;

    the processor determining if the next row contains an incoming cryptographic sum, wherein the incoming cryptographic sum of the next row is calculated by summing contents of a selected subset of columns from among all columns in the next row of the incoming record and assigning a unique checksum value based on the summed contents of the selected subset of columns in the next row, wherein the selected subset of columns comprises a first column containing a medical record and a second column containing a social security number;

    in response to determining that the next row contains the incoming cryptographic sum;

    comparing the incoming cryptographic sum of the next row to the cryptographic sum of each row of the one or more rows of the target table;

    separating the cryptographic sum into a plurality of equally sized blocks; and

    the processor appending the plurality of equally sized blocks of the cryptographic sum of the one or more rows of the target table to a hidden column of the target table;

    in response to determining the incoming cryptographic sum of the next row is identical to at least one cryptographic sum of the one or more rows of the target table, the processor disregarding the next row when updating the target table;

    in response to determining the incoming cryptographic sum of the next row is not identical to at least one cryptographic sum of the one or more rows of the target table, determining if the next row contains an incoming record ID, wherein the incoming record ID is an identification value of the next row;

    in response to determining that the next row contains the incoming record ID, identifying the incoming record ID for the next row;

    comparing the incoming record ID of the next row with a record ID of the one or more rows of the target table; and

    in response to determining the incoming record ID is identical to at least one record ID of at least one row the one or more rows of the target table, the processor updating contents of the at least one row with contents of the next row;

    in response to determining the incoming record ID is not identical to at least one record ID of at least one row the one or more rows of the target table, and the incoming cryptographic sum of the next row is not identical to at least one cryptographic sum of the one or more rows of the target table, the processor adding the next row as a new row within the target table via a logical instruction; and

    in response to determining the next row does not contain the incoming record ID, the processor adding the next row as a new row within the target table via a logical instruction; and

    in response to determining that the next row does not contain the incoming cryptographic sum, the processor;

    calculating the incoming cryptographic sum for the next row;

    separating the incoming cryptographic sum into a plurality of equally sized blocks; and

    storing the plurality of equally sized blocks of the incoming cryptographic sum for the next row in a hidden column of the next row; and

    iteratively performing, until no additional rows remain in the plurality of rows of the incoming record, the functions of;

    determining if the next row contains an incoming cryptographic sum, comparing the incoming cryptographic sum of the next row to the cryptographic sum of each row of the one or more rows of the target table, and in response to determining the incoming cryptographic sum of the next row is identical to at least one cryptographic sum of the one or more rows of the target table, disregarding the next row when updating the target table.
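
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that do not come from the claim: SHA-256 stands in for the unspecified "cryptographic sum", rows are plain dictionaries, the hidden column is a key named `_checksum_blocks`, the record ID is a key named `record_id`, and blocks are 16 hex characters each. The function names `row_checksum`, `split_blocks`, and `merge_record` are hypothetical.

```python
import hashlib

HIDDEN = "_checksum_blocks"  # hypothetical name for the hidden column


def row_checksum(row, columns):
    """Digest the selected subset of columns (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    for col in columns:
        value = str(row.get(col, "")).encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        h.update(len(value).to_bytes(4, "big"))
        h.update(value)
    return h.hexdigest()


def split_blocks(checksum, block_size=16):
    """Separate the checksum into equally sized blocks."""
    return [checksum[i:i + block_size]
            for i in range(0, len(checksum), block_size)]


def merge_record(target, incoming, columns, id_key="record_id"):
    """Apply each incoming row to target: skip, update, or insert.

    Assumes every target row already carries HIDDEN and that no two
    target rows share a checksum.
    """
    known = {"".join(r[HIDDEN]) for r in target}
    by_id = {r[id_key]: r for r in target if r.get(id_key) is not None}
    stats = {"skipped": 0, "updated": 0, "inserted": 0}

    for row in incoming:
        row = dict(row)  # do not mutate the caller's data
        if HIDDEN not in row:
            # No incoming checksum: calculate it, split it, store it.
            row[HIDDEN] = split_blocks(row_checksum(row, columns))
        checksum = "".join(row[HIDDEN])

        if checksum in known:
            # Identical checksum already in the target: disregard the row.
            stats["skipped"] += 1
            continue

        rec_id = row.get(id_key)
        if rec_id is not None and rec_id in by_id:
            # Matching record ID: overwrite the existing row's contents.
            existing = by_id[rec_id]
            known.discard("".join(existing[HIDDEN]))
            existing.clear()
            existing.update(row)
            stats["updated"] += 1
        else:
            # No record ID, or an unknown one: add as a new row.
            target.append(row)
            if rec_id is not None:
                by_id[rec_id] = row
            stats["inserted"] += 1
        known.add(checksum)
    return stats
```

Keeping the known checksums in a set makes the duplicate test a constant-time lookup instead of a comparison against every target row; the claim itself only requires that the comparison happen, not how. A call might look like `merge_record(table, rows, ["medical_record", "ssn"])`, where the two column names mirror the subset named in the claim.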
