
Data comparison system

  • US 9,122,732 B2
  • Filed: 07/20/2010
  • Issued: 09/01/2015
  • Est. Priority Date: 08/06/2009
  • Status: Active Grant
First Claim

1. A computer-implemented method for performing a tolerance based comparison between a legacy data store and a new data store, the method comprising:

  • receiving, by a processor from a device of a user, a compare data structure comprising a plurality of data item pairs, each data item pair identifying a legacy data item of a legacy dataset and a new data item of a new dataset, wherein each data item pair comprises a data type of a table data type, a flat structure data type, a deep structure data type, or a field data type, and wherein the table data type, the flat structure data type and the deep structure data type each comprise a plurality of records;

    receiving, by the processor from the device of the user, a plurality of tolerances, each tolerance being associated with one of the data item pairs and indicative of an acceptable difference between the data item pair according to the data type of the data item pair;

    recursively comparing, by the processor, each data item pair of the plurality of data item pairs wherein recursively comparing comprises determining the data type of each data item pair, and;

    when the data item pair is determined to be the table data type, calling a compare subroutine for each record in each table of the data item pair to form new data item pairs to compare;

    when the data item pair is determined to be the flat structure data type or the deep structure data type, calling the compare subroutine for each record in the flat structure data or the deep structure of the data item pair to form new data item pairs to compare;

    when the data item pair is determined to not be one of the table data type, the flat structure data type, the deep structure data type, or the field data type, writing a log entry indicating that the data item pair is an unknown data type;

    determining, by the processor, that each of one or more of the plurality of data item pairs being compared comprises the field data type;

    identifying a subset of the plurality of data item pairs comprising the one or more of the plurality of data item pairs determined to be of the field data type; and

    for each of the one or more of the plurality of data item pairs determined to be of the subset of the plurality of data item pairs;

    checking, by the processor, each legacy data item in relation to each new data item of each data item pair in accordance with the associated tolerance; and

    assigning, by the processor, a category among a plurality of categories for each data item pair determined to be of the subset based on the difference of each data item pair within the tolerance associated with each data item pair, wherein the plurality of categories comprises an exact match category, a within tolerance category, and an outside of tolerance category;

    transforming, by the processor, a result of the checking and assigning into a report, wherein the report describes a percentage of the data item pairs assigned the exact match category, a percentage of the data item pairs assigned the within tolerance category, and a percentage of the data item pairs assigned the outside of tolerance category; and

    providing, by the processor to the device of the user, the report.
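
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation; it only shows one way the recursion, categorization, and percentage report could fit together. All names (`compare`, `build_report`, `run`) are hypothetical. The sketch assumes a table is a Python list of records, a flat or deep structure is a dict, a field is a number, and every sub-pair inherits the tolerance of the top-level pair it came from.

```python
from collections import Counter

# Category labels taken from the claim language.
EXACT = "exact match"
WITHIN = "within tolerance"
OUTSIDE = "outside of tolerance"


def compare(legacy, new, tolerance, results, log, path):
    """Recursively compare one data item pair (hypothetical helper)."""
    if isinstance(legacy, list) and isinstance(new, list):
        # Assumed "table" type: each record pair becomes a new data item pair.
        # zip() silently ignores extra records; a real tool would report them.
        for i, (old_rec, new_rec) in enumerate(zip(legacy, new)):
            compare(old_rec, new_rec, tolerance, results, log, f"{path}[{i}]")
    elif isinstance(legacy, dict) and isinstance(new, dict):
        # Assumed "flat/deep structure" type: recurse into shared components.
        for key in sorted(legacy.keys() & new.keys()):
            compare(legacy[key], new[key], tolerance, results, log, f"{path}.{key}")
    elif isinstance(legacy, (int, float)) and isinstance(new, (int, float)):
        # Assumed "field" type: check the difference against the tolerance.
        diff = abs(legacy - new)
        if diff == 0:
            results.append(EXACT)
        elif diff <= tolerance:
            results.append(WITHIN)
        else:
            results.append(OUTSIDE)
    else:
        # Anything else is logged as an unknown data type.
        log.append(f"{path}: unknown data type")


def build_report(results):
    """Return the percentage of field pairs in each category."""
    counts = Counter(results)
    total = len(results)
    return {
        cat: (100.0 * counts[cat] / total if total else 0.0)
        for cat in (EXACT, WITHIN, OUTSIDE)
    }


def run(compare_structure):
    """compare_structure: list of (legacy item, new item, tolerance) tuples."""
    results, log = [], []
    for i, (legacy, new, tolerance) in enumerate(compare_structure):
        compare(legacy, new, tolerance, results, log, f"pair{i}")
    return build_report(results), log


# Usage: one table pair with a tolerance of 0.5, plus one pair of unknown type.
pairs = [
    (
        [{"amount": 100.0, "qty": 5}, {"amount": 250.0, "qty": 1}],
        [{"amount": 100.0, "qty": 5}, {"amount": 250.4, "qty": 3}],
        0.5,
    ),
    (None, None, 0.0),
]
report, log = run(pairs)
print(report)  # 50% exact match, 25% within tolerance, 25% outside of tolerance
print(log)     # ['pair1: unknown data type']
```

In this sketch only field-level pairs contribute to the percentages, which mirrors the claim's restriction of checking and category assignment to the subset of pairs determined to be of the field data type.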
