System and Method for Data Driven De-Duplication

  • US 20110184921A1
  • Filed: 09/08/2010
  • Published: 07/28/2011
  • Est. Priority Date: 01/25/2010
  • Status: Active Grant
First Claim

1. A computer implemented method of locating redundancy within data, the method comprising:

    recording target locations within target data where a summary that identifies a particular pattern within the target data equals a predetermined value;

    recording reference locations within reference data where a summary that identifies the particular pattern within the reference data equals the predetermined value;

    determining a reference set of summaries of the reference data, each member of the reference set of summaries including a plurality of summaries indicative of patterns of reference data located at recorded reference locations;

    determining a target set of summaries of the target data, each member of the target set of summaries including a plurality of summaries indicative of patterns of target data located at recorded target locations; and

    identifying a subset of the reference data that is likely to match a subset of the target data by comparing members of the reference set of summaries to members of the target set of summaries.
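
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is only one possible reading of the claim language, not the implementation described in the patent. Everything specific in it is an assumption made for illustration: the polynomial hash used as the locating summary, the truncated SHA-1 digest used as the pattern summary, the constants WINDOW, DIVISOR, PREDETERMINED, BLOCK and GROUP, and all function names.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the claim.
WINDOW = 4          # bytes covered by the locating summary
DIVISOR = 8         # locating summary is reduced modulo this value
PREDETERMINED = 0   # a location is recorded when the summary equals this
BLOCK = 16          # bytes summarized at each recorded location
GROUP = 2           # summaries per set member (a "plurality")


def locating_summary(window: bytes) -> int:
    """Summarize a short window; a simple polynomial hash stands in here."""
    value = 0
    for byte in window:
        value = (value * 31 + byte) % 1_000_003
    return value % DIVISOR


def record_locations(data: bytes) -> list:
    """Record offsets whose locating summary equals the predetermined value."""
    return [
        i for i in range(len(data) - WINDOW + 1)
        if locating_summary(data[i:i + WINDOW]) == PREDETERMINED
    ]


def pattern_summary(data: bytes, offset: int) -> str:
    """Summarize the data found at a recorded location (truncated SHA-1)."""
    return hashlib.sha1(data[offset:offset + BLOCK]).hexdigest()[:12]


def summary_set(data: bytes) -> dict:
    """Map each member (GROUP consecutive summaries) to its first offset."""
    locations = record_locations(data)
    summaries = [pattern_summary(data, loc) for loc in locations]
    members = {}
    for i in range(len(locations) - GROUP + 1):
        member = tuple(summaries[i:i + GROUP])
        members.setdefault(member, locations[i])
    return members


def likely_matches(reference: bytes, target: bytes) -> list:
    """Return (reference_offset, target_offset) pairs for shared members."""
    ref_set = summary_set(reference)
    tgt_set = summary_set(target)
    shared = ref_set.keys() & tgt_set.keys()
    return sorted((ref_set[m], tgt_set[m]) for m in shared)
```

Because locations are chosen by the content of the data rather than by fixed offsets, a region shared by the reference data and the target data yields the same recorded locations in both, even when the region starts at a different offset in each. A returned pair only marks data that is likely to match; a byte-by-byte comparison would still be needed to confirm the redundancy.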
