System and method for providing data driven de-duplication services

  • US 8,447,741 B2
  • Filed: 09/08/2010
  • Issued: 05/21/2013
  • Est. Priority Date: 01/25/2010
  • Status: Active Grant
First Claim

1. A computer implemented method of identifying reference data likely to match target data, the method comprising:

    reading a reference set of summaries of data included in a reference data set, each member of the reference set of summaries including a plurality of summaries that indicate particular patterns of the reference data within the reference data set;

    comparing the reference set of summaries to a target set of summaries associated with at least one target area of a plurality of target areas, each member of the target set of summaries including a plurality of summaries that indicate particular patterns of the target data included in the at least one target area, the plurality of target areas being included in a target data set;

    associating the at least one target area with the reference data set when a threshold number of members of the target set of summaries associated with the at least one target area match members of the reference set of summaries, wherein the reference data set includes a plurality of reference areas, each reference area of the plurality of reference areas being associated with at least one member of the reference set of summaries; and

    selecting at least one reference area of the plurality of reference areas based on a number of members of the target set of summaries associated with the at least one target area that match members of the reference set of summaries associated with the at least one reference area.
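
The steps recited in claim 1 can be illustrated with a minimal sketch. It is one possible reading of the claim language, not the patented implementation. The claim does not say how summaries are computed, so they are treated here as opaque hashable values (for example, hashes of data patterns). The function name `match_target_area`, the dictionary layout, the tie-breaking behavior, and the choice to count distinct matching summaries are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional, Set, Tuple


def match_target_area(
    target_summaries: Set[Hashable],
    reference_areas: Dict[str, Set[Hashable]],
    threshold: int,
) -> Optional[Tuple[str, int]]:
    """Hypothetical sketch of the claimed steps for a single target area.

    reference_areas maps a reference area id to the summaries associated
    with that area; together they form the reference set of summaries.
    """
    if not reference_areas:
        return None

    # Reading step: the reference set of summaries for the whole data set.
    reference_set = set().union(*reference_areas.values())

    # Comparing step: target summaries that also occur in the reference set.
    matched = target_summaries & reference_set

    # Associating step: only proceed when a threshold number of summaries match.
    if len(matched) < threshold:
        return None

    # Selecting step: pick the reference area sharing the most summaries
    # with the target area (assumption: the first area wins a tie).
    counts = {area: len(matched & s) for area, s in reference_areas.items()}
    best = max(counts, key=counts.get)
    return best, counts[best]


# Illustrative data only; the summary values are placeholders.
reference_areas = {"ref_a": {"s1", "s2", "s3"}, "ref_b": {"s3", "s4"}}
hit = match_target_area({"s1", "s2", "s9"}, reference_areas, threshold=2)
miss = match_target_area({"s4", "s9"}, reference_areas, threshold=2)
```

In this sketch, `hit` identifies `ref_a` because two of the target summaries match summaries associated with that area, while `miss` is `None` because only one summary matches and the threshold of two is not met.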
