
Managing an archive for approximate string matching

  • US 9,563,721 B2
  • Filed: 07/07/2014
  • Issued: 02/07/2017
  • Est. Priority Date: 01/16/2008
  • Status: Active Grant
First Claim

1. A method for managing an archive for determining approximate matches associated with strings occurring in records, the method including:

  • processing records to determine a set of string representations that correspond to strings occurring in the records;

    for each of at least some of the string representations in the set, generating a plurality of close representations for that string representation, wherein each of the plurality of close representations is generated from at least some of the same characters in at least one of the strings occurring in at least one of the records processed to determine that string representation;

    comparing first close representations that are each generated from at least some characters in at least a first one of the strings occurring in at least one of the records processed to determine a first one of the string representations to second close representations that are each generated from at least some characters in at least a second one of the strings occurring in at least one of the records processed to determine a second one of the string representations, wherein the first close representations are for the first one of the string representations, and wherein the second close representations are for the second one of the string representations;

    identifying which one of the first close representations that are each generated from at least some characters in at least the first one of the strings occurring in at least one of the records processed to determine the first one of the string representations corresponds to which one of the second close representations that are each generated from at least some characters in at least the second one of the strings occurring in at least one of the records processed to determine the second one of the string representations; and

    based on identified correspondences between close representations, storing entries in an archive that each represent a potential approximate match between at least two strings based on their respective close representations.
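
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not fix how close representations are generated, so this example assumes one possible scheme: each close representation is the string itself or the string with a single character deleted, and two close representations "correspond" when they are identical. The function names (`close_representations`, `build_archive`), the whitespace tokenization, and the dictionary used as the archive are all hypothetical choices made for illustration, not the patented implementation.

```python
from collections import defaultdict
from itertools import combinations


def close_representations(s):
    """Return the string plus every variant with one character deleted.

    Assumption: single-character deletion is only one way to build
    representations "from at least some of the same characters".
    """
    variants = {s}
    for i in range(len(s)):
        variants.add(s[:i] + s[i + 1:])
    return variants


def build_archive(records):
    """Map each pair of potentially matching strings to their shared close forms."""
    # Step 1: process records into a set of string representations
    # (assumed here: lowercased, whitespace-separated words).
    representations = {word.lower() for record in records for word in record.split()}

    # Step 2: generate close representations and index them by value.
    index = defaultdict(set)
    for rep in representations:
        for close in close_representations(rep):
            index[close].add(rep)

    # Steps 3 and 4: compare close representations across strings; an
    # identical close representation is treated as a correspondence.
    # Step 5: store one archive entry per potential approximate match.
    archive = defaultdict(set)
    for close, reps in index.items():
        for a, b in combinations(sorted(reps), 2):
            archive[(a, b)].add(close)
    return dict(archive)


if __name__ == "__main__":
    records = ["John Smith", "Jon Smyth", "Jane Smithe"]
    for pair, shared in sorted(build_archive(records).items()):
        print(pair, sorted(shared))
```

Under these assumptions, "smith" and "smyth" are archived as a potential match because deleting one character from each yields the same close representation, while "jane" and "john" share none and produce no entry. Indexing by close representation avoids comparing every pair of strings directly.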

  • 4 Assignments