Managing an Archive for Approximate String Matching
Abstract
In one aspect, in general, a method is described for managing an archive for determining approximate matches associated with strings occurring in records. The method includes: processing records to determine a set of string representations that correspond to strings occurring in the records; generating, for each of at least some of the string representations in the set, a plurality of close representations that are each generated from at least some of the same characters in the string; and storing entries in the archive that each represent a potential approximate match between at least two strings based on their respective close representations.
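The three steps described above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes, purely for illustration, that strings are whitespace-separated words, that a "close representation" is the string itself or the string with one character deleted, and that an archive entry is a pair of strings sharing at least one close representation. The names `extract_strings`, `close_representations`, and `build_archive` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def extract_strings(records):
    """Collect the distinct whitespace-separated words occurring in the records."""
    return sorted({word for record in records for word in record.split()})


def close_representations(s):
    """Return s plus every variant formed by deleting one character (assumed scheme)."""
    return {s} | {s[:i] + s[i + 1:] for i in range(len(s))}


def build_archive(records):
    """Map each pair of strings sharing a close representation to the shared forms."""
    # Index every string under each of its close representations.
    by_close = defaultdict(set)
    for s in extract_strings(records):
        for c in close_representations(s):
            by_close[c].add(s)

    # Any two strings indexed under the same close representation
    # become one archive entry: a potential approximate match.
    archive = defaultdict(set)
    for c, strings in by_close.items():
        for a, b in combinations(sorted(strings), 2):
            archive[(a, b)].add(c)
    return dict(archive)


records = ["john smith", "jon smith", "john smyth"]
archive = build_archive(records)
for pair, shared in sorted(archive.items()):
    print(pair, sorted(shared))
```

Under these assumptions, "john" and "jon" are paired because deleting the "h" from "john" yields "jon", and "smith" and "smyth" are paired because both reduce to "smth". The entries are only potential matches; a later step would still have to decide which of them to accept.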
30 Claims
1. A method for managing an archive for determining approximate matches associated with strings occurring in records, the method including:
processing records to determine a set of string representations that correspond to strings occurring in the records;
generating, for each of at least some of the string representations in the set, a plurality of close representations that are each generated from at least some of the same characters in the string; and
storing entries in an archive that each represent a potential approximate match between at least two strings based on their respective close representations.
Dependent claims: 2-21, 25-30.
22. A computer program, stored on a computer-readable medium, for managing an archive for determining approximate matches associated with strings occurring in records, the computer program including instructions for causing a computer to:
process records to determine a set of string representations that correspond to strings occurring in the records;
generate, for each of at least some of the string representations in the set, a plurality of close representations that are each generated from at least some of the same characters in the string; and
store entries in an archive that each represent a potential approximate match between at least two strings based on their respective close representations.
23. A system for managing an archive for determining approximate matches associated with strings occurring in records, the system including:
means for processing records to determine a set of string representations that correspond to strings occurring in the records;
means for generating, for each of at least some of the string representations in the set, a plurality of close representations that are each generated from at least some of the same characters in the string; and
means for storing entries in an archive that each represent a potential approximate match between at least two strings based on their respective close representations.
24. A system for managing an archive for determining approximate matches associated with strings occurring in records, the system including:
a data source storing records;
a computer system configured to process the records in the data source to determine a set of string representations that correspond to strings occurring in the records, and generate, for each of at least some of the string representations in the set, a plurality of close representations that are each generated from at least some of the same characters in the string; and
a data store coupled to the computer system to store an archive including entries that each represent a potential approximate match between at least two strings based on their respective close representations.
Specification