Fast database matching
First Claim
1. A method of identifying a possible match between a sample data record and any in a plurality of enrolled data records in a data base, each enrolled data record comprising a first plurality of data positions, the method comprising:
a) prior to initiating a process for matching a sample data record separate from the data base with one of the enrolled data records in the data base, without first associating the sample data record with one of the enrolled data records, designating a second plurality of reference positions among the first plurality of data positions in a first enrolled data record, some of which reference positions in said first enrolled data record are separated by other data positions in said first enrolled data record, each reference position corresponding to a location in said first enrolled data record at which a key value, useful as a characteristic feature for identifying said first enrolled data record, is positioned, there being a first key value at a first enrolled data record reference position and a second key value at a second enrolled data record reference position, the totality of said key values providing an identification for distinguishing said first enrolled data record from others in the plurality of enrolled data records;
b) providing, for at least said first enrolled data record, an enrollment mask comprising a series of enrollment mask data positions, each corresponding to one in the first plurality of data positions in said first enrolled data record, the enrollment mask including at least first and second enrollment mask reference positions corresponding to first and second enrolled data record reference positions, wherein the first key value is associated with said first enrollment mask reference position and the second key value is associated with said second enrollment mask reference position to match a sample record with said first enrolled data record;
c) for the sample data record, defining a sample mask comprising sample mask data positions, each corresponding to a data position in said first enrolled data record, including first and second sample mask reference positions corresponding to said first and second enrollment mask reference positions and corresponding to the first and second reference positions in the enrolled data record reference positions;
d) associating said first key value with said first sample mask reference position and associating said second key value with said second mask reference position to identify in the sample record presence of at least said first and second key values at positions corresponding to reference positions in said first enrolled data record that are associated with said first and second key values; and
e) applying the sample mask to the sample data record to determine whether the first and second key values are at positions in the sample data record corresponding to the first and second sample mask reference positions to identify a possible match between the sample data record and said first enrolled data record.
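Steps a) to e) of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: representing a record as a list of integers, representing a mask as a dictionary from reference position to key value, and the function names `build_enrollment_mask`, `build_sample_mask` and `apply_sample_mask` are all assumptions made for illustration.

```python
from typing import Dict, Sequence

# Assumed encoding: a mask maps each reference position to its key value.
Mask = Dict[int, int]


def build_enrollment_mask(record: Sequence[int],
                          reference_positions: Sequence[int]) -> Mask:
    """Steps a) and b): associate each reference position with its key value."""
    return {pos: record[pos] for pos in reference_positions}


def build_sample_mask(enrollment_mask: Mask) -> Mask:
    """Steps c) and d): the sample mask carries the same reference
    positions and the same associated key values."""
    return dict(enrollment_mask)


def apply_sample_mask(sample: Sequence[int], sample_mask: Mask) -> bool:
    """Step e): report a possible match when every key value is found
    at its reference position in the sample."""
    return all(
        0 <= pos < len(sample) and sample[pos] == key
        for pos, key in sample_mask.items()
    )


# Hypothetical data: reference positions 0, 3 and 6 are separated by
# other data positions, as the claim requires.
enrolled = [7, 0, 3, 9, 9, 4, 1, 8]
enrollment_mask = build_enrollment_mask(enrolled, [0, 3, 6])
sample_mask = build_sample_mask(enrollment_mask)

sample_a = [7, 5, 5, 9, 2, 2, 1, 0]  # key values present at 0, 3 and 6
sample_b = [7, 5, 5, 8, 2, 2, 1, 0]  # key value at position 3 differs

possible_a = apply_sample_mask(sample_a, sample_mask)
possible_b = apply_sample_mask(sample_b, sample_mask)
```

In this sketch only the reference positions are inspected, so the data positions between them do not affect the outcome; a possible match would still need the more detailed comparison described in the abstract.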
Abstract
A method of improving the speed with which a sample data record can be matched against records in a database comprises defining a list of possible key values (430), testing those key values against the sample and, for each record in the database, counting the number of key values that match both the record and the sample at reference positions selected by a mask. A list of possible matches is then selected on the basis of that count, for more detailed matching or analysis. Such a method provides very fast matching at the expense of some additional effort when registering a new record within the database.
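The counting scheme in the abstract can be sketched as follows, as a hedged illustration rather than the method as actually implemented. The encoding of a key as a `(position, value)` pair, the bit-set index, the `min_count` threshold and the names `key_hits`, `enroll` and `shortlist` are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed encoding: a key is (reference position, expected value).
Key = Tuple[int, int]


def key_hits(record: Sequence[int], keys: Sequence[Key]) -> int:
    """Return an integer bit set; bit i is set when keys[i] matches the record."""
    bits = 0
    for i, (pos, value) in enumerate(keys):
        if 0 <= pos < len(record) and record[pos] == value:
            bits |= 1 << i
    return bits


def enroll(records: Dict[str, Sequence[int]],
           keys: Sequence[Key]) -> Dict[str, int]:
    """Extra work done once at registration: precompute each record's bit set."""
    return {name: key_hits(rec, keys) for name, rec in records.items()}


def shortlist(sample: Sequence[int], index: Dict[str, int],
              keys: Sequence[Key], min_count: int) -> List[Tuple[str, int]]:
    """Count keys matching both sample and record; keep records at or above min_count."""
    sample_bits = key_hits(sample, keys)
    counts = {name: bin(bits & sample_bits).count("1")
              for name, bits in index.items()}
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(name, count) for name, count in ranked if count >= min_count]


# Hypothetical key list and database.
keys = [(0, 7), (3, 9), (6, 1), (2, 4)]
records = {
    "alice": [7, 0, 3, 9, 9, 4, 1, 8],
    "bob": [7, 1, 4, 2, 2, 2, 1, 0],
    "carol": [0, 0, 0, 0, 0, 0, 0, 0],
}
index = enroll(records, keys)

sample = [7, 5, 5, 9, 2, 2, 1, 0]
candidates = shortlist(sample, index, keys, min_count=2)
```

Precomputing the bit sets in `enroll` corresponds to the additional effort at registration mentioned in the abstract; at query time each record costs one bitwise AND and a bit count, and only the shortlisted records go on to detailed matching.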
18 Claims
1. A method of identifying a possible match between a sample data record and any in a plurality of enrolled data records in a data base, as set out in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A system for identifying possible matches between a sample data record and a plurality of enrolled data records, the system comprising:
a processor; and
memory storing instructions which, when executed by the processor, cause the processor to perform the steps of:
a) prior to initiating a process for matching a sample data record separate from the data base with one of the enrolled data records in the data base, without first associating the sample data record with one of the enrolled data records, designating a second plurality of reference positions among the first plurality of data positions in a first enrolled data record, some of which reference positions in said first enrolled data record are separated by other data positions in said first enrolled data record, each reference position corresponding to a location in said first enrolled data record at which a key value, useful as a characteristic feature for identifying said first enrolled data record, is positioned, there being a first key value at a first enrolled data record reference position and a second key value at a second enrolled data record reference position, the totality of said key values providing an identification for distinguishing said first enrolled data record from others in the plurality of enrolled data records;
b) providing, for at least said first enrolled data record, an enrollment mask comprising a series of enrollment mask data positions, each corresponding to one in the first plurality of data positions in said first enrolled data record, the enrollment mask including at least first and second enrollment mask reference positions corresponding to first and second enrolled data record reference positions, wherein the first key value is associated with said first enrollment mask reference position and the second key value is associated with said second enrollment mask reference position to match a sample record with said first enrolled data record;
c) for the sample data record, defining a sample mask comprising sample mask data positions, each corresponding to a data position in said first enrolled data record, including first and second sample mask reference positions corresponding to said first and second enrollment mask reference positions and corresponding to the first and second reference positions in the enrolled data record reference positions;
d) associating said first key value with said first sample mask reference position and associating said second key value with said second mask reference position to identify in the sample record presence of at least said first and second key values at positions corresponding to reference positions in said first enrolled data record that are associated with said first and second key values; and
e) applying the sample mask to the sample data record to determine whether the first and second key values are at positions in the sample data record corresponding to the first and second sample mask reference positions to identify a possible match between the sample data record and said first enrolled data record.
Dependent claims: 16, 17, 18
Specification