System and method for generating a target database from one or more source databases
Abstract
A software system and method is disclosed for creating and maintaining an audience member database based on multiple source audience member databases. Rankings of relative accuracy of data elements of each source database are maintained. To determine whether a given source database record represents the same audience member represented by a record of the target database, comparisons are made of various non-encoded and encoded fields of the respective databases to identify the most closely-matching target database record. If such a record is identified, only those fields from the source records that have a higher accuracy ranking than the fields in the target database are updated. The target database itself may be directly editable and its fields may be associated with accuracy rankings. Optionally, the user may select parameters to specify how closely candidate match records must match in order to achieve the optimal balance of speed vs. accuracy.
13 Claims
1. A software system comprising:
a. a plurality of source databases, each source database comprising:
1. a plurality of source audience member records, and
2. a plurality of source fields for each source audience member record;
b. a target database comprising:
1. a plurality of target audience member records, at least some of which identify the same audience member identified by at least one source database audience member record, and
2. a plurality of target fields for each target audience member record;
c. means for mapping the plurality of target database fields to corresponding source fields of the plurality of source databases, and, for each such mapping, means for ranking a relative priority of the mapping;
d. means for specifying a ranking for each of the plurality of target fields relative to the rankings of the field's mappings to the plurality of source databases;
e. means for selecting from at least one of the plurality of source databases, a source audience member record that matches a target audience member record;
f. means for updating the fields in the target matching record of the target database from multiple mapped fields in the plurality of source databases, including selecting, from among the source database fields to which the target database fields are mapped and the target database field itself, the highest ranked priority fields.
Dependent claims (2, 3, 4, 5):
each source audience member record and each target audience member record includes a primary key.
5. The system of claim 4, wherein,
each source audience member record includes a last updated time/date field.
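For illustration only, the ranked update recited in element f of claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the `Mapping` class, the function name `update_target_record`, the dictionary-based records, and the convention that a larger number means a higher priority are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One target-field-to-source-field mapping (hypothetical structure)."""
    source_db: str     # name of a source database
    source_field: str  # field name within that source database
    rank: int          # assumed convention: larger number = higher priority


def update_target_record(target, target_ranks, mappings, source_records):
    """Return a copy of `target` in which each mapped field holds the
    highest-ranked value among the target's own value and its mapped sources.

    target:         dict of target field -> current value
    target_ranks:   dict of target field -> rank of the target's own value
    mappings:       dict of target field -> list of Mapping
    source_records: dict of source_db -> matched source record (dict)
    """
    updated = dict(target)
    for field, field_mappings in mappings.items():
        # Start from the target's own value and its rank.
        best_rank = target_ranks.get(field, 0)
        best_value = target.get(field)
        for m in field_mappings:
            record = source_records.get(m.source_db)
            if record is None:
                continue  # no matching record was found in this source
            value = record.get(m.source_field)
            if value in (None, ""):
                continue  # an empty source value never overwrites data
            if m.rank > best_rank:
                best_rank, best_value = m.rank, value
        updated[field] = best_value
    return updated
```

In this sketch a source value replaces the target value only when its mapping outranks both the target field and every other mapped source, so a directly edited target field with a high rank is left untouched.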
6. A method for generating, for a source audience member record having at least first and second fields, a set of matching candidate audience member records having multiple fields from a target audience member database, wherein there is no pre-defined relationship between the source audience member record and any record of the target audience member database records, comprising the steps of:
a. providing first and second indices to the target audience member database based on at least first and second fields of the target database;
b. specifying a match-closeness parameter;
c. generating multiple references to records of the target audience member database by querying the first index for similarities based on the first field of the source audience member record, the quantity of multiple references being responsive to the match-closeness parameter; and
d. generating additional multiple references to records of the target audience member database by querying the second index for similarities based on the second field of the source audience member record, the quantity of additional multiple references being responsive to the match-closeness parameter.
Dependent claims (7, 8):
e. selecting one of the references to records of the target audience member database; and
f. updating at least one field from the selected target audience member database record by replacing its data with data from the source audience member record.
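As a rough illustration of steps a through d of claim 6, the following Python sketch builds one index per field and lets a match-closeness parameter control how many references each query returns. The claim does not say how similarity is computed; using the standard-library `difflib.get_close_matches` with its `cutoff` argument as the closeness parameter is an assumption made for this example, and the function names are hypothetical.

```python
from collections import defaultdict
from difflib import get_close_matches


def build_index(records, field):
    """Index record ids by the lower-cased value of one field."""
    index = defaultdict(list)
    for rec_id, rec in records.items():
        index[rec[field].lower()].append(rec_id)
    return index


def query_index(index, value, closeness, limit=10):
    """Return ids of records whose indexed value is similar to `value`.

    `closeness` (0.0 to 1.0) is passed as the similarity cutoff: a lower
    value admits looser matches and therefore yields more references.
    """
    keys = get_close_matches(value.lower(), list(index), n=limit, cutoff=closeness)
    return [rec_id for key in keys for rec_id in index[key]]


def candidate_references(source, indices, closeness):
    """Query every index with the matching source field and merge the results."""
    refs = []
    for field, index in indices.items():
        for rec_id in query_index(index, source[field], closeness):
            if rec_id not in refs:
                refs.append(rec_id)
    return refs
```

Lowering `closeness` trades speed for recall: more candidate references are generated and must later be scored, which is the balance the abstract describes as speed versus accuracy.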
9. A method for updating a target record having at least two fields of an audience member database that identifies the same audience member of a source audience member record having at least two fields, wherein there is no pre-defined relationship between the source audience member record and any record of the target audience member database, comprising the steps of:
a. providing at least one non-encoded field index to the target audience member database based on at least a first field of the database;
b. providing at least one encoded field index to the target audience member database based on a second field of the database;
c. querying each of the at least one non-encoded indices for matches to a field of the source audience member record, and storing references to target audience member database records having matching value fields in the set of matching candidate audience member records;
d. encoding a value of a field of the source audience member record that corresponds to an encoded index of the target audience member database;
e. querying the at least one encoded field index for matches to the encoded field value of the source audience member record, and storing references to target audience member database records having matching encoded value fields in the set of matching candidate audience member records;
f. selecting a single reference to a target audience member database record; and
g. replacing at least one field in the record of the target audience member database that matches the selected reference, with the at least one corresponding field of the selected source audience member record.
Dependent claims (10, 11):
specifying a match-closeness parameter; and
wherein the number of references to records of the target audience member database generated by the querying steps is responsive to the match-closeness parameter.
11. The method of claim 9 further comprising the step of:
scoring each of the target audience member database records referred to by the stored references.
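Steps a through e of claim 9 combine an index on raw field values with an index on encoded values. The sketch below is one possible reading, offered with assumptions stated up front: the claims shown here do not name an encoding, so a Soundex-style phonetic code is used purely as an example of an "encoded field", and the names `soundex`, `build_field_index`, and `matching_candidates` are hypothetical.

```python
from collections import defaultdict

# Letter-to-digit table for a Soundex-style code (an assumed encoding).
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(name):
    """Encode a name as one letter plus three digits, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    digits = []
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        if c not in "HW":  # H and W do not separate letters with the same code
            prev = code
    return (letters[0] + "".join(digits) + "000")[:4]


def build_field_index(records, field, encode=None):
    """Index record ids by a field value, optionally encoded first."""
    index = defaultdict(list)
    for rec_id, rec in records.items():
        key = encode(rec[field]) if encode else rec[field]
        index[key].append(rec_id)
    return index


def matching_candidates(source, plain_index, plain_field,
                        encoded_index, encoded_field, encode):
    """Collect references from the non-encoded index, then the encoded one."""
    refs = set(plain_index.get(source[plain_field], []))
    refs.update(encoded_index.get(encode(source[encoded_field]), []))
    return refs
```

The encoded index lets spelling variants such as "Smith" and "Smyth" land on the same key, while the non-encoded index still requires an exact value match.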
12. A method for selecting one of a subset of target audience member records and updating at least one field of the selected record with information from at least one field of a source audience member record, comprising the steps of:
a. mapping a plurality of fields from the source audience member record to corresponding fields in the subset of target audience member records, each of the corresponding fields having textual content;
b. for each record of the subset of target audience member records:
1. for each mapped field of each target audience member record:
A. comparing the field to the corresponding field of the source audience member record by a loose matching algorithm;
B. assigning a match score value to the mapped field of the target audience member record based on the comparison;
2. aggregating the plurality of match scores for the mapped fields for the target audience member record;
d. selecting the target audience member record having the highest aggregated match score value; and
e. updating a plurality of fields in the selected target audience member record with information from corresponding fields from the source audience member record.
Dependent claims (13)
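The scoring and selection steps of claim 12 can be sketched as follows. The claim only calls for "a loose matching algorithm"; the use of `difflib.SequenceMatcher.ratio()` as that algorithm, a plain sum as the aggregation, and the names `field_score` and `select_and_update` are assumptions made for this example rather than anything the claim specifies.

```python
from difflib import SequenceMatcher


def field_score(a, b):
    """Loose textual similarity between two values, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def select_and_update(source, candidates, field_map):
    """Pick the best-scoring candidate and copy the mapped source fields into it.

    source:     dict holding the source record
    candidates: non-empty list of dicts (the subset of target records)
    field_map:  dict of source field name -> target field name
    """
    def aggregate(target):
        # Sum the per-field match scores for one candidate record.
        return sum(field_score(source[s], target[t]) for s, t in field_map.items())

    best = max(candidates, key=aggregate)
    for s, t in field_map.items():
        best[t] = source[s]  # update the selected record in place
    return best
```

A weighted sum or a minimum-score threshold could replace the plain sum here; the claim text leaves the aggregation rule open.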
Specification