FACILITY FOR RECONCILIATION OF BUSINESS RECORDS USING GENETIC ALGORITHMS
Abstract
A facility for the reconciliation of data records pertaining to business entities. One or more fitness functions are applied to fields contained in two conflicting data records to assess the similarity of each field. The results of the fitness functions are then weighted and combined to assess the likelihood that the two data records are associated with the same business entity. When the weighted fitness functions are applied to conflicting data records, the fitness functions generate a confidence level that the compared records are associated with the same business entity. If the confidence level exceeds a certain threshold, the facility accepts that the data records refer to the same business entity and synthesizes a business record from the data records.
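The scoring step described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the two fitness functions (`exact_match`, `fuzzy_match`), the field names and weights in `FIELD_RULES`, the 0.8 threshold, and the "keep the longer value" merge rule.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> float:
    """Hypothetical fitness function: 1.0 if equal ignoring case and padding."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def fuzzy_match(a: str, b: str) -> float:
    """Hypothetical fitness function: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Assumed mapping of field -> (fitness function, weight).
FIELD_RULES = {
    "name": (fuzzy_match, 0.5),
    "phone": (exact_match, 0.3),
    "city": (exact_match, 0.2),
}


def confidence(rec_a: dict, rec_b: dict, rules=FIELD_RULES) -> float:
    """Weighted average of per-field fitness results."""
    total_weight = sum(weight for _, weight in rules.values())
    weighted = sum(
        weight * fn(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, (fn, weight) in rules.items()
    )
    return weighted / total_weight


def reconcile(rec_a: dict, rec_b: dict, threshold: float = 0.8):
    """Return a synthesized record if confidence exceeds the threshold, else None."""
    if confidence(rec_a, rec_b) <= threshold:
        return None
    # Assumed merge rule: keep the longer value for each field (ties keep rec_a's).
    return {
        field: max(rec_a.get(field, ""), rec_b.get(field, ""), key=len)
        for field in sorted(set(rec_a) | set(rec_b))
    }
```

Under these assumptions, two records that agree on phone and city and differ only in how the company name is abbreviated would be accepted as the same entity, while records that disagree on every field would return `None`.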
44 Claims
1. A computer-implemented method of matching and reconciling business records using fitness functions, the computer-implemented method comprising:
for each field in a first business record that is to be compared with a corresponding field in a second business record, selecting a production fitness function to be applied to compare the contents of the field in the first and second business records and a weight that is to be applied to the result of the applied production fitness function by;
applying test fitness functions to fields contained in a plurality of test business records to determine fitness function results, weighting each fitness function result by an associated weight, and calculating an overall performance of the test fitness functions based on the weighted fitness function results across the plurality of test business records;
repeating the application of test fitness functions until an overall performance of one or more weighted test fitness functions exceeds a first confidence threshold, wherein the test fitness functions and associated weights are modified for each repeat application; and
selecting at least one of the test fitness functions and weights that exceed the first confidence threshold as production fitness functions and weights;
applying the selected production fitness functions for each field in the first business record that is to be compared with a corresponding field in the second business record;
weighting the results of the applied production fitness functions to calculate a confidence level that the first business record and the second business record are associated with the same business entity; and
combining the first business record and the second business record into an authoritative business record if the calculated confidence level exceeds a second confidence threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
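The selection loop recited in claim 1 (apply test fitness functions, weight the results, measure overall performance, modify functions and weights, repeat until a first confidence threshold is exceeded) could be sketched roughly as follows. This is only one possible reading, not the patented implementation. The candidate encoding, the two fitness functions in `FUNCTIONS`, the accuracy-based `performance` measure, the `match_cutoff`, the mutation scheme, and the `max_generations` safety limit are all assumptions introduced for the example.

```python
import random
from difflib import SequenceMatcher

# Hypothetical pool of test fitness functions.
FUNCTIONS = {
    "exact": lambda a, b: 1.0 if a.lower() == b.lower() else 0.0,
    "fuzzy": lambda a, b: SequenceMatcher(None, a.lower(), b.lower()).ratio(),
}
FIELDS = ("name", "phone")  # assumed record fields


def score(candidate, rec_a, rec_b):
    """Weighted average of fitness results; candidate maps field -> (fn name, weight)."""
    total = sum(w for _, w in candidate.values()) or 1.0
    return sum(
        w * FUNCTIONS[fn](rec_a[f], rec_b[f]) for f, (fn, w) in candidate.items()
    ) / total


def performance(candidate, labeled_pairs, match_cutoff=0.75):
    """Assumed performance measure: fraction of labeled test pairs classified correctly."""
    correct = sum(
        (score(candidate, a, b) > match_cutoff) == same for a, b, same in labeled_pairs
    )
    return correct / len(labeled_pairs)


def random_candidate(rng):
    return {f: (rng.choice(list(FUNCTIONS)), rng.random()) for f in FIELDS}


def mutate(candidate, rng):
    """Modify one field's fitness function or weight for the next application."""
    child = dict(candidate)
    field = rng.choice(FIELDS)
    fn, weight = child[field]
    if rng.random() < 0.5:
        fn = rng.choice(list(FUNCTIONS))
    else:
        weight = min(1.0, max(0.0, weight + rng.uniform(-0.3, 0.3)))
    child[field] = (fn, weight)
    return child


def evolve(labeled_pairs, first_threshold=0.9, pop_size=12, max_generations=200, seed=0):
    """Repeat until some candidate's performance exceeds first_threshold."""
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(pop_size)]
    for _ in range(max_generations):
        ranked = sorted(
            population, key=lambda c: performance(c, labeled_pairs), reverse=True
        )
        if performance(ranked[0], labeled_pairs) > first_threshold:
            return ranked[0]  # selected as production functions and weights
        survivors = ranked[: pop_size // 2]
        population = survivors + [
            mutate(rng.choice(survivors), rng) for _ in range(pop_size - len(survivors))
        ]
    return None  # no candidate exceeded the threshold within the limit
```

The returned mapping plays the role of the production fitness functions and weights; it can then be passed to `score` to compare a new pair of records against the second confidence threshold.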
12. A system for matching and reconciling business records using fitness functions, the system comprising:
a genetic algorithm module that, for each field in a first business record that is to be compared with a corresponding field in a second business record, selects a production fitness function to be applied to compare the contents of the field in the first and second business records and a weight that is to be applied to the result of the applied production fitness function by;
applying test fitness functions to fields contained in a plurality of test business records to determine fitness function results, weighting each fitness function result by an associated weight, and calculating an overall performance of the test fitness functions based on the weighted fitness function results across the plurality of test business records;
repeating the application of test fitness functions until an overall performance of one or more weighted test fitness functions exceeds a first confidence threshold, wherein the test fitness functions and associated weights are modified for each repeat application; and
selecting at least one of the test fitness functions and weights that exceed the first confidence threshold as production fitness functions and weights;
a record matching module that applies the production fitness functions selected by the genetic algorithm module to each field in the first business record that is to be compared with a corresponding field in the second business record, and weights the results of the applied production fitness functions using the weights selected by the genetic algorithm module to calculate a confidence level that the first business record and the second business record are associated with the same business entity; and
a field selection module that merges the first business record and the second business record into an authoritative business record if the confidence level calculated by the record matching module exceeds a second confidence threshold.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21
22. A computer-implemented method of matching and reconciling business records using fitness functions, the computer-implemented method comprising:
for each field or concatenated fields in a first business record that are to be compared with a corresponding field or concatenated fields in a second business record, selecting a production fitness function to be applied to compare the contents of the field or concatenated fields in the first and second business records and a weight that is to be applied to the result of the applied production fitness function by;
repeatedly applying test fitness functions to fields or concatenated fields contained in a plurality of test business records and weighting the results until an overall performance of one or more weighted test fitness functions exceeds a first confidence threshold; and
selecting at least one of the test fitness functions and weights that exceed the first confidence threshold as production fitness functions and weights;
applying the selected production fitness functions to each field or concatenated fields in the first business record that are to be compared with a corresponding field or concatenated fields in the second business record;
weighting the results of the applied production fitness functions to calculate a confidence level that the first business record and the second business record are associated with the same business entity; and
flagging that the first business record and the second business record should be combined into an authoritative business record if the calculated confidence level exceeds a second confidence threshold.
Dependent claims: 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
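Claim 22 differs from claim 1 in two respects: comparisons may run over concatenated fields, and the outcome is a flag rather than an immediate merge. A minimal sketch of those two points follows, assuming the production rules have already been selected. The rule set `RULES`, the field names, the single `similarity` function, and the 0.85 threshold are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def concat(record: dict, fields: tuple) -> str:
    """Join the non-empty values of the given fields into one comparison string."""
    values = (record.get(f, "").strip() for f in fields)
    return " ".join(v for v in values if v).lower()


def similarity(a: str, b: str) -> float:
    """Stand-in production fitness function: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


# Assumed production rules: (fields to concatenate, weight).
RULES = [
    (("name",), 0.6),
    (("street", "city"), 0.4),
]


def should_combine(rec_a: dict, rec_b: dict, second_threshold: float = 0.85):
    """Return (flag, confidence); the records themselves are left unmodified."""
    total = sum(weight for _, weight in RULES)
    conf = sum(
        weight * similarity(concat(rec_a, fields), concat(rec_b, fields))
        for fields, weight in RULES
    ) / total
    return conf > second_threshold, conf
```

Concatenation helps when two sources split the same information differently: a record with `street="1 Main St"` and `city="Seattle"` compares as identical to one that stores `"1 Main St Seattle"` in `street` and leaves `city` empty. Returning a flag leaves the merge decision to a later step, such as a manual review queue.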
34. A system for matching and reconciling business records using fitness functions, the system comprising:
a genetic algorithm module that, for each field or concatenated fields in a first business record that are to be compared with a corresponding field or concatenated fields in a second business record, selects a production fitness function to be applied to compare the contents of the field or concatenated fields in the first and second business records and a weight that is to be applied to the result of the applied production fitness function by;
repeatedly applying test fitness functions to fields or concatenated fields contained in a plurality of test business records and weighting the results until an overall performance of one or more weighted test fitness functions exceeds a first confidence threshold; and
selecting at least one of the test fitness functions and weights that exceed the first confidence threshold as production fitness functions and weights; and
a record matching module that;
applies the production fitness functions selected by the genetic algorithm module to each field or concatenated fields in the first business record that are to be compared with a corresponding field or concatenated fields in the second business record;
weights the results of the applied production fitness functions to calculate a confidence level that the first business record and the second business record are associated with the same business entity; and
if the calculated confidence level exceeds a second confidence threshold, flags that the first business record and the second business record should be combined into an authoritative business record.
Dependent claims: 35, 36, 37, 38, 39, 40, 41, 42, 43, 44
Specification