METHOD AND SYSTEM FOR COMPARING ATTRIBUTES SUCH AS BUSINESS NAMES
Abstract
Embodiments of systems and methods for comparing attributes of a data record are presented herein. Broadly speaking, embodiments of the present invention generate a weight based on a comparison of the name (or other) attributes of data records. More specifically, embodiments of the present invention may calculate an information score for each of the two name attributes being compared and average the two scores to obtain an average information score for the pair. The two name attributes may then be compared against one another to generate a weight between the two attributes. This weight can then be normalized to generate a final weight between the two business name attributes.
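The flow summarized in the abstract (information score per attribute, average score, raw comparison weight, normalization) can be illustrated with a minimal sketch. The abstract does not give formulas, so everything specific here is an assumption made for illustration: the token frequency table, the use of a negative base-10 logarithm as the information measure, exact token matching as the comparison, and the names `TOKEN_FREQ`, `DEFAULT_FREQ`, `information_score`, `raw_weight`, `normalized_weight`, and `scale`.

```python
import math

# Hypothetical token frequencies (fraction of records containing the token).
# These values are invented for illustration only.
TOKEN_FREQ = {"acme": 0.001, "bank": 0.05, "of": 0.2, "texas": 0.01}
DEFAULT_FREQ = 0.0005  # assumed frequency for tokens missing from the table


def token_info(token):
    """Assumed information measure: rarer tokens carry more information."""
    return -math.log10(TOKEN_FREQ.get(token, DEFAULT_FREQ))


def information_score(tokens):
    """Information score of one name attribute (sum over its tokens)."""
    return sum(token_info(t) for t in tokens)


def raw_weight(tokens_a, tokens_b):
    """Assumed comparison: credit the information of exactly shared tokens."""
    shared = set(tokens_a) & set(tokens_b)
    return sum(token_info(t) for t in shared)


def normalized_weight(tokens_a, tokens_b, scale=10.0):
    """Normalize the raw weight by the average information score."""
    average = (information_score(tokens_a) + information_score(tokens_b)) / 2
    if average == 0:
        return 0.0  # nothing to compare
    return scale * raw_weight(tokens_a, tokens_b) / average


if __name__ == "__main__":
    print(normalized_weight(["acme", "bank"], ["acme", "bank"]))
    print(normalized_weight(["acme", "bank"], ["acme", "corp"]))
```

Under these assumptions, dividing by the average information score keeps weights comparable between short, common names and long, distinctive ones: identical attributes reach the top of the scale regardless of how many tokens they contain.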
22 Claims
1. (canceled)
2. A method for comparing attributes, comprising:
providing a system having an identity hub coupled to a set of data sources and a set of operator computers via a network, wherein the identity hub is configured to store a link between one or more data records in the set of data sources;
receiving, at the identity hub, a first attribute comprising a first set of tokens from one of the set of data sources or from one of the set of operators, wherein the first attribute is associated with a first data record in one of the set of data sources and represents a first entity;
receiving, at the identity hub, a second attribute comprising a second set of tokens from one of the set of data sources, wherein the second attribute is associated with a second data record in one of the data sources and represents a second entity;
generating, at the identity hub, a weight for the first attribute and the second attribute;
wherein generating the weight comprises comparing the first set of tokens of the first attribute to the second set of tokens of the second attribute and comparing each pair of tokens comprises:
    determining, at the identity hub, a current match weight for the pair of tokens;
    determining, at the identity hub, a first previous match weight corresponding to the pair of tokens;
    determining, at the identity hub, a second previous match weight corresponding to the set of tokens;
    setting, at the identity hub, the weight to the current match weight if the current match weight is greater than the first previous match weight or the second previous match weight;
    setting, at the identity hub, the weight to the greater of the first previous match weight or the second previous match weight if either the first previous match weight or the second previous match weight is greater than the current match weight; and
determining if the first data record and the second data record should be linked based at least in part on the weight between the two attributes.
Dependent claims: 3, 4, 5, 6, 7, 8
9. A computer readable medium, comprising instructions executable by a processor for:
receiving a first attribute comprising a first set of tokens from one of a set of data sources or from one of a set of operators, wherein the first attribute is associated with a first data record in one of the set of data sources and represents a first entity;
receiving a second attribute comprising a second set of tokens from one of the set of data sources, wherein the second attribute is associated with a second data record in one of the data sources and represents a second entity;
generating a weight for the first attribute and the second attribute;
wherein generating the weight comprises comparing the first set of tokens of the first attribute to the second set of tokens of the second attribute and comparing each pair of tokens comprises:
    determining a current match weight for the pair of tokens;
    determining a first previous match weight corresponding to the pair of tokens;
    determining a second previous match weight corresponding to the set of tokens;
    setting the weight to the current match weight if the current match weight is greater than the first previous match weight or the second previous match weight;
    setting the weight to the greater of the first previous match weight or the second previous match weight if either the first previous match weight or the second previous match weight is greater than the current match weight; and
determining if the first data record and the second data record should be linked based at least in part on the weight between the two attributes.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A system for comparing attributes, comprising:
a set of operator computers;
a set of data sources, each data source storing one or more data records; and
an identity hub coupled to the set of data sources and the set of operator computers via a network, the identity hub configured to store links between one or more data records in the set of data sources, wherein the identity hub comprises a tangible computer readable medium comprising instructions executable by a processor for:
receiving a first attribute comprising a first set of tokens from one of a set of data sources or from one of a set of operators, wherein the first attribute is associated with a first data record in one of the set of data sources and represents a first entity;
receiving a second attribute comprising a second set of tokens from one of the set of data sources, wherein the second attribute is associated with a second data record in one of the data sources and represents a second entity;
generating a weight for the first attribute and the second attribute;
wherein generating the weight comprises comparing the first set of tokens of the first attribute to the second set of tokens of the second attribute and comparing each pair of tokens comprises:
    determining a current match weight for the pair of tokens;
    determining a first previous match weight corresponding to the pair of tokens;
    determining a second previous match weight corresponding to the set of tokens;
    setting the weight to the current match weight if the current match weight is greater than the first previous match weight or the second previous match weight;
    setting the weight to the greater of the first previous match weight or the second previous match weight if either the first previous match weight or the second previous match weight is greater than the current match weight; and
determining if the first data record and the second data record should be linked based at least in part on the weight between the two attributes.
Dependent claims: 17, 18, 19, 20, 21, 22
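The token-pair comparison recited in the independent claims (a current match weight, two previous match weights, and keeping the greatest of the three) can be read as a table-filling procedure. The sketch below is one possible interpretation, not the claimed implementation. It assumes the two previous match weights are the neighboring table cells to the left and above, that the current match weight adds the pair's score to the diagonal cell, and that exact token equality scores 1.0. The names `pair_weight`, `attribute_weight`, `should_link`, and the `threshold` value are hypothetical.

```python
def pair_weight(token_a, token_b):
    """Hypothetical pair score: 1.0 for an exact token match, else 0.0."""
    return 1.0 if token_a == token_b else 0.0


def attribute_weight(tokens_a, tokens_b):
    """Fill a table where each cell keeps the greatest of three candidates."""
    rows, cols = len(tokens_a), len(tokens_b)
    # Row 0 and column 0 are zero-filled borders (no tokens compared yet).
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Current match weight: diagonal cell plus this pair's score.
            current = table[i - 1][j - 1] + pair_weight(tokens_a[i - 1], tokens_b[j - 1])
            first_previous = table[i - 1][j]   # assumed: cell above
            second_previous = table[i][j - 1]  # assumed: cell to the left
            table[i][j] = max(current, first_previous, second_previous)
    return table[rows][cols]


def should_link(tokens_a, tokens_b, threshold=1.5):
    """Link decision based on the weight; the threshold is an assumed value."""
    return attribute_weight(tokens_a, tokens_b) >= threshold


if __name__ == "__main__":
    print(attribute_weight(["acme", "bank"], ["acme", "bank"]))  # 2.0
    print(attribute_weight(["acme", "bank"], ["bank", "acme"]))  # 1.0
    print(should_link(["acme", "bank"], ["acme", "corp"]))       # False
```

With this reading, tokens that match in the same order accumulate weight along the diagonal, while reordered tokens earn credit only for the best in-order alignment, which is why the swapped example scores 1.0 rather than 2.0.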
Specification