Analysis of a system for matching data records
Abstract
Embodiments disclosed herein provide a system and method for analyzing an identity hub. Particularly, a user can connect to the identity hub, load an initial set of data records, create and/or edit an identity hub configuration locally, analyze and/or validate the configuration via a set of analysis tools, including an entity analysis tool, a data analysis tool, a bucket analysis tool, and a linkage analysis tool, and remotely deploy the validated configuration to an identity hub instance. In some embodiments, through a graphical user interface, these analysis tools enable the user to analyze and modify the configuration of the identity hub in real time while the identity hub is operating to ensure data quality and enhance system performance.
17 Claims
1. A computer-implemented method for analyzing a system for matching data records, the method comprising:
producing a configuration of said system, the configuration of the system applying a bucketing strategy operable to create buckets by comparing sets of one or more attributes of initial data records with corresponding attributes of candidate data records in said system, wherein each bucket is associated with a corresponding set of attributes;
analyzing buckets created according to the bucketing strategy associated with said configuration of said system, wherein said buckets each comprise candidate data records with the corresponding set of attributes similar to those of the initial data records and are used to associate data records with a common entity, and wherein said analyzing said buckets further comprises analyzing statistics associated with said buckets, analyzing a bucket size distribution, analyzing said buckets by size, analyzing said buckets by composition, analyzing a bulk cross match comparison distribution, analyzing members by bucket count, analyzing member bucket values, analyzing member bucket frequencies, analyzing a member comparison distribution, or a combination thereof;
analyzing an effect of said buckets on performance of said system to determine and link data records associated with a common entity; and
changing said bucketing strategy accordingly to alter determination of the association of data records with the common entity.
9. A non-transitory computer readable storage medium storing computer instructions executable by a processor, wherein when executed by said processor said computer instructions cause a computer to:
produce a configuration of a system;
create buckets according to a bucketing strategy associated with said configuration of said system by comparing sets of one or more attributes of initial data records with corresponding attributes of candidate data records in said system, wherein each bucket is associated with a corresponding set of attributes, and wherein said buckets each comprise candidate data records with the corresponding set of attributes similar to those of the initial data records and are used to associate data records with a common entity;
analyze said buckets and an effect of said buckets on performance of said system to determine and link data records associated with a common entity, wherein said analyzing said buckets further comprises analyzing statistics associated with said buckets, analyzing a bucket size distribution, analyzing said buckets by size, analyzing said buckets by composition, analyzing a bulk cross match comparison distribution, analyzing members by bucket count, analyzing member bucket values, analyzing member bucket frequencies, analyzing a member comparison distribution, or a combination thereof; and
change said bucketing strategy to alter determination of the association of data records with the common entity.
17. A computer system for analyzing an identity hub comprising:
at least one processor, and at least one computer readable storage medium accessible by said at least one processor and storing computer instructions executable by said at least one processor, wherein when executed by said at least one processor said computer instructions cause said computer system to:
display a graphical user interface interfacing a plurality of tools comprising a configuration editor, an algorithm editor, a data analysis tool, an entity analysis tool, a bucket analysis tool, and a linkage analysis tool;
wherein said configuration editor creates or loads a configuration of said identity hub locally utilizing an initial set of data records from information sources coupled to said identity hub;
wherein said algorithm editor edits an algorithm utilized in creating buckets based on said initial set of data records to alter determination of an association of data records with a common entity, wherein said buckets are created by comparing sets of one or more attributes of the initial data records with corresponding attributes of candidate data records and each bucket is associated with a corresponding set of attributes, and wherein said buckets each comprise candidate data records with the corresponding set of attributes similar to those of the initial data records and are used to associate data records with the common entity;
wherein said data analysis tool enables analysis of attribute validity of said initial set of data records;
wherein said entity analysis tool enables analysis of entities associated with said initial set of data records;
wherein said bucket analysis tool enables analysis of said buckets and an effect of said buckets on said identity hub for determining and linking data records associated with the common entity; and
wherein said linkage analysis tool enables analysis of error rates associated with linking member records from said initial set of data records and thresholds utilized in scoring derivatives of said initial set of data records.
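The bucketing and bucket analysis steps recited in the independent claims can be illustrated with a minimal Python sketch. It is not the patented implementation: the record fields, the exact-match bucket keys, and the names `RECORDS`, `STRATEGY_A`, `STRATEGY_B`, `build_buckets`, and `analyze` are all assumptions made for illustration. The claims speak of "similar" attributes, whereas this sketch simplifies similarity to case-insensitive equality. It shows how a bucket size distribution, a per-member bucket count, and the number of candidate comparisons might be computed, and how changing the strategy changes those figures.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample records; field names and values are illustrative only.
RECORDS = [
    {"id": 1, "last": "Smith", "zip": "78701", "phone": "5551234"},
    {"id": 2, "last": "Smyth", "zip": "78701", "phone": "5551234"},
    {"id": 3, "last": "Smith", "zip": "78701", "phone": "5559999"},
    {"id": 4, "last": "Jones", "zip": "10001", "phone": "5550000"},
]

# A bucketing strategy is modeled here as a list of attribute sets (an assumption).
STRATEGY_A = [("last", "zip"), ("phone",)]
STRATEGY_B = [("zip",)]


def build_buckets(records, strategy):
    """Place each record id in one bucket per attribute set in the strategy."""
    buckets = defaultdict(set)
    for rec in records:
        for attrs in strategy:
            # Bucket key: the attribute set plus the record's normalized values.
            key = (attrs, tuple(rec[a].lower() for a in attrs))
            buckets[key].add(rec["id"])
    return dict(buckets)


def analyze(buckets):
    """Compute simple bucket statistics of the kind listed in the claims."""
    sizes = [len(members) for members in buckets.values()]
    buckets_per_member = Counter()
    pairs = set()
    for members in buckets.values():
        buckets_per_member.update(members)
        # Records sharing a bucket become candidate pairs for comparison.
        pairs.update(combinations(sorted(members), 2))
    return {
        "size_distribution": dict(Counter(sizes)),   # bucket size -> bucket count
        "buckets_per_member": dict(buckets_per_member),
        "comparisons": len(pairs),                   # unique candidate pairs
    }


if __name__ == "__main__":
    for name, strategy in (("A", STRATEGY_A), ("B", STRATEGY_B)):
        print(name, analyze(build_buckets(RECORDS, strategy)))
```

With these four hypothetical records, the first strategy produces 2 candidate pairs and the second produces 3, which illustrates the trade-off a bucket analysis is meant to expose: coarser buckets miss fewer potential matches but increase the comparison workload.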
Specification