Analysis of a system for matching data records
Abstract
Embodiments disclosed herein provide a system and method for analyzing an identity hub. Particularly, a user can connect to the identity hub, load an initial set of data records, create and/or edit an identity hub configuration locally, analyze and/or validate the configuration via a set of analysis tools, including an entity analysis tool, a data analysis tool, a bucket analysis tool, and a linkage analysis tool, and remotely deploy the validated configuration to an identity hub instance. In some embodiments, through a graphical user interface, these analysis tools enable the user to analyze and modify the configuration of the identity hub in real time while the identity hub is operating to ensure data quality and enhance system performance.
19 Claims
1. A computer-implemented method for analyzing a system for matching data records, the method comprising:
producing a configuration of said system for matching data records, the configuration of the system including a bucketing strategy employing matching functions and matching parameters to create buckets containing data records, wherein said buckets are created by comparing sets of one or more attributes of initial data records with corresponding attributes of candidate data records in said system, wherein each bucket is associated with a corresponding set of attributes, and wherein data records associated with a same entity are determined and linked by comparing one or more attributes of the initial data records to corresponding attributes of the candidate data records within the buckets in accordance with the matching functions and matching parameters;
applying said configuration to said system and analyzing buckets created during operation of said system according to the bucketing strategy associated with said configuration of said system;
analyzing an effect of said buckets on throughput of said system via a bucket analysis tool providing a user interface, wherein analyzing an effect of said buckets further comprises:
executing one or more queries from the user interface of the bucket analysis tool to produce characteristics associated with the buckets created during operation of the system, wherein the characteristics include distribution of data within the created buckets and data records not placed in the created buckets; and
identifying performance issues of the system from the characteristics of the created buckets produced from the one or more queries; and
modifying said configuration during operation of said system to adjust distribution of the data records within said buckets in real time to address the identified performance issues and enable the throughput of said system to reside within a predetermined desired range, wherein modifying said configuration includes:
changing said matching functions and matching parameters of said bucketing strategy for creating said buckets based on said identified performance issues to alter the comparing of said attributes and determination of the association of data records with the same entity for said buckets, wherein changing said matching functions and matching parameters includes providing a different combination of attributes for the corresponding set of attributes for at least one bucket.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
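The bucketing and bucket-analysis steps recited in claim 1 can be illustrated with a small sketch. This is not the patented implementation: the record fields, the STRATEGY list, and the helper names build_buckets and bucket_characteristics are all hypothetical, and exact-value grouping stands in for whatever matching functions a real configuration would use. The sketch groups records by each attribute set, then reports the two characteristics the claim names: the distribution of data across buckets and the records placed in no bucket.

```python
from collections import defaultdict

# Hypothetical records; the attribute names are illustrative only.
RECORDS = [
    {"id": 1, "last": "Smith", "zip": "78701", "phone": "5551234"},
    {"id": 2, "last": "Smyth", "zip": "78701", "phone": "5551234"},
    {"id": 3, "last": "Smith", "zip": "78702", "phone": None},
    {"id": 4, "last": None, "zip": None, "phone": None},
]

# Hypothetical bucketing strategy: each tuple is one attribute set.
STRATEGY = [("last", "zip"), ("phone",)]


def build_buckets(records, strategy):
    """Group record ids by (attribute set, values); skip sets with a missing value."""
    buckets = defaultdict(list)
    for rec in records:
        for attrs in strategy:
            values = tuple(rec.get(a) for a in attrs)
            if all(v is not None for v in values):
                buckets[(attrs, values)].append(rec["id"])
    return dict(buckets)


def bucket_characteristics(records, buckets):
    """Return a histogram of bucket sizes and the ids of records in no bucket."""
    size_histogram = defaultdict(int)
    bucketed = set()
    for ids in buckets.values():
        size_histogram[len(ids)] += 1
        bucketed.update(ids)
    unbucketed = sorted(r["id"] for r in records if r["id"] not in bucketed)
    return dict(size_histogram), unbucketed


buckets = build_buckets(RECORDS, STRATEGY)
histogram, unbucketed = bucket_characteristics(RECORDS, buckets)
print(histogram)   # bucket size -> number of buckets of that size
print(unbucketed)  # ids of records that fell into no bucket
```

In this toy data set, record 4 has no usable attribute values and therefore lands in no bucket, which is the kind of finding a bucket analysis query would surface.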
10. A system for analyzing an identity system for matching data records, the system comprising:
at least one processor with logic to:
produce a configuration of said identity system for matching data records, the configuration of the identity system including a bucketing strategy employing matching functions and matching parameters to create buckets containing data records, wherein said buckets are created by comparing sets of one or more attributes of initial data records with corresponding attributes of candidate data records in said identity system, wherein each bucket is associated with a corresponding set of attributes, and wherein data records associated with a same entity are determined and linked by comparing one or more attributes of the initial data records to corresponding attributes of the candidate data records within the buckets in accordance with the matching functions and matching parameters;
apply said configuration to said identity system and analyze buckets created during operation of said identity system according to the bucketing strategy associated with said configuration of said identity system;
analyze an effect of said buckets on throughput of said identity system via a bucket analysis tool providing a user interface, wherein analyzing an effect of said buckets further comprises:
executing one or more queries from the user interface of the bucket analysis tool to produce characteristics associated with the buckets created during operation of the identity system, wherein the characteristics include distribution of data within the created buckets and data records not placed in the created buckets; and
identifying performance issues of the identity system from the characteristics of the created buckets produced from the one or more queries; and
modify said configuration during operation of said identity system to adjust distribution of the data records within said buckets in real time to address the identified performance issues and enable the throughput of said identity system to reside within a predetermined desired range, wherein modifying said configuration includes:
changing said matching functions and matching parameters of said bucketing strategy for creating said buckets based on said identified performance issues to alter the comparing of said attributes and determination of the association of data records with the same entity for said buckets, wherein changing said matching functions and matching parameters includes providing a different combination of attributes for the corresponding set of attributes for at least one bucket.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
18. A non-transitory computer readable storage medium storing computer instructions executable by a processor for analyzing a system for matching data records, when executed by said processor, said computer instructions cause said processor to:
produce a configuration of said system for matching data records, the configuration of the system including a bucketing strategy employing matching functions and matching parameters to create buckets containing data records, wherein said buckets are created by comparing sets of one or more attributes of initial data records with corresponding attributes of candidate data records in said system, wherein each bucket is associated with a corresponding set of attributes, and wherein data records associated with a same entity are determined and linked by comparing one or more attributes of the initial data records to corresponding attributes of the candidate data records within the buckets in accordance with the matching functions and matching parameters;
apply said configuration to said system and analyze buckets created during operation of said system according to the bucketing strategy associated with said configuration of said system;
analyze an effect of said buckets on throughput of said system via a bucket analysis tool providing a user interface, wherein analyzing an effect of said buckets further comprises:
executing one or more queries from the user interface of the bucket analysis tool to produce characteristics associated with the buckets created during operation of the system, wherein the characteristics include distribution of data within the created buckets and data records not placed in the created buckets; and
identifying performance issues of the system from the characteristics of the created buckets produced from the one or more queries; and
modify said configuration during operation of said system to adjust distribution of the data records within said buckets in real time to address the identified performance issues and enable the throughput of said system to reside within a predetermined desired range, wherein modifying said configuration includes:
changing said matching functions and matching parameters of said bucketing strategy for creating said buckets based on said identified performance issues to alter the comparing of said attributes and determination of the association of data records with the same entity for said buckets, wherein changing said matching functions and matching parameters includes providing a different combination of attributes for the corresponding set of attributes for at least one bucket.
Dependent claims: 19
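The final limitation shared by the independent claims, providing a different combination of attributes for at least one bucket, can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the sample records, the helper names candidate_pairs and choose_attribute_set, the max_pairs threshold, and the use of pairwise comparison count as a stand-in for throughput. The claims do not specify how throughput is measured or how a new attribute combination is chosen.

```python
from collections import Counter

# Hypothetical data: every record shares a ZIP code, but birth years vary.
RECORDS = [
    {"id": i, "zip": "78701", "birth_year": 1950 + (i % 5)} for i in range(10)
]


def candidate_pairs(records, attrs):
    """Count the pairwise comparisons implied by bucketing on the given attributes."""
    sizes = Counter(tuple(r[a] for a in attrs) for r in records)
    return sum(n * (n - 1) // 2 for n in sizes.values())


def choose_attribute_set(records, options, max_pairs):
    """Return the first attribute set whose comparison count is within max_pairs."""
    for attrs in options:
        if candidate_pairs(records, attrs) <= max_pairs:
            return attrs
    return None


# Assumed workload proxy: fewer candidate pairs means less comparison work.
before = candidate_pairs(RECORDS, ("zip",))
after = candidate_pairs(RECORDS, ("zip", "birth_year"))
chosen = choose_attribute_set(
    RECORDS, [("zip",), ("zip", "birth_year")], max_pairs=10
)
print(before, after, chosen)
```

Bucketing on ZIP code alone puts all ten records in one bucket, while adding birth year splits them into five buckets of two, so the narrower attribute combination is the one that falls within the assumed limit.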
Specification