Tolerant and extensible discovery of relationships in data using structural information and data analysis
Abstract
Various embodiments of a method, system and article of manufacture to discover relationships among a first set of elements and a second set of elements are provided. At least one metric algorithm is identified based on a metric selection parameter. A raw result is determined based on the at least one metric algorithm, a first specified structural description of the first set of elements and a second specified structural description of the second set of elements. The raw result comprises a plurality of relationship measurements and the raw result is ordered. In some embodiments, a balanced result is produced based on the raw result and a matching strategy algorithm. In other embodiments, the matching strategy algorithm is identified based on a matching strategy selection parameter.
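The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`FIRST`, `SECOND`, `exact_name`, `token_overlap`, `METRICS`, `STRATEGIES`, `discover`) is hypothetical, the structural descriptions are reduced to plain lists of element names, and the two metrics are simple stand-ins chosen only because they are easy to check by hand. The `metric_names` argument plays the role of the metric selection parameter, and `strategy` plays the role of the matching strategy selection parameter.

```python
# Hypothetical structural descriptions: just the element (column) names.
FIRST = ["order_id", "ship_zip"]
SECOND = ["ORDER_ID", "zip", "id"]


def exact_name(a, b):
    """Stand-in metric: 1.0 when the names match ignoring case, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def token_overlap(a, b):
    """Stand-in metric: Jaccard overlap of underscore-separated name tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)


# Registry consulted by the (assumed) metric selection parameter.
METRICS = {"exact_name": exact_name, "token_overlap": token_overlap}


def raw_results(first, second, metric_names):
    """One (first, second, metric, strength) row per pair and metric,
    ordered from strongest to weakest."""
    rows = [(a, b, m, METRICS[m](a, b))
            for a in first for b in second for m in metric_names]
    return sorted(rows, key=lambda r: r[3], reverse=True)


def best_per_first(raw):
    """Matching strategy: for each first-set element keep only the pair
    with the highest strength value; all other pairs are dropped."""
    best = {}
    for a, b, _metric, strength in raw:
        if a not in best or strength > best[a][1]:
            best[a] = (b, strength)
    return best


# Registry consulted by the (assumed) matching strategy selection parameter.
STRATEGIES = {"best_per_first": best_per_first}


def discover(first, second, metric_names, strategy="best_per_first"):
    raw = raw_results(first, second, metric_names)
    return raw, STRATEGIES[strategy](raw)


raw, balanced = discover(FIRST, SECOND, ["exact_name", "token_overlap"])
```

With this toy input the raw result holds twelve rows (two first elements, three second elements, two metrics), and the balanced result pairs `order_id` with `ORDER_ID` at strength 1.0 and `ship_zip` with `zip` at strength 0.5. Ties are resolved here by keeping the first row encountered, which is a choice of this sketch and not something the source specifies.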
67 Citations
34 Claims
1. A computer-implemented method of discovering relationships among a first set of elements with respect to a first data source and a second set of elements with respect to a second data source, comprising:
applying a plurality of metric algorithms to a first structural description of the first set of elements and a second structural description of the second set of elements, wherein the first structural description describes a first structure of data in the first data source and the second structural description describes a second structure of data in the second data source, wherein each of the metric algorithms produces a metric result for each pair of elements in the first and the second sets, wherein each of the metric results for each pair of elements comprises a relationship strength value between the pair of elements in the first and second sets, wherein each of the relationship strength values produced using one of the metric algorithms for each pair of elements indicates a strength of a relationship between the pair of elements in the first and second sets that has a well defined meaning when compared with other relationship strength values;
producing raw results that include for each pair of first elements in the first and second sets the relationship strength values produced using the metric algorithms;
determining, for each element from the first set, a balanced result comprising a selected pair of pairs including the element from the first set and a determined element of the elements from the second set that has highest of the relationship strength values from the metric algorithms of the pairs with the element from the first set to produce a balanced result set that are a subset of the raw results, wherein pairs including each element from the first set with elements in the second set other than the determined element in the second set forming the balanced result set are excluded from the balanced results; and
returning the balanced result set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An article of manufacture comprising a computer usable storage medium embodying instructions executable by a computer for discovering relationships among a first set of elements with respect to a first data source and a second set of elements with respect to a second data source, wherein the instructions cause operations comprising:
applying a plurality of metric algorithms to a first structural description of the first set of elements and a second structural description of the second set of elements, wherein the first structural description describes a first structure of data in the first data source and the second structural description describes a second structure of data in the second data source, wherein each of the metric algorithms produces a metric result for each pair of elements in the first and the second sets, wherein each of the metric results for each pair of elements comprises a relationship strength value between the pair of elements in the first and second sets, wherein each of the relationship strength values produced using one of the metric algorithms for each pair of elements indicates a strength of a relationship between the pair of elements in the first and second sets that has a well defined meaning when compared with other relationship strength values;
producing raw results that include for each pair of first elements in the first and second sets the relationship strength values produced using the metric algorithms;
determining, for each element from the first set, a balanced result comprising a selected pair of pairs including the element from the first set and a determined element of the elements from the second set that has highest of the relationship strength values from the metric algorithms of the pairs with the element from the first set to produce a balanced result set that are a subset of the raw results, wherein pairs including each element from the first set with elements in the second set other than the determined element in the second set forming the balanced result set are excluded from the balanced results; and
returning the balanced result set.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
24. A computer system for discovering relationships among a first set of elements with respect to a first data source and a second set of elements with respect to a second data source in at least storage system, comprising:
a processor;
a computer readable storage medium having program components executed to perform operations, the operations comprising;
applying a plurality of metric algorithms to a first structural description of the first set of elements and a second structural description of the second set of elements, wherein the first structural description describes a first structure of data in the first data source and the second structural description describes a second structure of data in the second data source, wherein each of the metric algorithms produces a metric result for each pair of elements in the first and the second sets, wherein each of the metric results for each pair of elements comprises a relationship strength value between the pair of elements in the first and second sets, wherein each of the relationship strength values produced using one of the metric algorithms for each pair of elements indicates a strength of a relationship between the pair of elements in the first and second sets that has a well defined meaning when compared with other relationship strength values;
producing raw results that include for each pair of first elements in the first and second sets the relationship strength values produced using the metric algorithms;
determining, for each element from the first set, a balanced result comprising a selected pair of pairs including the element from the first set and a determined element of the elements from the second set that has a highest of the relationship strength values from the metric algorithms of the pairs with the element from the first set to produce a balanced result set that are a subset of the raw results, wherein pairs including each element from the first set with elements in the second set other than the determined element in the second set forming the balanced result set are excluded from the balanced results; and
returning the balanced result set.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32, 33, 34
Specification