Internal Linking Co-Convergence Using Clustering With Hierarchy
First Claim
1. A computer-implemented method comprising:
clustering hierarchical database records into a first set of clusters having corresponding first cluster identifications (IDs), each hierarchical database record comprising one or more field values, the clustering based at least in part on determining similarity among corresponding field values of the hierarchical database records;
determining parent-child hierarchical relationships among the hierarchical database records;
associating related hierarchical database records by applying a hierarchal directional linking process, the hierarchal directional linking process comprising selecting and applying at least an upward process based on the determined parent-child hierarchical relationship wherein the upward process comprises;
determining, from the parent-child hierarchical relationships, similarity among a plurality of child records having separate parent records; and
in response to determining a threshold similarity among the plurality of child records, inferring that the separate parent records correspond to the same entity;
re-clustering at least a portion of the database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the associating related hierarchical database records and on the determining similarity among corresponding field values of the database records; and
outputting database record information, based at least in part on the re-clustering.
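The upward process recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the record layout, the exact-match `field_similarity` measure, the `0.8` threshold, and the union-find merge are all assumptions made for illustration. Child records under separate parents are compared, and when they meet the threshold, the clusters of their parents are merged to give the second cluster IDs.

```python
from itertools import combinations


def field_similarity(a, b):
    """Hypothetical measure: fraction of shared fields whose values match exactly."""
    keys = set(a) & set(b)
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def find(root, x):
    """Union-find lookup with path halving."""
    while root[x] != x:
        root[x] = root[root[x]]
        x = root[x]
    return x


def upward_link(records, first_ids, threshold=0.8):
    """Merge parent clusters when children of separate parents look alike.

    records:   {record_id: {"parent": record_id or None, "fields": {...}}}
    first_ids: {record_id: first cluster ID}
    Returns {record_id: second cluster ID}.
    """
    root = {cid: cid for cid in set(first_ids.values())}
    children = [r for r, rec in records.items() if rec["parent"] is not None]
    for a, b in combinations(children, 2):
        pa, pb = records[a]["parent"], records[b]["parent"]
        if pa == pb:
            continue  # only children with separate parents are informative
        if field_similarity(records[a]["fields"], records[b]["fields"]) >= threshold:
            # Infer that the separate parents correspond to the same entity.
            ra, rb = find(root, first_ids[pa]), find(root, first_ids[pb])
            if ra != rb:
                root[max(ra, rb)] = min(ra, rb)
    return {r: find(root, c) for r, c in first_ids.items()}


# Illustrative data (invented for this sketch).
store = {"name": "Acme Store 12", "address": "1 Main St", "phone": "555-0100"}
records = {
    "P1": {"parent": None, "fields": {"name": "Acme Corp"}},
    "P2": {"parent": None, "fields": {"name": "ACME Corporation"}},
    "P3": {"parent": None, "fields": {"name": "Zenith LLC"}},
    "C1": {"parent": "P1", "fields": dict(store)},
    "C2": {"parent": "P2", "fields": dict(store)},
    "C3": {"parent": "P3", "fields": {"name": "Zenith Depot", "address": "9 Oak Ave", "phone": "555-0199"}},
}
first_ids = {"P1": 1, "P2": 2, "P3": 4, "C1": 3, "C2": 3, "C3": 5}
second_ids = upward_link(records, first_ids)
```

In this example the two parent records have different name strings and start in separate clusters, but their matching children cause them to share a second cluster ID, while the unrelated parent is left alone.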
Abstract
Certain implementations of the disclosed technology include systems and methods for internal co-convergence using clustering when there is hierarchy in the data structure. A method is included for clustering hierarchical database records into a first set of clusters having corresponding first cluster identifications (IDs), each hierarchical database record including one or more field values, the clustering based at least in part on determining similarity among corresponding field values of the hierarchical database records. The method includes receiving parent-child hierarchical relationship information for the hierarchical database records, re-clustering at least a portion of the hierarchical database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the received parent-child hierarchical relationship information, and outputting hierarchical database record information, based at least in part on the re-clustering.
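As a rough illustration of re-clustering from received parent-child information, the sketch below refines the first clusters so that two records stay together only if their parents were also clustered together. The abstract does not specify how the hierarchy is used, so this refinement rule, the function name `recluster_with_hierarchy`, and the sample data are all assumptions.

```python
def recluster_with_hierarchy(first_ids, parent_of):
    """Assign second cluster IDs from first cluster IDs plus parent links.

    first_ids: {record_id: first cluster ID}
    parent_of: {child_record_id: parent_record_id} (received hierarchy)
    Records share a second cluster ID only when they share a first
    cluster ID AND their parents share a first cluster ID (assumed rule).
    """
    key_to_new = {}
    second_ids = {}
    for rec in sorted(first_ids):
        parent = parent_of.get(rec)
        parent_cluster = first_ids.get(parent) if parent is not None else None
        key = (first_ids[rec], parent_cluster)
        # Number new clusters in order of first appearance.
        second_ids[rec] = key_to_new.setdefault(key, len(key_to_new) + 1)
    return second_ids


# Illustrative data (invented for this sketch): three stores with similar
# field values were placed in one first cluster, but one of them belongs
# to a different headquarters.
first_ids = {"hq_a": 1, "hq_b": 2, "store_1": 3, "store_2": 3, "store_3": 3}
parent_of = {"store_1": "hq_a", "store_2": "hq_a", "store_3": "hq_b"}
second_ids = recluster_with_hierarchy(first_ids, parent_of)
```

Here `store_3` is split from the other two stores because its parent sits in a different cluster. Other uses of the hierarchy, such as merging rather than splitting, would fit the abstract equally well.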
23 Claims
9. A computer-implemented method comprising:
clustering hierarchical database records into a first set of clusters having corresponding first cluster identifications (IDs), each hierarchical database record comprising one or more field values, the clustering based at least in part on determining similarity among corresponding field values of the hierarchical database records;
receiving parent-child hierarchical relationship information for the hierarchical database records;
re-clustering at least a portion of the hierarchical database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the received parent-child hierarchical relationship information; and
outputting hierarchical database record information, based at least in part on the re-clustering.
15. A system comprising:
at least one memory for storing data and computer-executable instructions; and
at least one processor configured to access the at least one memory and further configured to execute the computer-executable instructions for;
clustering hierarchical database records into a first set of clusters having corresponding first cluster identifications (IDs), each hierarchical database record comprising one or more field values, the clustering based at least in part on determining similarity among corresponding field values of the hierarchical database records;
when a hierarchy structure of the hierarchical database records is unavailable;
determining parent-child hierarchical relationships among the hierarchical database records;
associating related hierarchical database records by applying a hierarchal directional linking process, the hierarchal directional linking process comprising selecting and applying at least an upward process based on the determined parent-child hierarchical relationship wherein the upward process comprises;
determining, from the parent-child hierarchical relationships, similarity among a plurality of child records having separate parent records; and
in response to determining a threshold similarity among that the plurality of child records, inferring that the separate parent records correspond to the same entity;
re-clustering at least a portion of the hierarchical database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the associating related hierarchical database records and on the determining similarity among corresponding field values of the database records; and
when a hierarchy structure of the hierarchical database records is available;
receiving parent-child hierarchical relationship information for the hierarchical database records;
re-clustering at least a portion of the hierarchical database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the received parent-child hierarchical relationship information; and
outputting hierarchical database record information, based at least in part on the re-clustering.
23. A non-transitory computer readable media comprising computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform a method comprising:
clustering hierarchical database records into a first set of clusters having corresponding first cluster identifications (IDs), each hierarchical database record comprising one or more field values, the clustering based at least in part on determining similarity among corresponding field values of the hierarchical database records;
when a hierarchy structure of the hierarchical database records is unavailable;
determining parent-child hierarchical relationships among the hierarchical database records;
associating related hierarchical database records by applying a hierarchal directional linking process, the hierarchal directional linking process comprising selecting and applying at least an upward process based on the determined parent-child hierarchical relationship wherein the upward process comprises;
determining, from the parent-child hierarchical relationships, similarity among a plurality of child records having separate parent records; and
in response to determining a threshold similarity among that the plurality of child records, inferring that the separate parent records correspond to the same entity;
re-clustering at least a portion of the hierarchical database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the associating related hierarchical database records and on the determining similarity among corresponding field values of the database records; and
when a hierarchy structure of the hierarchical database records is available;
receiving parent-child hierarchical relationship information for the hierarchical database records;
re-clustering at least a portion of the hierarchical database records into a second set of clusters having corresponding second cluster IDs, the re-clustering based at least in part on the received parent-child hierarchical relationship information; and
outputting hierarchical database record information, based at least in part on the re-clustering.
Specification