USING VERTEX SELF-INFORMATION SCORES FOR VERTICES IN AN ENTITY GRAPH TO DETERMINE WHETHER TO PERFORM ENTITY RESOLUTION ON THE VERTICES IN THE ENTITY GRAPH
Abstract
Provided are a computer program product, system, and method to determine whether to perform entity resolution on vertices in an entity graph. A determination is made of pairs of records in a database having a relationship value satisfying a threshold. An entity relationship graph has a vertex for each of the records of the pairs and an edge between two vertices. Each vertex has a self-information score based on content in the record, an initial unique entity identifier, and an entity information score. For each subject vertex of the vertices, a determination is made of a target vertex directly connected to the subject vertex that has a highest entity information score and whether to set the subject vertex entity identifier and entity information score to the entity identifier and entity information score of the target vertex based on the target vertex self-information score.
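The graph construction described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the relationship value or the self-information score is computed, so `field_overlap` and `field_count` are hypothetical stand-ins, and "satisfying a threshold" is read here as "greater than or equal to".

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Vertex:
    record_id: str
    self_info: float    # self-information score derived from the record's content
    entity_id: str      # initial unique entity identifier (here: the record id)
    entity_info: float  # entity information score, initially equal to self_info


def field_overlap(a, b):
    """Hypothetical relationship value: share of identical field/value pairs."""
    shared = set(a.items()) & set(b.items())
    union = set(a.items()) | set(b.items())
    return len(shared) / len(union) if union else 0.0


def field_count(record):
    """Hypothetical self-information score: number of populated fields."""
    return float(len(record))


def build_entity_graph(records, relationship, self_information, threshold):
    """Create a vertex per record in a qualifying pair and an edge per pair."""
    vertices, edges = {}, {}
    for a, b in combinations(sorted(records), 2):
        # Assumption: a pair qualifies when its relationship value >= threshold.
        if relationship(records[a], records[b]) < threshold:
            continue
        for rid in (a, b):
            if rid not in vertices:
                score = self_information(records[rid])
                vertices[rid] = Vertex(rid, score, entity_id=rid, entity_info=score)
                edges[rid] = set()
        edges[a].add(b)
        edges[b].add(a)
    return vertices, edges


records = {
    "r1": {"name": "Ann Lee", "phone": "555"},
    "r2": {"name": "Ann Lee", "phone": "555", "email": "a@x"},
    "r3": {"name": "Bob"},
}
vertices, edges = build_entity_graph(records, field_overlap, field_count, threshold=0.5)
```

In this sketch only records that appear in at least one qualifying pair receive a vertex, so `r3` is absent from the graph.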
24 Claims
1-15. (canceled)
16. A method for entity resolution of records in a database, comprising:
determining pairs of records in the database having a relationship value satisfying a threshold;
generating an entity relationship graph having a vertex for each of the records of the pairs and an edge for each of the determined pairs between two vertices representing records in one of the determined pairs, wherein each vertex is associated with a self-information score based on content in the record represented by the vertex and is assigned an initial unique entity identifier and an entity information score, which is initially set to the information score of the vertex;
for each subject vertex of the vertices, performing:
    determining a target vertex directly connected to the subject vertex that has a highest entity information score of at least one vertex directly connected to the subject vertex that has an entity information score greater than the entity information score of the subject vertex; and
    determining whether to set the subject vertex entity identifier and entity information score to the entity identifier and entity information score of the target vertex based on the target vertex self-information score.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24
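The per-vertex step of claim 16 might look like the sketch below. The claim says only that the decision is made "based on the target vertex self-information score"; the minimum-score test (`min_target_self_info`) is an assumption introduced for illustration, as are the in-place update order and the names `Vertex` and `resolve_pass`.

```python
from dataclasses import dataclass


@dataclass
class Vertex:
    self_info: float    # self-information score of the underlying record
    entity_id: str      # current entity identifier
    entity_info: float  # current entity information score


def resolve_pass(vertices, edges, min_target_self_info):
    """Run one pass over all subject vertices; return True if any vertex changed."""
    changed = False
    for vid in sorted(vertices):
        subject = vertices[vid]
        # Directly connected vertices whose entity information score exceeds the subject's.
        candidates = [
            vertices[n]
            for n in edges.get(vid, ())
            if vertices[n].entity_info > subject.entity_info
        ]
        if not candidates:
            continue
        # Target vertex: the candidate with the highest entity information score.
        target = max(candidates, key=lambda v: v.entity_info)
        # Assumed decision rule: adopt only if the target's own record is informative enough.
        if target.self_info >= min_target_self_info:
            subject.entity_id = target.entity_id
            subject.entity_info = target.entity_info
            changed = True
    return changed


graph = {
    "a": Vertex(self_info=2.0, entity_id="a", entity_info=2.0),
    "b": Vertex(self_info=5.0, entity_id="b", entity_info=5.0),
    "c": Vertex(self_info=1.0, entity_id="c", entity_info=1.0),
}
links = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
first_pass = resolve_pass(graph, links, min_target_self_info=3.0)
second_pass = resolve_pass(graph, links, min_target_self_info=3.0)
```

With these sample values, `a` and `c` both adopt the identifier of `b` in the first pass, and the second pass finds nothing left to change. Repeating the pass until it returns `False` is one possible stopping rule; the claim itself does not specify one.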
Specification