Reducing churn in knowledge graphs
First Claim
1. A method for reducing churn in identifier assignment for entities in a knowledge graph, comprising:
identifying a plurality of aliases for a plurality of entities maintained in the knowledge graph, wherein each alias of the plurality of aliases is associated with one entity of the plurality of entities, and wherein each entity of the plurality of entities is associated with an entity identifier;
for each alias of the plurality of aliases, associating the alias with the entity identifier for the entity to which the alias is associated; and
in response to an update to the knowledge graph:
    clustering the plurality of aliases based on the update into a plurality of alias clusters, wherein each alias retains the association with the entity identifier made prior to the update; and
    for each alias cluster of the plurality of alias clusters, associating the alias cluster with one entity of the plurality of entities by assigning an entity identifier of the one entity to the alias cluster, wherein assigning the entity identifier of the one entity to the alias cluster comprises:
        identifying each unique entity identifier associated with aliases of the alias cluster;
        for each unique entity identifier, determining a number of the aliases associated with the entity identifier;
        identifying, as a most frequent entity identifier, one entity identifier among each unique entity identifier that has a highest determined number of the aliases associated; and
        assigning the most frequent entity identifier to the alias cluster and to the aliases of the alias cluster.
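The assignment steps recited in the claim can be illustrated with a minimal Python sketch. It is only one possible reading, not the patentee's implementation: the function name `assign_cluster_ids`, the dictionary layout, and the sample aliases and identifiers are all hypothetical. The claim does not say how ties between equally frequent identifiers are resolved; this sketch simply takes whichever tied identifier was counted first.

```python
from collections import Counter


def assign_cluster_ids(clusters, prior_id):
    """Give each post-update alias cluster its most frequent pre-update entity ID.

    clusters: dict mapping a cluster key to the list of aliases in that cluster
    prior_id: dict mapping each alias to the entity ID it had before the update
    (Both layouts are assumptions made for this illustration.)
    """
    cluster_ids = {}
    alias_ids = {}
    for key, aliases in clusters.items():
        # Count how many aliases in the cluster carry each unique prior ID.
        counts = Counter(prior_id[alias] for alias in aliases)
        # Pick the ID with the highest count (ties: first one counted).
        top_id, _ = counts.most_common(1)[0]
        # Assign that ID to the cluster and to every alias in it.
        cluster_ids[key] = top_id
        for alias in aliases:
            alias_ids[alias] = top_id
    return cluster_ids, alias_ids


# Hypothetical data: before the update, E1 had three aliases and E2 had two.
prior_id = {"a1": "E1", "a2": "E1", "a3": "E1", "b1": "E2", "b2": "E2"}
# After the update, alias b1 has been clustered with the E1 aliases.
clusters = {"c0": ["a1", "a2", "a3", "b1"], "c1": ["b2"]}

cluster_ids, alias_ids = assign_cluster_ids(clusters, prior_id)
print(cluster_ids)      # {'c0': 'E1', 'c1': 'E2'}
print(alias_ids["b1"])  # E1
```

In this example both existing identifiers survive the update, and only alias b1 changes identifier. The sketch does not prevent two clusters from receiving the same identifier, which can happen when one entity is split.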
Abstract
Reducing churn when assigning identifiers to entities in a knowledge graph enables several improvements to the functionality of the computing devices that maintain or access knowledge graphs. As the aliases or other terms used to identify a given entity change in response to updates to the knowledge graph, the identifiers assigned to various entities may change. For example, an update may split two individual entities that were conflated as one node into two nodes, or merge two nodes into one; in either case, existing identifiers should be reused to reduce churn. To select which existing identifier to assign to a given updated entity, the aliases are clustered with the updated entities, and the unique modal prior identifier (the identifier most frequently associated with the cluster's aliases before the update) is assigned as the identifier for the updated entity. Higher orders of modality are used to ensure that as many existing identifiers as possible are reused before new identifiers are created.
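The abstract does not spell out how "higher orders of modality" are applied, so the following Python sketch is an interpretation under stated assumptions rather than the patented procedure. It assumes that clusters are processed in rounds: in round k, each still-unassigned cluster proposes its k-th most frequent prior identifier. It also assumes that when several clusters propose the same identifier, the cluster with the most supporting aliases wins, and that a cluster left without any reusable identifier receives a fresh one. The function name `assign_with_fallback` and the `NEW-` prefix are hypothetical.

```python
from collections import Counter
from itertools import count


def assign_with_fallback(clusters, prior_id, new_id_prefix="NEW-"):
    """Reuse each existing ID for at most one cluster, falling back to
    lower-ranked prior IDs before minting new ones (assumed interpretation)."""
    # For each cluster, rank its prior IDs from most to least frequent.
    # Aliases with no prior ID (newly added aliases) are ignored.
    ranked = {
        key: Counter(prior_id[a] for a in aliases if a in prior_id).most_common()
        for key, aliases in clusters.items()
    }
    assigned = {}  # cluster key -> entity ID
    taken = set()  # existing IDs already given to some cluster
    max_rank = max((len(r) for r in ranked.values()), default=0)

    for rank in range(max_rank):
        # Each unassigned cluster proposes its rank-th most frequent ID.
        proposals = {}
        for key, r in ranked.items():
            if key in assigned or rank >= len(r):
                continue
            ent_id, support = r[rank]
            if ent_id not in taken:
                proposals.setdefault(ent_id, []).append((support, key))
        for ent_id, bidders in proposals.items():
            # Assumption: most supporting aliases wins; remaining ties are
            # broken by comparing cluster keys, so keys must be comparable.
            _, winner = max(bidders)
            assigned[winner] = ent_id
            taken.add(ent_id)

    # Only clusters with no reusable ID left receive a new identifier.
    fresh = count(1)
    for key in clusters:
        if key not in assigned:
            assigned[key] = f"{new_id_prefix}{next(fresh)}"
    return assigned


# Hypothetical split: E1's node is divided, and alias a6 is new.
prior_id = {"a1": "E1", "a2": "E1", "a3": "E1", "a4": "E1", "a5": "E2"}
clusters = {"c0": ["a1", "a2", "a3"], "c1": ["a4", "a5"], "c2": ["a6"]}
print(assign_with_fallback(clusters, prior_id))
# {'c0': 'E1', 'c1': 'E2', 'c2': 'NEW-1'}
```

Here cluster c1 loses E1 to c0 in the first round and falls back to E2 in the second, so both existing identifiers are reused and only the cluster with no prior identifier receives a new one.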
20 Claims
1. A method for reducing churn in identifier assignment for entities in a knowledge graph, as recited in full under First Claim.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for reducing churn in identifier assignment for entities in a knowledge graph, comprising:
a processor; and
a memory storage device including instructions that when executed by the processor are operable to:
    maintain aliases in association with the entities from the knowledge graph, wherein each alias is associated in an alias cluster with an entity identifier; and
    in response to an update to the knowledge graph:
        produce a plurality of updated alias clusters, wherein each alias retains the association with the entity identifier made prior to the update; and
        for each updated alias cluster of the plurality of updated alias clusters, associate the updated alias cluster with one entity of the entities from the knowledge graph by assigning an entity identifier of the one entity to the updated alias cluster, wherein assigning the entity identifier of the one entity to the updated alias cluster comprises:
            identifying each unique entity identifier associated with aliases of the updated alias cluster;
            for each unique entity identifier, determining a number of the aliases associated with the entity identifier;
            identifying, as a modal entity identifier of the updated alias cluster, one entity identifier among each unique entity identifier that has a highest determined number of the aliases associated; and
            assigning the modal entity identifier to the updated alias cluster and to the aliases of the updated alias cluster.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer readable storage device including instructions reducing churn in identifier assignment for entities in a knowledge graph that when executed by a processor in response to an update to the knowledge graph comprise:
clustering, based on the update, a plurality of entity aliases into a plurality of alias clusters, wherein each alias is associated with a pre-update entity identifier to which each alias was associated prior to the update; and
for each alias cluster of the plurality of alias clusters, associating the alias cluster with one entity in the knowledge graph by assigning an entity identifier of the one entity to the alias cluster, wherein assigning the entity identifier of the one entity to the alias cluster comprises:
    identifying each unique entity identifier associated with aliases of the alias cluster;
    for each unique entity identifier, determining a number of the aliases associated with the entity identifier;
    identifying, as a most frequent entity identifier, one entity identifier among each unique entity identifier that has a highest determined number of the aliases associated; and
    assigning the most frequent entity identifier to the alias cluster and the aliases of the alias cluster.
Dependent claims: 18, 19, 20
Specification