Fact-based object merging
Abstract
A repository contains objects including facts about entities. Some objects might be associated with the same entity. An object merge engine identifies a set of merge candidate objects. A grouping module groups the merge candidate objects based on the values of facts included in the objects. An object comparison module compares pairs of objects in each group to identify evidence for and/or against merging the pair. Evidence for merging the pair exists if, e.g., the objects have a type in common or share an uncommon fact. Evidence against merging the pair exists if, e.g., the objects have differing singleton attributes. A graph generation module generates graphs describing the evidence for and/or against merging the pair. A merging module analyzes the graphs and merges objects associated with the same entity. The merged objects are stored in the repository.
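The evidence rules named in the abstract (a type in common, a shared uncommon fact, differing singleton attributes) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `Fact` tuple, the `SINGLETON_ATTRIBUTES` set, the integer weights, the `rare_threshold` cutoff for what counts as "uncommon", and the sample data.

```python
from collections import Counter
from typing import NamedTuple


class Fact(NamedTuple):
    attribute: str
    value: str


# Hypothetical: attributes assumed to hold a single value per entity.
SINGLETON_ATTRIBUTES = {"date of birth"}


def values_of(obj, attribute):
    """Collect every value an object holds for the given attribute."""
    return {fact.value for fact in obj if fact.attribute == attribute}


def merge_evidence(a, b, fact_counts, rare_threshold=2):
    """Signed score: positive favours merging the pair, negative opposes it.

    The weights (+1, +1, -3) are illustrative assumptions.
    """
    score = 0
    # Evidence for: the objects have a type in common.
    if values_of(a, "type") & values_of(b, "type"):
        score += 1
    # Evidence for: the objects share a fact that few objects contain.
    for fact in a & b:
        if fact.attribute != "type" and fact_counts[fact] <= rare_threshold:
            score += 1
    # Evidence against: both objects have a singleton attribute, with different values.
    for attribute in SINGLETON_ATTRIBUTES:
        va, vb = values_of(a, attribute), values_of(b, attribute)
        if va and vb and va != vb:
            score -= 3
    return score


# Illustrative data only.
obj_a = frozenset({Fact("name", "Jane Doe"), Fact("type", "person"),
                   Fact("date of birth", "1946-07-06"), Fact("spouse", "John Doe")})
obj_b = frozenset({Fact("name", "Jane Doe"), Fact("type", "person"),
                   Fact("spouse", "John Doe")})
obj_c = frozenset({Fact("name", "Jane Doe"), Fact("type", "person"),
                   Fact("date of birth", "1924-06-12")})

fact_counts = Counter(fact for obj in (obj_a, obj_b, obj_c) for fact in obj)
score_ab = merge_evidence(obj_a, obj_b, fact_counts)
score_ac = merge_evidence(obj_a, obj_c, fact_counts)
```

In this sketch the pair `obj_a`/`obj_b` scores positive because the two objects share a type and a rarely occurring spouse fact, while `obj_a`/`obj_c` scores negative because their dates of birth conflict. A signed score of this kind could serve as an edge weight in the graphs the abstract mentions, though the abstract does not say how evidence is weighted.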
30 Claims
1. A computer-implemented method of merging objects stored in a memory and associated with a same entity, comprising:
identifying a plurality of merge candidate objects, wherein each merge candidate object is created using facts extracted from one or more electronic documents, each fact comprises an attribute and a value, each merge candidate object comprises one or more facts describing an entity with which the object is associated, each merge candidate object includes at least one fact with a same attribute, and each merge candidate object is distinct from the one or more electronic documents and the entity associated with the object;
grouping the plurality of merge candidate objects in accordance with values corresponding to the same attribute in the at least one fact, the grouping including assigning a respective merge candidate object to a respective group of a plurality of groups, the respective group corresponding to the value of the at least one fact with the same attribute in the respective merge candidate object;
identifying similarities between objects in each group, the identifying including computing a similarity value that indicates an amount of similarity between a pair of objects in the group;
generating one or more graphs describing identified similarities among the objects in all of the groups;
analyzing the one or more graphs describing the similarities among the objects to identify two or more objects associated with the same entity;
merging the two or more objects associated with the same entity to produce a merged object that includes facts of the two or more objects associated with the same entity; and
storing the merged object in a repository in the memory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
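The steps recited in claim 1 (grouping by the value of a shared attribute, computing pairwise similarity within each group, building a graph, analyzing it, and merging) can be sketched in Python as follows. This is only one possible reading under stated assumptions: the claim does not specify a similarity measure or a graph analysis, so Jaccard overlap and connected components are stand-ins chosen for illustration, and the function names, the `threshold` parameter, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def group_by_attribute(objects, attribute):
    """Assign each object to the group keyed by its value for `attribute`."""
    groups = defaultdict(set)
    for obj_id, facts in objects.items():
        for attr, value in facts:
            if attr == attribute:
                groups[value].add(obj_id)
    return groups


def similarity(facts_a, facts_b):
    """Jaccard overlap of two fact sets (an assumed measure, not the patent's)."""
    union = facts_a | facts_b
    return len(facts_a & facts_b) / len(union) if union else 0.0


def build_graph(objects, groups, threshold):
    """Add an edge between two objects in the same group when they are similar enough."""
    graph = {obj_id: set() for obj_id in objects}
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            if similarity(objects[a], objects[b]) >= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def connected_components(graph):
    """Depth-first search; each component is treated as one entity (an assumption)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


def merge_objects(objects, attribute="name", threshold=0.5):
    """Return one merged fact set per component, holding the union of its facts."""
    groups = group_by_attribute(objects, attribute)
    graph = build_graph(objects, groups, threshold)
    return [frozenset().union(*(objects[i] for i in component))
            for component in connected_components(graph)]


# Illustrative data: each object is a set of (attribute, value) facts.
objects = {
    "o1": frozenset({("name", "Ada Lovelace"), ("born", "1815"), ("field", "mathematics")}),
    "o2": frozenset({("name", "Ada Lovelace"), ("born", "1815"), ("died", "1852")}),
    "o3": frozenset({("name", "Ada Lovelace"), ("type", "ship")}),
    "o4": frozenset({("name", "Alan Turing"), ("born", "1912")}),
}
merged = merge_objects(objects)
```

Grouping first means that only objects sharing a value for the chosen attribute are compared, which avoids comparing every pair in the repository. In the sample data, `o1` and `o2` end up merged into a single object carrying the facts of both, while `o3` stays separate despite sharing the name because its remaining facts do not overlap enough.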
13. A computer system for merging objects associated with a same entity, the computer system comprising:
a grouping module for identifying a plurality of merge candidate objects, wherein each merge candidate object is created using facts extracted from one or more electronic documents, each fact comprises an attribute and a value, each merge candidate object comprises one or more facts describing an entity with which the object is associated, each merge candidate object includes at least one fact with a same attribute, and each merge candidate object is distinct from the one or more electronic documents and the entity associated with the object, and for grouping the plurality of merge candidate objects in accordance with values corresponding to the same attribute in the at least one fact, the grouping including assigning a respective merge candidate object to a respective group of a plurality of groups, the respective group corresponding to the value of the at least one fact with the same attribute in the respective merge candidate object;
an object comparison module for identifying similarities between objects in each group, the identifying including computing a similarity value that indicates an amount of similarity between a pair of objects in the group;
a graph generation module for generating one or more graphs describing identified similarities among the objects of all of the groups; and
a merging module for analyzing the one or more graphs describing the similarities among the objects to identify two or more objects associated with the same entity, merging the two or more objects associated with the same entity to produce a merged object that includes facts of the two or more objects associated with the same entity, and storing the merged object in a repository.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21
22. A non-transitory computer-readable storage medium storing one or more instructions for execution by one or more processors, the one or more instructions comprising:
a grouping module for identifying a plurality of merge candidate objects, wherein each merge candidate object is created using facts extracted from one or more electronic documents, each fact comprises an attribute and a value, each merge candidate object comprises one or more facts describing an entity with which the object is associated, each merge candidate object includes at least one fact with a same attribute, and each merge candidate object is distinct from the one or more electronic documents and the entity associated with the object, and for grouping the plurality of merge candidate objects in accordance with values corresponding to the same attribute in the at least one fact, the grouping including assigning a respective merge candidate object to a respective group of a plurality of groups, the respective group corresponding to the value of the at least one fact with the same attribute in the respective merge candidate object;
an object comparison module for identifying similarities between objects in each group, the identifying including computing a similarity value that indicates an amount of similarity between a pair of objects in the group;
a graph generation module for generating one or more graphs describing identified similarities among the objects in all of the groups; and
a merging module for analyzing the one or more graphs describing the similarities among the objects to identify two or more objects associated with the same entity, merging the two or more objects associated with the same entity to produce a merged object that includes facts of the two or more objects associated with the same entity, and storing the merged object in a repository.
Dependent claims: 23, 24, 25, 26, 27, 28, 29, 30
Specification