DATA PROCESSING
First Claim
1. A method, said method comprising:
- identifying, by one or more processors of a computer system, a plurality of entities within a first data source;
for each entity identified within the first data source, said one or more processors identifying within the first data source attributes of the entity identified within the first data source and/or relationships between the entity identified within the first data source and other entities identified within the first data source, and associating the attributes and/or relationships identified within the first data source with a first entity identified within a data structure;
said one or more processors generating, for each entity identified within the first data source, a frequency metric characterizing the entity identified within the first data source, said frequency metric based on a frequency at which each attribute and/or relationship identified within the first data source is associated with the entity identified within the first data source, and
said one or more processors identifying a degree of similarity between two entities of the plurality of entities by comparing the respective frequency metrics of the two entities.
Abstract
A method and associated system. Entities within a first data source are identified. For each entity identified within the first data source, attributes of the entity identified within the first data source and/or relationships between the entity identified within the first data source and other entities identified within the first data source are identified. The attributes and/or relationships identified within the first data source are associated with a first entity identified within a data structure. For each entity identified within the first data source, a frequency metric characterizing the entity identified within the first data source is generated. The frequency metric is based on a frequency at which each attribute and/or relationship identified within the first data source is associated with the entity identified within the first data source. A degree of similarity between two of the entities is identified by comparing the frequency metrics of the two entities.
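The frequency-metric comparison summarized above can be illustrated with a minimal sketch. It is not the claimed implementation: the source does not specify how the metric is encoded or how two metrics are compared. The sketch assumes each entity's metric is a mapping from attribute/relationship labels to relative frequencies and that similarity is measured by cosine similarity. The names `frequency_metric`, `similarity`, and the sample data are hypothetical.

```python
from collections import Counter
from math import sqrt


def frequency_metric(observations):
    """Map each attribute/relationship label to its relative frequency.

    Assumption: the metric is a normalized count of how often each label
    is associated with the entity in the data source.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def similarity(metric_a, metric_b):
    """Cosine similarity of two frequency metrics (an assumed comparison)."""
    shared = set(metric_a) & set(metric_b)
    dot = sum(metric_a[k] * metric_b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in metric_a.values()))
    norm_b = sqrt(sum(v * v for v in metric_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data source: each entity with the attribute/relationship
# labels observed for it, one label per occurrence.
source = {
    "alice": ["attr:engineer", "rel:works_with:bob", "attr:engineer"],
    "bob": ["attr:engineer", "rel:works_with:alice", "attr:engineer"],
    "carol": ["attr:lawyer"],
}

metrics = {entity: frequency_metric(obs) for entity, obs in source.items()}
alice_bob = similarity(metrics["alice"], metrics["bob"])
alice_carol = similarity(metrics["alice"], metrics["carol"])
```

Under these assumptions, entities that share frequently occurring attributes (here "alice" and "bob") score higher than entities with no labels in common ("alice" and "carol"). Any other vector comparison, such as a distance measure, could be substituted for the cosine step.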
12 Citations
25 Claims
1. A method, said method comprising:
identifying, by one or more processors of a computer system, a plurality of entities within a first data source;
for each entity identified within the first data source, said one or more processors identifying within the first data source attributes of the entity identified within the first data source and/or relationships between the entity identified within the first data source and other entities identified within the first data source, and associating the attributes and/or relationships identified within the first data source with a first entity identified within a data structure;
said one or more processors generating, for each entity identified within the first data source, a frequency metric characterizing the entity identified within the first data source, said frequency metric based on a frequency at which each attribute and/or relationship identified within the first data source is associated with the entity identified within the first data source, and
said one or more processors identifying a degree of similarity between two entities of the plurality of entities by comparing the respective frequency metrics of the two entities.
Dependent claims: 2-23.
24. A computer program product, comprising one or more computer readable hardware storage devices having computer readable program code stored therein, said program code containing instructions executable by one or more processors of a computer system to implement a method, said method comprising:
said one or more processors identifying a plurality of entities within a first data source;
for each entity identified within the first data source, said one or more processors identifying within the first data source attributes of the entity identified within the first data source and/or relationships between the entity identified within the first data source and other entities identified within the first data source, and associating the attributes and/or relationships identified within the first data source with a first entity identified within a data structure;
said one or more processors generating, for each entity identified within the first data source, a frequency metric characterizing the entity identified within the first data source, said frequency metric based on a frequency at which each attribute and/or relationship identified within the first data source is associated with the entity identified within the first data source; and
said one or more processors identifying a degree of similarity between two entities of the plurality of entities by comparing the respective frequency metrics of the two entities.
25. A computer system, comprising one or more processors, one or more memories, and one or more computer readable hardware storage devices, said one or more hardware storage devices containing program code executable by the one or more processors via the one or more memories to implement a method, said method comprising:
said one or more processors identifying a plurality of entities within a first data source;
for each entity identified within the first data source, said one or more processors identifying within the first data source attributes of the entity identified within the first data source and/or relationships between the entity identified within the first data source and other entities identified within the first data source, and associating the attributes and/or relationships identified within the first data source with a first entity identified within a data structure;
said one or more processors generating, for each entity identified within the first data source, a frequency metric characterizing the entity identified within the first data source, said frequency metric based on a frequency at which each attribute and/or relationship identified within the first data source is associated with the entity identified within the first data source; and
said one or more processors identifying a degree of similarity between two entities of the plurality of entities by comparing the respective frequency metrics of the two entities.
Specification