Identifying entity mappings across data assets
First Claim
1. A computer program product, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code executable by at least one processor to perform:
- generating entity mappings that produce matching entities for a first data asset having attributes with attribute values and a second data asset having attributes with attribute values by;
matching the attribute values of the attributes of the first data asset with the attribute values of the attributes of the second data asset;
using the matching attribute values to generate matching attribute pairs; and
using the matching attribute pairs to identify entity mappings;
computing an entity mapping score for each of the entity mappings based on a combination of factors;
ranking the entity mappings based on each entity mapping score; and
using the ranked entity mappings to determine which of the entity mappings are to be used to determine whether a same real-world entity is described by the first data asset and the second data asset.
1 Assignment
0 Petitions
Accused Products
Abstract
Entity mappings that produce matching entities for a first data asset having attributes and a second data asset having attributes are generated by: generating entity mappings that produce matching entities for a first data asset having attributes with attribute values and a second data asset having attributes with attribute values by: matching the attribute values of the attributes of the first data asset with the attribute values of the attributes of the second data asset, using the matching attribute values to generate matching attribute pairs, and using the matching attribute pairs to identify entity mappings; computing an entity mapping score for each of the entity mappings based on a combination of factors; ranking the entity mappings based on each entity mapping score; and using some of the ranked entity mappings to determine whether a same real-world entity is described by the first data asset and the second data asset.
-
Citations
16 Claims
-
1. A computer program product, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code executable by at least one processor to perform:
-
generating entity mappings that produce matching entities for a first data asset having attributes with attribute values and a second data asset having attributes with attribute values by; matching the attribute values of the attributes of the first data asset with the attribute values of the attributes of the second data asset; using the matching attribute values to generate matching attribute pairs; and using the matching attribute pairs to identify entity mappings; computing an entity mapping score for each of the entity mappings based on a combination of factors; ranking the entity mappings based on each entity mapping score; and using the ranked entity mappings to determine which of the entity mappings are to be used to determine whether a same real-world entity is described by the first data asset and the second data asset. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A computer system, comprising:
-
one or more processors, one or more computer-readable memories and one or more computer-readable, tangible storage devices; and program instructions, stored on at least one of the one or more computer-readable, tangible storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, to perform operations comprising; generating entity mappings that produce matching entities for a first data asset having attributes with attribute values and a second data asset having attributes with attribute values by; matching the attribute values of the attributes of the first data asset with the attribute values of the attributes of the second data asset; using the matching attribute values to generate matching attribute pairs; and using the matching attribute pairs to identify entity mappings; computing an entity mapping score for each of the entity mappings based on a combination of factors; ranking the entity mappings based on each entity mapping score; and using the ranked entity mappings to determine which of the entity mappings are to be used to determine whether a same real-world entity is described by the first data asset and the second data asset. - View Dependent Claims (10, 11, 12, 13, 14, 15, 16)
-
Specification