Data enrichment using business compendium
First Claim
1. A computer system comprising:
one or more processors;
a software program, executable on said computer system, the software program configured to;
cause an enrichment engine to receive an input data set comprising a first supplier and a second supplier;
cause the enrichment engine to perform standardization and cleansing of the input data set, the standardization comprising validating supplier addresses to a locality/city level;
cause the enrichment engine to identify duplicate entries in the input data set;
cause a matching component of the enrichment engine to perform matching of non-duplicate entries of the input data set to a business compendium having data compiled from a third party source, wherein the matching is performed according to criteria comprising, a first priority comprising name or address having a score greater than a minimum and changed manually in the past, a second priority comprising name or address having the score greater than the minimum and generated by a matching engine, a third priority comprising the third party source as a preferred data provider, and a fourth priority comprising a most recent updating;
cause the enrichment engine to create an enriched data set including additional information based upon the matching, wherein the additional information comprises common corporate ownership and a unique supplier location indicating the first supplier as a subsidiary of the second supplier; and
providing the enriched data set to a user for manual review.
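The four matching priorities recited in the claim can be illustrated with a small sketch. It assumes one possible reading, in which the priorities act as ranked tiers for choosing among candidate compendium entries. The `Candidate` fields, the `MIN_SCORE` value, and the function names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from datetime import date

MIN_SCORE = 0.8  # hypothetical minimum score; the claim does not give a value


@dataclass(frozen=True)
class Candidate:
    """One candidate compendium entry for an input supplier (illustrative fields)."""
    compendium_id: str
    score: float              # name-or-address similarity score
    manually_changed: bool    # match was changed manually in the past
    preferred_provider: bool  # third party source is a preferred data provider
    last_updated: date        # date of the most recent update


def priority_key(c, min_score=MIN_SCORE):
    """Build a sort key in which a larger tuple means a higher priority."""
    above = c.score > min_score
    return (
        above and c.manually_changed,      # first priority: manual change
        above and not c.manually_changed,  # second priority: engine-generated
        c.preferred_provider,              # third priority: preferred provider
        c.last_updated,                    # fourth priority: most recent update
    )


def best_match(candidates, min_score=MIN_SCORE):
    """Return the highest-priority candidate, or None if there are none."""
    return max(candidates, key=lambda c: priority_key(c, min_score), default=None)


candidates = [
    Candidate("A", 0.95, False, True, date(2020, 1, 1)),
    Candidate("B", 0.85, True, False, date(2015, 1, 1)),
    Candidate("C", 0.50, True, True, date(2021, 1, 1)),
]
chosen = best_match(candidates)
```

Under this reading, candidate "B" is chosen: although "A" scores higher, "B" clears the minimum and was changed manually in the past, which places it in the first tier. Candidate "C" falls below the minimum and competes only on the third and fourth priorities.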
Abstract
Embodiments relate to enrichment of a data warehouse utilizing a business compendium. Embodiments may employ a process comprising data standardization and cleansing, de-duplication of entries, and matching and enrichment, followed by manual review of an enriched record by a user. During standardization, data may be transformed into consistent content, placing correct data elements into appropriate fields, removing invalid characters, and/or standardizing names and addresses. Duplicate records are then detected and marked. During matching and enrichment, the existence of an entity (such as a supplier) may be verified by progressive matching against the business compendium. Enrichment may provide additional information regarding the entity (e.g. related to risk, diversity, and bankruptcy). The enriched record is available for manual review, allowing the user to change duplicates, matches, and parent/child linkages. Feedback from the user review may enhance accuracy of subsequent enrichment by self-learning aspects, reducing over time a need for manual review.
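The standardization and duplicate-marking steps described in the abstract might look roughly like the following sketch. The suffix table, the record layout (`name` and `city` keys), and the rule that two records are duplicates when their standardized name and city agree are all assumptions made for illustration, not details taken from the patent.

```python
import re

# Hypothetical mapping of legal-form suffixes to one standardized spelling.
_SUFFIXES = {
    "inc": "INC", "incorporated": "INC",
    "corp": "CORP", "corporation": "CORP",
    "ltd": "LTD", "limited": "LTD",
}


def standardize_name(raw):
    """Remove invalid characters, upper-case tokens, and unify legal suffixes."""
    cleaned = re.sub(r"[^A-Za-z0-9& ]+", " ", raw)
    tokens = cleaned.split()
    return " ".join(_SUFFIXES.get(t.lower(), t.upper()) for t in tokens)


def mark_duplicates(records):
    """Return copies of the records, marking duplicates rather than dropping them.

    Each output record gains 'std_name' and 'duplicate_of' (index of the first
    record with the same standardized name and city, or None).
    """
    first_seen = {}
    marked = []
    for idx, rec in enumerate(records):
        key = (standardize_name(rec["name"]), rec["city"].strip().upper())
        marked.append(dict(rec, std_name=key[0], duplicate_of=first_seen.get(key)))
        first_seen.setdefault(key, idx)
    return marked


suppliers = [
    {"name": "Acme, Inc.", "city": "Boston"},
    {"name": "ACME Incorporated", "city": " boston "},
    {"name": "Globex Corp", "city": "Austin"},
]
result = mark_duplicates(suppliers)
```

Here the second record is marked as a duplicate of the first, and all three records remain in the output so that a reviewer could still change the duplicate marking later, in line with the manual review step the abstract describes.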
53 Citations
8 Claims
Specification