Asset repository hub
Abstract
Systems and methods for managing data assets in an enterprise computing environment are provided. Data assets associated with a plurality of source systems operating within the enterprise computing environment may be registered with an asset repository hub. The asset repository hub receives a registration request from a source system for a particular data asset and determines whether the data asset is unique within the enterprise computing environment. If the data asset is unique, then the data asset is stored at the asset repository hub and a unique enterprise identifier is associated with the data asset. In determining whether the asset is unique, the asset repository hub may cleanse the data asset against a set of business rules; generate a plurality of match codes that describe the content of the data asset; and cluster the generated match codes against clusters of previously generated match codes. Also provided herein is a mechanism for searching and locating data assets stored within the enterprise computing environment by submitting queries to the asset repository hub.
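The registration flow described in the abstract (cleanse, generate match codes, cluster, then assign an identifier only when the asset is new) can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation. The names `AssetRepositoryHub`, `cleanse`, `match_code` and `register` are hypothetical, the "business rules" are reduced to simple text normalization, and the asset content is passed in directly rather than being obtained through a reference in the registration request.

```python
import re
import uuid
from dataclasses import dataclass, field


def cleanse(text: str) -> str:
    """Hypothetical business rules: lowercase, drop punctuation, collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def match_code(cleansed: str) -> str:
    """Hypothetical match code: sorted tokens, so word order does not affect matching."""
    return " ".join(sorted(cleansed.split()))


@dataclass
class AssetRepositoryHub:
    # match code -> enterprise identifier of the cluster that owns it
    clusters: dict = field(default_factory=dict)
    # enterprise identifier -> cleansed asset stored at the hub
    assets: dict = field(default_factory=dict)

    def register(self, raw_asset: str):
        """Return (enterprise_id, is_new) for a registration request."""
        cleansed = cleanse(raw_asset)
        code = match_code(cleansed)
        if code in self.clusters:
            # The match code falls into an existing cluster: not unique.
            return self.clusters[code], False
        # Unique asset: assign an enterprise identifier and store it.
        enterprise_id = str(uuid.uuid4())
        self.clusters[code] = enterprise_id
        self.assets[enterprise_id] = cleansed
        return enterprise_id, True


hub = AssetRepositoryHub()
first_id, first_is_new = hub.register("Quarterly Sales Report, 2011")
second_id, second_is_new = hub.register("  2011 quarterly sales REPORT ")
```

In this sketch the second registration resolves to the identifier already assigned to the first, because both inputs cleanse to the same tokens and therefore produce the same match code.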
23 Claims
1. A method of managing data assets in an enterprise computing environment having a plurality of source systems storing a plurality of data assets, comprising:
transmitting, using one or more computer processors, a registration request from a source system to an asset repository hub including a master reference schema, the registration request including a reference to a data asset within the enterprise computing environment;
receiving the registration request at the asset repository hub;
using the received registration request to obtain the referenced data asset;
cleansing the referenced data asset against a set of business rules;
applying one or more matching algorithms to the cleansed referenced data asset to generate one or more match codes, wherein each matching algorithm includes a set of parameters, and wherein each generated match code includes string data configured for imprecise matching within the master reference schema, such that a plurality of match codes can be matched to a single data asset; and
clustering the generated one or more match codes against clusters of previously generated match codes to determine whether the cleansed referenced data asset is unique to the enterprise computing environment, wherein when the cleansed referenced data asset is unique, the cleansed referenced data asset is assigned a unique enterprise identifier and stored at the asset repository hub.
Dependent claims: 2-21.

22. A method of managing data assets in an enterprise computing environment having a plurality of source systems storing a plurality of data assets, comprising:
receiving, using one or more computer processors, a registration request at an asset repository hub including a master reference schema, the registration request including a reference to a data asset within the enterprise computing environment;
using the received registration request to obtain the referenced data asset;
cleansing the referenced data asset against a set of business rules;
applying one or more matching algorithms to the cleansed referenced data asset to generate one or more match codes, wherein each matching algorithm includes a set of parameters, and wherein each generated match code includes string data configured for imprecise matching within the master reference schema, such that a plurality of match codes can be matched to a single data asset; and
clustering the generated one or more match codes against clusters of previously generated match codes to determine whether the cleansed referenced data asset is unique to the enterprise computing environment;
determining that the cleansed referenced data asset is unique; and
assigning the cleansed referenced data asset a unique enterprise identifier and storing the cleansed referenced data asset at the asset repository hub.

23. A system, comprising:
one or more processors;
one or more computer-readable storage mediums containing software instructions executable on the one or more processors to cause the one or more processors to perform operations including:
receiving, using one or more computer processors, a registration request at an asset repository hub including a master reference schema, the registration request including a reference to a data asset within the enterprise computing environment;
using the received registration request to obtain the referenced data asset;
cleansing the referenced data asset against a set of business rules;
applying one or more matching algorithms to the cleansed referenced data asset to generate one or more match codes, wherein each matching algorithm includes a set of parameters, and wherein each generated match code includes string data configured for imprecise matching within the master reference schema, such that a plurality of match codes can be matched to a single data asset; and
clustering the generated one or more match codes against clusters of previously generated match codes to determine whether the cleansed referenced data asset is unique to the enterprise computing environment;
determining that the cleansed referenced data asset is unique; and
assigning the cleansed referenced data asset a unique enterprise identifier and storing the cleansed referenced data asset at the asset repository hub.
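
The claims recite matching algorithms that each carry a set of parameters and that produce string match codes suited to imprecise matching, so that several match codes can resolve to one data asset. One way to picture this, purely as an assumption-laden sketch, is shown below. `MatchParams`, `generate_match_code`, `generate_match_codes` and `find_cluster` are hypothetical names, and the prefix-and-vowel-dropping scheme is a stand-in chosen for illustration, not the algorithm the patent discloses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchParams:
    """Hypothetical parameter set for one matching algorithm."""
    prefix_len: int    # number of characters kept from each token
    drop_vowels: bool  # strip vowels after the first character of each token


def generate_match_code(cleansed: str, params: MatchParams) -> str:
    """Build one string match code from an already cleansed asset description."""
    parts = []
    for token in sorted(cleansed.split()):
        if params.drop_vowels:
            token = token[0] + "".join(c for c in token[1:] if c not in "aeiou")
        parts.append(token[: params.prefix_len].upper())
    return "~".join(parts)


def generate_match_codes(cleansed: str, param_sets) -> set:
    """Apply every parameter set, yielding a plurality of match codes."""
    return {generate_match_code(cleansed, p) for p in param_sets}


def find_cluster(codes: set, clusters: dict):
    """Return the id of the first cluster sharing any match code, else None."""
    for cluster_id, known_codes in clusters.items():
        if codes & known_codes:
            return cluster_id
    return None


param_sets = [MatchParams(prefix_len=4, drop_vowels=False),
              MatchParams(prefix_len=3, drop_vowels=True)]

# "E-1" is a made-up enterprise identifier for a previously registered asset.
clusters = {"E-1": generate_match_codes("customer master", param_sets)}

misspelled = find_cluster(generate_match_codes("costumer master", param_sets), clusters)
unrelated = find_cluster(generate_match_codes("vendor invoice", param_sets), clusters)
```

Here the misspelled "costumer master" misses on the stricter prefix-only code but still lands in cluster "E-1" through the looser vowel-dropping code, while "vendor invoice" shares no code with any cluster and would be treated as unique.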
Specification