Method and process to optimize correlation of replicated with extracted data from disparated data sources
Abstract
A method, a data structure, a computer program product and a computer-readable medium for correlating at least a first plurality of data records and a second plurality of data records, each data record of the first plurality of data records being uniquely identified within a corresponding data source by an associated internal identifier and each data record of the first and second plurality of data records comprising at least one external identifier. According to one embodiment, the method comprises determining a data record of the first plurality of data records and at least one data record of the second plurality of data records having an identical external identifier; and mapping the at least one determined data record of the second plurality of data records to the internal identifier associated with the determined data record of the first plurality of data records.
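The correlation described in the abstract can be illustrated with a minimal Python sketch. The class names, field names, and the `correlate` function are hypothetical and chosen only for illustration; the source does not prescribe any particular representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstRecord:
    internal_id: int   # assumed: unique within the first data source
    external_id: str   # identifier shared across data sources


@dataclass(frozen=True)
class SecondRecord:
    external_id: str
    payload: str       # hypothetical field standing in for the record's data


def correlate(first, second):
    """Map second-source records to the internal identifier of the
    first-source record that has an identical external identifier."""
    # Index the first plurality of records by external identifier.
    by_external = {rec.external_id: rec.internal_id for rec in first}
    mapping = {}
    for rec in second:
        internal_id = by_external.get(rec.external_id)
        if internal_id is not None:  # identical external identifier found
            mapping.setdefault(internal_id, []).append(rec)
    return mapping
```

Records of the second plurality whose external identifier has no counterpart in the first plurality are simply left unmapped in this sketch.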
40 Claims
1. A computer-readable medium comprising computer-executable instructions for performing, when run on a computer system, an operation of correlating at least a first plurality of data records and a second plurality of data records, each data record of the first plurality of data records being uniquely identified within a corresponding data source by an associated internal identifier and each data record of the first and second plurality of data records comprising at least one external identifier, the operation comprising:
determining a data record of the first plurality of data records and at least one data record of the second plurality of data records having an identical external identifier; and
mapping the at least one determined data record of the second plurality of data records to the internal identifier associated with the determined data record of the first plurality of data records.
Dependent claims: 2, 3, 4
5. A computer-readable medium comprising computer-executable instructions for performing, when run on a computer system, an operation of creating a data warehouse mapping data structure to correlate at least two different data sources, the operation comprising:
creating a plurality of mapping data records in the warehouse mapping data structure, each mapping data record comprising:
a first value representing an internal identifier uniquely identifying the mapping data record in the warehouse mapping data structure;
a second value representing an external identifier of one of a data record of a first data source and a data record of a second data source; and
a third value representing an internal identifier uniquely identifying a data record of the second data source in the second data source, the data record of the second data source having the second value as external identifier;
whereby a correlation between the first and the second data sources is established.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
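The three-value mapping data record of claim 5 might be represented as follows. This is a sketch under stated assumptions: the names `MappingRecord` and `build_mapping_records` are hypothetical, warehouse identifiers are assumed to be assigned sequentially, and the third value is assumed to be left empty when no second-source record shares the external identifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MappingRecord:
    warehouse_id: int                # first value: identifies the mapping record itself
    external_id: str                 # second value: external identifier
    second_source_id: Optional[int]  # third value: internal identifier in the second source


def build_mapping_records(external_ids, second_source_index):
    """Create one mapping record per external identifier.

    second_source_index is an assumed lookup from external identifier
    to the second data source's internal identifier.
    """
    records = []
    for warehouse_id, ext in enumerate(external_ids, start=1):
        records.append(
            MappingRecord(warehouse_id, ext, second_source_index.get(ext))
        )
    return records
```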
18. A computer-readable medium comprising computer-executable instructions for performing, when run on a computer system, an operation of correlating data from at least two different data sources in a data warehouse, the operation comprising:
loading data from a first data source into the data warehouse, the data from the first data source comprising a plurality of first internal identifiers and a plurality of first external identifiers;
creating a warehouse mapping data structure on the basis of the plurality of internal identifiers each mapped to an associated first external identifier;
loading data from a second data source into the data warehouse, the data from the second data source comprising a plurality of second internal identifiers each associated with a second external identifier, wherein at least one of the associated second external identifiers is identical to one of the first external identifiers; and
mapping each second internal identifier associated with a second external identifier that is identical to one of the first external identifiers in the warehouse mapping data structure to the first internal identifier of the identical matching first external identifier, whereby a correlation between data of the first and the second data sources is established.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27
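The four steps of claim 18 (load, create mapping, load, map) can be sketched in plain Python. The function name, the dictionary layout, and the in-memory `warehouse` stand-in are all assumptions made for illustration, not details taken from the source.

```python
def load_and_correlate(first_source, second_source):
    """Both arguments are iterables of (internal_id, external_id) pairs."""
    # Hypothetical in-memory stand-in for the data warehouse.
    warehouse = {"first": [], "second": []}

    # Step 1: load data from the first data source.
    warehouse["first"] = list(first_source)

    # Step 2: create the mapping structure from the first internal
    # identifiers, each mapped to its external identifier.
    mapping = {
        ext: {"first_id": fid, "second_ids": []}
        for fid, ext in warehouse["first"]
    }

    # Step 3: load data from the second data source.
    warehouse["second"] = list(second_source)

    # Step 4: map each second internal identifier whose external
    # identifier matches a first external identifier.
    for sid, ext in warehouse["second"]:
        entry = mapping.get(ext)
        if entry is not None:
            entry["second_ids"].append(sid)
    return mapping
```

Keying the structure by external identifier keeps step 4 to a single lookup per second-source record; this is a design choice of the sketch only.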
28. A method of correlating at least a first plurality of data records and a second plurality of data records, each data record of the first plurality of data records being uniquely identified within a corresponding data source by an associated internal identifier and each data record of the first and second plurality of data records comprising at least one external identifier, the method comprising:
determining a data record of the first plurality of data records and at least one data record of the second plurality of data records having an identical external identifier; and
mapping the at least one determined data record of the second plurality of data records to the internal identifier associated with the determined data record of the first plurality of data records.
29. A method of creating a data warehouse mapping data structure to correlate at least two different data sources, the method comprising:
creating a plurality of mapping data records in the warehouse mapping data structure, each mapping data record comprising:
a first value representing an internal identifier uniquely identifying the mapping data record in the warehouse mapping data structure;
a second value representing an external identifier of one of a data record of a first data source and a data record of a second data source; and
a third value representing an internal identifier uniquely identifying a data record of the second data source in the second data source, the data record of the second data source having the second value as external identifier;
whereby a correlation between the first and the second data sources is established.
30. A method of correlating data from at least two different data sources in a data warehouse, the method comprising:
loading data from a first data source into the data warehouse, the data from the first data source comprising a plurality of first internal identifiers and a plurality of first external identifiers;
creating a warehouse mapping data structure on the basis of the plurality of internal identifiers each mapped to an associated first external identifier;
loading data from a second data source into the data warehouse, the data from the second data source comprising a plurality of second internal identifiers each associated with a second external identifier, wherein at least one of the associated second external identifiers is identical to one of the first external identifiers; and
mapping each second internal identifier associated with a second external identifier that is identical to one of the first external identifiers in the warehouse mapping data structure to the first internal identifier of the identical matching first external identifier, whereby a correlation between data of the first and the second data sources is established.
31. A mapping data structure residing in storage, the mapping data structure comprising a plurality of data records, each data record comprising:
a first portion comprising a warehouse internal identifier;
a second portion comprising an external identifier common to a first data source and a second data source; and
a third portion comprising an internal identifier of the second data source;
wherein at least one data record of the plurality of data records comprises a warehouse internal identifier representing an internal identifier of the first data source, a common external identifier associated with the internal identifier of the first data source and an internal identifier of the second data source associated with the common external identifier, whereby a correlation between data of the first and the second data sources is established.
Dependent claims: 32
33. A warehouse mapping table residing in storage, comprising:
a plurality of external identifiers common to a first data source and a second data source, a warehouse internal identifier for each of the plurality of external identifiers, and an internal identifier of the second data source for at least a portion of each of the warehouse internal identifiers, whereby data from the first and second data sources is correlated.
Dependent claims: 34, 35
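A warehouse mapping table of this kind could be laid out as a relational table. The sketch below uses Python's standard `sqlite3` module; the table name, column names, and helper functions are hypothetical, and the nullable `second_id` column is an assumed way to express that only a portion of the warehouse internal identifiers have a second-source identifier.

```python
import sqlite3

# Hypothetical schema; none of these names come from the source.
SCHEMA = """
CREATE TABLE warehouse_mapping (
    warehouse_id INTEGER PRIMARY KEY,   -- warehouse internal identifier
    external_id  TEXT NOT NULL UNIQUE,  -- external identifier common to both sources
    second_id    INTEGER                -- second-source internal identifier, NULL if unmatched
);
"""


def build_mapping_table(rows):
    """rows: iterable of (warehouse_id, external_id, second_id or None)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO warehouse_mapping VALUES (?, ?, ?)", rows)
    return conn


def second_id_for(conn, external_id):
    """Return the second-source identifier for an external identifier,
    or None when the identifier is unknown or has no match."""
    row = conn.execute(
        "SELECT second_id FROM warehouse_mapping WHERE external_id = ?",
        (external_id,),
    ).fetchone()
    return None if row is None else row[0]
```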
36. A computer, comprising:
a memory containing at least:
a data warehouse for storing data of a first and a second data source; and
a mapping data structure for correlating the data of the first and second data sources in the data warehouse, the mapping data structure comprising a plurality of data records, each data record in the mapping data structure comprising:
a first portion comprising a warehouse internal identifier;
a second portion comprising an external identifier common to a first data source and a second data source; and
a third portion comprising an internal identifier of the second data source;
wherein at least one data record of the plurality of data records comprises a warehouse internal identifier representing an internal identifier of the first data source, a common external identifier associated with the internal identifier of the first data source and an internal identifier of the second data source associated with the common external identifier; and
a processor adapted to execute contents of the memory.
Dependent claims: 37, 38, 39, 40
Specification