Data lineage system
Abstract
A data lineage system is provided that traces a data lineage of a data warehouse. The data lineage system maps a target data element to one or more source data elements. The data lineage system further stores one or more source surrogate keys within one or more auxiliary columns of a target data record. The data lineage system further stores, for each source data element, a data lineage mapping system record within a data lineage mapping system table that represents the mapping of the target data element and the corresponding source data element. The data lineage system further maps a source data element to one or more target data elements. The system further stores, for each target data element, a shadow system record within a shadow system table that represents the mapping of the source data element and the corresponding target data element.
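The element-level mapping described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `LineageMapping` row layout, the helper functions, the `sales_fact`, `orders`, and `refunds` tables, and the naming scheme for the auxiliary column are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageMapping:
    """One row of a hypothetical data lineage mapping system table."""
    target_table: str
    target_column: str
    source_table: str
    source_column: str
    aux_column: str  # assumed name of the auxiliary column holding the source surrogate key


def map_element(mapping_table, target, sources):
    """Append one mapping row per source element that feeds the target element."""
    target_table, target_column = target
    for i, (source_table, source_column) in enumerate(sources, start=1):
        mapping_table.append(LineageMapping(
            target_table, target_column, source_table, source_column,
            aux_column=f"{target_column}_src_key_{i}",  # illustrative naming only
        ))


def sources_of(mapping_table, target_table, target_column):
    """Target-to-source direction: which source columns feed this target column?"""
    return [(m.source_table, m.source_column) for m in mapping_table
            if (m.target_table, m.target_column) == (target_table, target_column)]


def targets_of(mapping_table, source_table, source_column):
    """Source-to-target direction: which target columns does this source column feed?"""
    return [(m.target_table, m.target_column) for m in mapping_table
            if (m.source_table, m.source_column) == (source_table, source_column)]


# Hypothetical example: one target column derived from two source columns.
mapping_table = []
map_element(mapping_table, ("sales_fact", "revenue"),
            [("orders", "amount"), ("refunds", "amount")])
```

In this sketch a single list of rows serves both lookup directions; a real system would presumably keep these rows in a database table.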
26 Claims
1. A non-transitory computer-readable medium having instructions stored thereon that, when executed by a processor, cause the processor to trace a data lineage of a data warehouse comprising one or more data tables, wherein each data table comprises one or more data records, the tracing comprising:
mapping a target data element to one or more source data elements, wherein the target data element comprises a column of a target table definition, and wherein each source data element comprises a column of a source table definition;
extending the target table definition to comprise one or more auxiliary columns to store one or more source surrogate keys;
storing, for each source data element, a data lineage mapping system record within a data lineage mapping system table that represents the mapping of the target data element and a corresponding source data element together with a column identity of an auxiliary column in a target table that stores a source surrogate key;
storing one or more source surrogate key values within one or more auxiliary columns of a target data record at the time the target data record is created or modified;
storing, for each target data record, one or more shadow system records within a shadow system table that represents a mapping of a source data record and a corresponding target data record source surrogate key value;
wherein the data lineage comprises the one or more data lineage mapping system records, the one or more shadow system records, and the one or more source surrogate keys;
storing a filter column identity within a first auxiliary column of the target data record, wherein the filter column identity identifies a filter column of the source data record; and
storing a filter value within a second auxiliary column of the target data record, wherein the filter value is a filter value of the source data record.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-implemented method for tracing a data lineage of a data warehouse, the computer-implemented method comprising:
mapping a target data element to one or more source data elements, wherein the target data element comprises a column of a target table definition, and wherein each source data element comprises a column of a source table definition;
extending the target table definition to comprise one or more auxiliary columns to store one or more source surrogate keys;
storing, for each source data element, a data lineage mapping system record within a data lineage mapping system table that represents the mapping of the target data element and a corresponding source data element together with a column identity of an auxiliary column in a target table that stores a source surrogate key;
storing one or more source surrogate key values within one or more auxiliary columns of a target data record at the time the target data record is created or modified;
storing, for each target data record, one or more shadow system records within a shadow system table that represents a mapping of a source data record and a corresponding target data record source surrogate key value;
wherein the data lineage comprises the one or more data lineage mapping system records, the one or more shadow system records, and the one or more source surrogate keys;
storing a filter column identity within a first auxiliary column of the target data record, wherein the filter column identity identifies a filter column of the source data record; and
storing a filter value within a second auxiliary column of the target data record, wherein the filter value is a filter value of the source data record.
Dependent claims: 13, 14, 15, 16, 17
18. A system for tracing a data lineage of a data warehouse, the system comprising:
a processor;
a memory configured to store one or more instructions;
a data lineage mapping module configured to map a target data element to one or more source data elements, wherein the target data element comprises a column of a target table definition, and wherein each source data element comprises a column of a source table definition;
an extension module configured to extend the target table definition to comprise one or more auxiliary columns to store one or more source surrogate keys; and
a data lineage storage module configured to store, for each source data element, a data lineage mapping system record within a data lineage mapping system table that represents the mapping of the target data element and a corresponding source data element together with a column identity of an auxiliary column in a target table that stores a source surrogate key;
wherein the data lineage storage module is further configured to store one or more source surrogate key values within one or more auxiliary columns of a target data record at the time the target data record is created or modified;
wherein the data lineage storage module is further configured to store, for each target data record, one or more shadow system records within a shadow system table that represents a mapping of a source data record and a corresponding target data record source surrogate key value;
wherein the data lineage comprises the one or more data lineage mapping system records, the one or more shadow system records, and the one or more source surrogate keys;
wherein the data lineage storage module is further configured to store a filter column identity within a first auxiliary column of the target data record, wherein the filter column identity identifies a filter column of the source data record; and
wherein the data lineage storage module is further configured to store a filter value within a second auxiliary column of the target data record, wherein the filter value is a filter value of the source data record.
Dependent claims: 19, 20, 21, 22, 23
24. A non-transitory computer-readable medium having instructions stored thereon that, when executed by a processor, cause the processor to trace a data lineage of a data warehouse comprising one or more data tables, wherein each data table comprises one or more data records, the tracing comprising:
mapping a target data element to one or more source data elements, wherein the target data element comprises a column of a target data table definition, and wherein each source data element comprises a column of a source table definition;
storing, for each source data element, a data lineage mapping system record within a data lineage mapping system table that represents the mapping of the target data element and a corresponding source data element;
storing, for each target data record, one or more shadow system records within a shadow system table that represents a mapping of a source data record and a corresponding target data record;
wherein the data lineage comprises the one or more data lineage mapping system records, and the one or more shadow system records;
storing a filter column identity within a first auxiliary column of the target data record, wherein the filter column identity identifies a filter column of the source data record; and
storing a filter value within a second auxiliary column of the target data record, wherein the filter value is a filter value of the source data record.
25. A computer-implemented method for tracing a data lineage of a data warehouse, the computer-implemented method comprising:
mapping a target data element to one or more source data elements, wherein the target data element comprises a column of a target data table definition, and wherein each source data element comprises a column of a source table definition;
storing, for each source data element, a data lineage mapping system record within a data lineage mapping system table that represents the mapping of the target data element and a corresponding source data element;
storing, for each target data record, one or more shadow system records within a shadow system table that represents a mapping of a source data record and a corresponding target data record;
wherein the data lineage comprises the one or more data lineage mapping system records, and the one or more shadow system records;
storing a filter column identity within a first auxiliary column of the target data record, wherein the filter column identity identifies a filter column of the source data record; and
storing a filter value within a second auxiliary column of the target data record, wherein the filter value is a filter value of the source data record.
26. A system for tracing a data lineage of a data warehouse, the system comprising:
a processor;
a memory configured to store one or more instructions;
a data lineage mapping module configured to map a target data element to one or more source data elements, wherein the target data element comprises a column of a target data table definition, and wherein each source data element comprises a column of a source table definition; and
a data lineage storage module configured to store, for each target data record, one or more shadow system records within a shadow system table that represents a mapping of the source data record and a corresponding target data record;
wherein the data lineage storage module is further configured to store, for each target data record, one or more shadow system records within a shadow system table that represents a mapping of a source data record and a corresponding target data record;
wherein the data lineage comprises the one or more data lineage mapping system records, and the one or more shadow system records;
wherein the data lineage storage module is further configured to store a filter column identity within a first auxiliary column of the target data record, wherein the filter column identity identifies a filter column of the source data record; and
wherein the data lineage storage module is further configured to store a filter value within a second auxiliary column of the target data record, wherein the filter value is a filter value of the source data record.
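The record-level steps recited in the claims (auxiliary columns on the target table, source surrogate key values written when a target record is created, shadow system records, and a stored filter column identity and filter value) might be sketched as follows using Python's standard `sqlite3` module. This is only one possible reading, not the patented implementation: the table names (`orders`, `sales_fact`, `lineage_shadow`), the column names, the `load_region` and `trace` helpers, and the surrogate key values 100 and 101 are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_sk INTEGER PRIMARY KEY, region TEXT, amount REAL);
    CREATE TABLE sales_fact (fact_id INTEGER PRIMARY KEY, revenue REAL);
    -- Extend the target table definition with auxiliary columns (hypothetical names).
    ALTER TABLE sales_fact ADD COLUMN src_key INTEGER;     -- source surrogate key value
    ALTER TABLE sales_fact ADD COLUMN filter_column TEXT;  -- filter column identity
    ALTER TABLE sales_fact ADD COLUMN filter_value TEXT;   -- filter value
    -- Shadow system table: ties each source record to a target record's surrogate key value.
    CREATE TABLE lineage_shadow (src_key INTEGER, source_table TEXT, source_row INTEGER);
""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "EU", 10.0), (2, "EU", 15.0), (3, "US", 7.0)])


def load_region(conn, region, src_key):
    """Create one target record and write its lineage at creation time."""
    rows = conn.execute(
        "SELECT order_sk, amount FROM orders WHERE region = ?", (region,)).fetchall()
    revenue = sum(amount for _, amount in rows)
    conn.execute(
        "INSERT INTO sales_fact (revenue, src_key, filter_column, filter_value) "
        "VALUES (?, ?, ?, ?)",
        (revenue, src_key, "region", region))
    # One shadow record per contributing source record.
    conn.executemany(
        "INSERT INTO lineage_shadow VALUES (?, 'orders', ?)",
        [(src_key, order_sk) for order_sk, _ in rows])


def trace(conn, fact_id):
    """Return the source row identifiers recorded for one target record."""
    cursor = conn.execute(
        "SELECT s.source_row FROM sales_fact f "
        "JOIN lineage_shadow s ON s.src_key = f.src_key "
        "WHERE f.fact_id = ? ORDER BY s.source_row", (fact_id,))
    return [row[0] for row in cursor]


load_region(conn, "EU", src_key=100)
load_region(conn, "US", src_key=101)
```

Calling `trace(conn, 1)` then walks from the first target record through its surrogate key value to the shadow records, while the `filter_column` and `filter_value` columns record which source filter selected those rows.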
Specification