Data lineage data type
First Claim
1. A data structure for access by a computer program, said data structure embodied in a computer readable medium, comprising:
- a table for use in a database, said table comprising a plurality of rows and columns, said table storing information imported into said table from a source external to the database;
- a data lineage data type associated with each of said plurality of rows, said data lineage data type storing data indicative of the external source of said row.
Abstract
A system for tracking the lineage of data in a database. Data within the tables is tracked by attaching lineage information to it, preferably by adding a lineage identifier to each row in a table. Data that shares a common lineage can be identified by its common lineage identifier, which can then be used to trace the source of the data; that is, data having a common identifier shares a common history. Preferably, the lineage data type is a universally unique identifier optimized to have little impact on database performance, for example by being large enough to ensure uniqueness while keeping storage size small. More preferably, the data lineage data type is a sixteen-byte number.
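The sizing trade-off described in the abstract can be illustrated with a short sketch. It assumes the sixteen-byte identifier is represented as a standard random UUID, which the abstract does not state; the function name `new_lineage_id` and the sample rows are hypothetical.

```python
import uuid

LINEAGE_ID_BYTES = 16  # the abstract's preferred size: a sixteen-byte number


def new_lineage_id() -> bytes:
    """Return a fresh 16-byte identifier.

    Assumption: a random UUID stands in for the universally unique
    lineage value; the abstract does not name a generation scheme.
    """
    return uuid.uuid4().bytes


# One identifier per import; every row from that import shares it.
import_a = new_lineage_id()
import_b = new_lineage_id()

rows = [
    {"name": "alice", "lineage": import_a},
    {"name": "bob", "lineage": import_a},
    {"name": "carol", "lineage": import_b},
]

# Rows with a common lineage identifier share a common history.
same_source_as_alice = [r["name"] for r in rows if r["lineage"] == import_a]

# Storage cost per row: raw binary versus the usual hyphenated hex text.
binary_size = len(import_a)
text_size = len(str(uuid.UUID(bytes=import_a)))
```

Storing the raw sixteen bytes rather than the 36-character text form is one way to keep the per-row overhead small while retaining a very low collision probability.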
133 Citations
17 Claims

1. A data structure for access by a computer program, said data structure embodied in a computer readable medium, comprising:
a table for use in a database, said table comprising a plurality of rows and columns, said table storing information imported into said table from a source external to the database;
a data lineage data type associated with each of said plurality of rows, said data lineage data type storing data indicative of the external source of said row.
Dependent claims: 2, 3, 4, 5, 6

7. A method for tagging data in a relational database system, comprising:
providing a table of data organized in rows and columns;
importing data into said table from an external data source; and
providing a lineage transform that attaches an identifier to substantially every row in the table.
Dependent claims: 8, 9, 10, 11, 12

13. A computer-readable medium bearing a data structure accessible by a database management program for providing data lineage to data in a database, comprising:
a table comprising rows and columns, said table storing information imported into said table from a source external to the database;
an identifier bound to each row by said database management program for identifying rows moved into the table from a common external data source.
Dependent claims: 14, 15, 16, 17
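
The independent claims above (a table of imported rows, a lineage transform applied on import, and an identifier that groups rows from a common external source) can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the patented implementation: the `sales` and `lineage_source` tables, the file names, and the use of a random UUID as the identifier are all assumptions made for the example.

```python
import sqlite3
import uuid


def lineage_transform(rows, lineage_id):
    """Attach the same lineage identifier to every row of one import."""
    for name, amount in rows:
        yield name, amount, lineage_id


def import_rows(conn, source_name, rows):
    """Import rows from one external source and record where they came from."""
    lineage_id = uuid.uuid4().bytes  # assumption: a UUID as the 16-byte identifier
    # Hypothetical lookup table mapping each identifier to its external source.
    conn.execute(
        "INSERT INTO lineage_source (lineage_id, source_name) VALUES (?, ?)",
        (lineage_id, source_name),
    )
    conn.executemany(
        "INSERT INTO sales (name, amount, lineage_id) VALUES (?, ?, ?)",
        lineage_transform(rows, lineage_id),
    )
    return lineage_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lineage_source (lineage_id BLOB PRIMARY KEY, source_name TEXT)"
)
conn.execute("CREATE TABLE sales (name TEXT, amount REAL, lineage_id BLOB)")

first = import_rows(conn, "store_a.csv", [("widget", 9.5), ("gadget", 12.0)])
second = import_rows(conn, "store_b.csv", [("widget", 8.0)])

# Trace each row back to its external source via the shared identifier.
traced = conn.execute(
    "SELECT s.name, l.source_name FROM sales AS s "
    "JOIN lineage_source AS l ON s.lineage_id = l.lineage_id "
    "ORDER BY l.source_name, s.name"
).fetchall()

# Count the rows that arrived in the first import.
first_count = conn.execute(
    "SELECT COUNT(*) FROM sales WHERE lineage_id = ?", (first,)
).fetchone()[0]
```

Keeping the source description in a separate lookup table means each data row carries only the fixed-size identifier, which is one way to limit the storage and performance cost of tracking lineage.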
Specification