Auto-generation of code for performing a transform in an extract, transform, and load process
Abstract
A mapping is received and stored that maps elements of a data warehouse to types of a type system implemented by a data source. Program code is generated that performs a transform of data retrieved from a data source based on the mapping. Generation of the program code may include generating program code for performing a dimension transform based on the mapping, generating program code for performing a fact transform based on the mapping, and generating program code for performing an outrigger transform based on the mapping. The generated program code may then be executed to transform the data retrieved from the data source prior to loading into the data warehouse.
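The abstract's central idea, generating transform code from a stored mapping, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `MappingEntry` fields, the `staging_` table prefix, and the use of SQL text as the generated program code are not taken from the patent, which does not specify the form of the generated code in this excerpt.

```python
from dataclasses import dataclass

# Hypothetical element kinds, named after the three transforms in the abstract.
KINDS = {"dimension", "relationship_fact", "outrigger"}


@dataclass(frozen=True)
class MappingEntry:
    """One row of a hypothetical data model mapping."""
    source_type: str   # type name in the data source's type system
    target_table: str  # destination element in the data warehouse
    kind: str          # one of KINDS
    columns: tuple     # attributes carried from source to destination


def generate_transform(entry: MappingEntry) -> str:
    """Return generated program code (here, SQL text) for one mapping entry."""
    if entry.kind not in KINDS:
        raise ValueError(f"unknown element kind: {entry.kind}")
    cols = ", ".join(entry.columns)
    return (
        f"-- {entry.kind} transform for {entry.source_type}\n"
        f"INSERT INTO {entry.target_table} ({cols})\n"
        f"SELECT {cols} FROM staging_{entry.source_type};"
    )


def generate_all(mapping):
    """Generate one transform per mapping entry, keyed by source type."""
    return {entry.source_type: generate_transform(entry) for entry in mapping}
```

In this sketch the mapping is the only input, so adding a new source type means adding a mapping entry rather than hand-writing another transform.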
14 Claims
1. A computer-implemented method for performing a transform in an extract, transform, and load process, the computer-implemented method comprising performing computer-implemented operations for:
- storing, by a computing device, a data model mapping that maps data types within a data source type system implemented by one or more data sources to elements within a data warehouse, the elements including dimension data types, relationship fact data types and outrigger data types;
- generating, by the computing device, program code that performs a transform of data retrieved from the one or more data sources based on the data model mapping, the transform including a dimension data type transform, a relationship fact data type transform and an outrigger data type transform, where the program code initializes a plurality of watermarks, the watermarks indicating a date and a time at which a previous dimension transform ended; and
- determining where a current dimension transform is to begin based on the plurality of watermarks,
where
  - dimension data types include information regarding specific descriptive aspects of an organization,
  - relationship fact data types include information regarding multiple associations of data between a plurality of dimension data types and tracks changes of the associations, and
  - outrigger data types include information regarding commonly associated data between at least two dimension data types,
wherein the dimension data type transform further comprises
  - updating, by the computing device, a destination dimension table attribute for a change in an existing element,
  - inserting, by the computing device, new data into the destination dimension table for a new data object, and
  - updating, by the computing device, the destination dimension table by marking an element as deleted for a data object that has been deleted from a data source,
wherein the relationship fact data type transform further comprises
  - determining, by the computing device, a relationship in a destination relationship table between data objects that need to be deleted,
  - setting, by the computing device, a deleted time of a relationship to be deleted in the destination relationship table between objects to a time of creation of a new relationship between objects, and
  - updating, by the computing device, the destination relationship table by marking the relationship to be deleted based on the time of creation of the new relationship, and
wherein the outrigger data type transform further includes
  - inserting, by the computing device, new enumerations in a destination outrigger table, and
  - updating, by the computing device, existing enumerations in the destination outrigger table.

Dependent claims: 2, 3, 4, 5, 6, 7
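The watermark and dimension transform steps recited in claim 1 can be sketched as follows. This is an illustrative reading, not the patented implementation: the in-memory dictionaries standing in for tables, the one-watermark-per-source-type layout, and the field names `id`, `name`, `modified`, and `is_deleted` are all assumptions.

```python
from datetime import datetime


def dimension_transform(source_type, source_rows, deleted_ids,
                        dim_table, watermarks, now):
    """Apply one incremental dimension transform run.

    source_rows: list of dicts with hypothetical keys id, name, modified
    deleted_ids: ids of objects deleted from the data source
    dim_table:   dict id -> {"name": ..., "is_deleted": ...} (destination)
    watermarks:  dict source_type -> datetime the previous run ended
    """
    # The watermark decides where the current transform begins.
    start = watermarks.get(source_type, datetime.min)

    for row in source_rows:
        if row["modified"] <= start:
            continue  # already processed by a previous run
        if row["id"] in dim_table:
            # Change in an existing element: update the attribute.
            dim_table[row["id"]]["name"] = row["name"]
        else:
            # New data object: insert a new row.
            dim_table[row["id"]] = {"name": row["name"], "is_deleted": False}

    for obj_id in deleted_ids:
        if obj_id in dim_table:
            # Deleted objects are marked, not physically removed.
            dim_table[obj_id]["is_deleted"] = True

    # Record when this run ended so the next run can resume from here.
    watermarks[source_type] = now
    return dim_table
```

Marking rows as deleted instead of removing them keeps historical fact rows that reference the dimension resolvable, which is one plausible motivation for the soft delete in the claim.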
8. A computer-readable storage medium that is not a signal having computer-readable instructions stored thereupon which, when executed by a computer, cause the computer to:
- store a data model mapping that maps data types within a data source type system implemented by a data source to elements within a data warehouse;
- generate program code that performs a transform of data retrieved from the data source based on the data model mapping, the transform including, a dimension data type transform, a relationship fact data type transform and an outrigger data type transform, where the program code initializes a plurality of watermarks, the watermarks indicating a date and a time at which a previous dimension transform ended; and
- determine where a current dimension transform is to begin based on the plurality of watermarks,
where elements within the data warehouse include
  - dimension data types include information regarding specific descriptive aspects of an organization,
  - relationship fact data types include information regarding multiple associations of data between a plurality of dimension data types and tracks changes of the associations, and
  - outrigger data types include information regarding commonly associated data between at least two dimension data types,
wherein the dimension data type transform further comprises
  - updating a destination dimension table attribute for a change in an existing element,
  - inserting new data into the destination dimension table for a new data object, and
  - updating the destination dimension table by marking an element as deleted for a data object that has been deleted from a data source,
wherein the relationship fact data type transform further comprises
  - determining a relationship in a destination relationship table between data objects that needs to be deleted,
  - setting a deleted time of a relationship to be deleted in the destination relationship table between objects to a time of creation of a new relationship between objects, and
  - updating the destination relationship table by marking the relationship to be deleted based on the time of creation of the new relationship, and
wherein the outrigger data type transform further includes
  - inserting new enumerations in a destination outrigger table, and
  - updating existing enumerations in the destination outrigger table.

Dependent claims: 9, 10, 11, 12, 13
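The relationship fact transform recited in the independent claims can be sketched in the same style. The rule used here to decide which relationship "needs to be deleted" (an open relationship from the same source object to a different target) is an assumption chosen for illustration, as are the field names `source`, `target`, `created`, and `deleted`; the claim text itself does not state the selection rule.

```python
from datetime import datetime


def relationship_fact_transform(rel_table, new_rel):
    """Close superseded relationships and record the new one.

    rel_table: list of dicts with hypothetical keys
               source, target, created, deleted (None while still open)
    new_rel:   dict with keys source, target, created
    """
    for rel in rel_table:
        # Assumed rule: an open relationship from the same source object
        # to a different target is superseded by the new relationship.
        superseded = (
            rel["deleted"] is None
            and rel["source"] == new_rel["source"]
            and rel["target"] != new_rel["target"]
        )
        if superseded:
            # The deleted time is the creation time of the new relationship,
            # so the two intervals meet without a gap or an overlap.
            rel["deleted"] = new_rel["created"]

    rel_table.append({**new_rel, "deleted": None})
    return rel_table
```

Because the old row is closed rather than overwritten, the table retains the history of associations, which matches the claim's statement that relationship facts track changes of the associations.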
14. A computer system for performing a transform in an extract, transform, and load process, the computer system comprising:
- a central processing unit; and
- a memory storing program code executable on the central processing unit which, when executed, causes the central processing unit
  - to store a data model mapping in the memory, the mapping comprising data mapping types within a data source type system implemented by a data source to elements within a data warehouse,
  - to generate program code based on the data model mapping that performs a dimension data type transform, a relationship fact data type transform, and an outrigger data type transform of data retrieved from the data source, where the program code initializes a plurality of watermarks, the watermarks indicating a date and a time at which a previous dimension transform ended,
  - to determine where a current dimension transform is to begin based on the plurality of watermarks, and
  - to execute the generated program code on the central processing unit to transform the data retrieved from the data source prior to loading into the data warehouse,
where the elements within the data warehouse include
  - dimension data types include specific information regarding descriptive aspects of an organization,
  - relationship fact data types include information regarding multiple associations of data between a plurality of dimension data types and tracks changes of the associations, and
  - outrigger data types include information regarding commonly associated data between at least two dimension data types,
wherein the dimension data type transform further comprises
  - updating, by the computing device, a destination dimension table attribute for a change in an existing element,
  - inserting, by the computing device, new data into the destination dimension table for a new data object, and
  - updating, by the computing device, the destination dimension table by marking an element as deleted for a data object that has been deleted from a data source,
wherein the relationship fact data type transform further comprises
  - determining, by the computing device, a relationship in a destination relationship table between data objects that need to be deleted,
  - setting, by the computing device, a deleted time of a relationship to be deleted in the destination relationship table between objects to a time of creation of a new relationship between objects, and
  - updating, by the computing device, the destination relationship table by marking the relationship to be deleted based on the time of creation of the new relationship, and
wherein the outrigger data type transform further includes;
  - inserting, by the computing device, new enumerations in a destination outrigger table, and
  - updating, by the computing device, existing enumerations in the destination outrigger table.
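The outrigger transform is the simplest of the three and amounts to an upsert of enumeration values. The sketch below assumes, purely for illustration, that an outrigger table can be modeled as a dictionary from an enumeration id to its display value; the function and parameter names are hypothetical.

```python
def outrigger_transform(outrigger_table, source_enums):
    """Insert new enumerations and update changed ones.

    outrigger_table: dict enum_id -> display value (destination)
    source_enums:    dict enum_id -> display value (from the data source)
    Returns the ids that were inserted and the ids that were updated.
    """
    inserted, updated = [], []
    for enum_id, value in source_enums.items():
        if enum_id not in outrigger_table:
            outrigger_table[enum_id] = value   # new enumeration
            inserted.append(enum_id)
        elif outrigger_table[enum_id] != value:
            outrigger_table[enum_id] = value   # existing enumeration changed
            updated.append(enum_id)
    return inserted, updated
```

Enumerations missing from the source are left untouched here, since the claims recite only inserting and updating for outriggers.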
Specification