System and method for code generation from a directed acyclic graph using knowledge modules
First Claim
1. A method of generating code for data integration based on a logical design, comprising:
generating, using a computer including a computer readable medium and processor, a physical design of a data integration process based on a logical design of the data integration process, wherein the physical design includes a plurality of execution units and the physical design is a physical implementation corresponding to physical devices;
assigning a knowledge module to each of a plurality of components in the plurality of execution units, wherein each knowledge module is specific to a type of component and a deployment's language and technology type, and wherein the knowledge module is a code template and is configured to implement reusable transformation, and each knowledge module is customizable by a user;
generating code for each of the plurality of execution units in the physical design, based on the assigned knowledge modules, wherein the code that is generated is derived from declarative rules and metadata defined for each of the knowledge modules; and
executing the generated code in an order based on the physical design.
Abstract
In various embodiments, a data integration system is disclosed which enables users to create a logical design which is platform and technology independent. The user can create a logical design that defines, at a high level, how a user wants data to flow between sources and targets. The tool can analyze the logical design, in view of the user's infrastructure, and create a physical design. The logical design can include a plurality of components corresponding to each source and target in the design, as well as operations such as joins or filters, and access points. Each component, when transferred to the physical design, generates code to perform operations on the data. Depending on the underlying technology (e.g., SQL Server, Oracle, Hadoop, etc.) and the language used (SQL, Pig, etc.), the code generated by each component may be different.
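The flow described in the abstract and first claim can be illustrated with a minimal sketch. It is not the patented implementation: the `Component` class, the `KNOWLEDGE_MODULES` table, the template strings, and the unit names are all hypothetical, chosen only to show a code template being selected by component type and technology, filled in from metadata, and emitted in dependency order.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # Python 3.9+
from string import Template

# Hypothetical knowledge modules: code templates keyed by
# (component type, technology). A user could edit or replace any entry.
KNOWLEDGE_MODULES = {
    ("filter", "sql"): Template("SELECT * FROM $source WHERE $condition"),
    ("filter", "pig"): Template("$name = FILTER $source BY $condition;"),
}


@dataclass
class Component:
    name: str
    kind: str        # type of component, e.g. "filter"
    technology: str  # language/technology of the deployment, e.g. "sql"
    metadata: dict   # values substituted into the template


def generate(component):
    """Pick the knowledge module for this component and fill in its template."""
    km = KNOWLEDGE_MODULES[(component.kind, component.technology)]
    return km.substitute(name=component.name, **component.metadata)


def generate_plan(units, depends_on):
    """Return (execution unit, generated code) pairs in dependency order."""
    order = TopologicalSorter(depends_on).static_order()
    return [(unit, [generate(c) for c in units[unit]]) for unit in order]


# Assumed physical design: two execution units, "load" runs after "extract".
units = {
    "extract": [Component("recent", "filter", "pig",
                          {"source": "orders", "condition": "year == 2024"})],
    "load": [Component("recent", "filter", "sql",
                       {"source": "staged_orders", "condition": "year = 2024"})],
}
depends_on = {"extract": set(), "load": {"extract"}}

plan = generate_plan(units, depends_on)
for unit, statements in plan:
    print(unit, statements)
```

The same logical filter yields different code for the Pig unit and the SQL unit because only the template lookup changes. The sketch stops at producing the ordered plan and does not execute the generated statements.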
48 Citations
25 Claims
1. A method of generating code for data integration based on a logical design, comprising:
generating, using a computer including a computer readable medium and processor, a physical design of a data integration process based on a logical design of the data integration process, wherein the physical design includes a plurality of execution units and the physical design is a physical implementation corresponding to physical devices;
assigning a knowledge module to each of a plurality of components in the plurality of execution units, wherein each knowledge module is specific to a type of component and a deployment's language and technology type, and wherein the knowledge module is a code template and is configured to implement reusable transformation, and each knowledge module is customizable by a user;
generating code for each of the plurality of execution units in the physical design, based on the assigned knowledge modules, wherein the code that is generated is derived from declarative rules and metadata defined for each of the knowledge modules; and
executing the generated code in an order based on the physical design.
Dependent claims: 2, 3, 4, 5, 6, 7, 21, 22, 23, 24, 25
8. A non-transitory computer readable storage medium including instructions stored thereon which, when executed by a processor cause the processor to perform the steps of:
generating, using a computer including a computer readable medium and processor, a physical design of a data integration process based on a logical design of the data integration process, wherein the physical design includes a plurality of execution units and the physical design is a physical implementation corresponding to physical devices;
assigning a knowledge module to each of a plurality of components in the plurality of execution units, wherein each knowledge module is specific to a type of component and a deployment's language and technology type, and wherein the knowledge module is a code template and is configured to implement reusable transformation, and each knowledge module is customizable by a user;
generating code for each of the plurality of execution units in the physical design, based on the assigned knowledge modules, wherein the code that is generated is derived from declarative rules and metadata defined for each of the knowledge modules; and
executing the generated code in an order based on the physical design.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system of generating code for data integration based on a logical design, comprising:
a data integration system comprising one or more computing devices each including a computer readable medium and processor, wherein the data integration system is configured to generating a physical design of a data integration process based on a logical design of the data integration process, wherein the physical design includes a plurality of execution units and the physical design is a physical implementation corresponding to physical devices;
assigning a knowledge module to each of a plurality of components in the plurality of execution units, wherein each knowledge module is specific to a type of component and a deployment's language and technology type, and wherein the knowledge module is a code template and is configured to implement reusable transformation, and each knowledge module is customizable by a user;
generating code for each of the plurality of components in the physical design, based on the assigned knowledge modules, wherein the code that is generated is derived from declarative rules and metadata defined for each of the knowledge modules; and
executing the generated code in an order based on the physical design.
Dependent claims: 16, 17, 18, 19, 20
Specification