Systems and methods for scheduling data flow execution based on an arbitrary graph describing the desired data flow
Abstract
The data transformation system, in one embodiment, comprises a capability to receive data, a data destination and a capability to store transformed data, and a data transformation pipeline that constructs complex end-to-end data transformation functionality by pipelining data flowing from one or more sources to one or more destinations through various interconnected nodes that transform the data as it flows. Each component in the pipeline possesses predefined data transformation functionality, and the logical connections between components define the data flow pathway in an operational sense.
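As an illustration only, the idea of pipelining data through components that each carry predefined transformation functionality can be sketched with Python generators. The component names (`uppercase_name`, `drop_minors`), the row layout, and the `run_pipeline` helper are hypothetical and do not come from the patent; this is a minimal sketch, not the claimed implementation.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
# A component takes a stream of rows and yields a transformed stream.
Component = Callable[[Iterable[Row]], Iterator[Row]]


def uppercase_name(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical component: upper-case the 'name' field of every row."""
    for row in rows:
        yield {**row, "name": str(row["name"]).upper()}


def drop_minors(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical component: keep only rows with age >= 18."""
    for row in rows:
        if int(row["age"]) >= 18:
            yield row


def run_pipeline(source: Iterable[Row],
                 components: List[Component],
                 destination: List[Row]) -> List[Row]:
    """Chain the components so rows are transformed as they flow through."""
    stream: Iterable[Row] = iter(source)
    for component in components:
        # The order of this list stands in for the logical connections
        # between components that define the data flow pathway.
        stream = component(stream)
    destination.extend(stream)  # rows are pulled through the chain here
    return destination


result = run_pipeline(
    [{"name": "ada", "age": 36}, {"name": "tim", "age": 12}],
    [uppercase_name, drop_minors],
    [],
)
```

Because each component is a generator, rows are transformed lazily, one at a time, only when the destination consumes them.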
The data transformation pipeline enables a user to develop complex end-to-end data transformation functionality by graphically describing and representing, via a GUI, a desired data flow from one or more sources to one or more destinations through various interconnected nodes (a graph). Each node in the graph selected by the user represents predefined data transformation functionality, and connections between nodes define the data flow pathway.
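The flow from a user-drawn graph to scheduled execution can be sketched as follows. This is a hedged illustration under several assumptions that the text does not state: the graph is acyclic, the interpreter orders nodes with a standard topological traversal (Kahn's algorithm), each work item covers exactly one node, and the scheduler runs items sequentially. All names (`COMPONENT_OBJECTS`, `WorkItem`, `interpret`, `schedule`) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical registry standing in for the "component objects":
# each entry maps a component kind to its predefined transformation.
COMPONENT_OBJECTS: Dict[str, Callable[[List[list]], list]] = {
    "source": lambda inputs: [3, 1, 2],
    "sort": lambda inputs: sorted(inputs[0]),
    "double": lambda inputs: [x * 2 for x in inputs[0]],
    "destination": lambda inputs: list(inputs[0]),
}


@dataclass
class WorkItem:
    node: str            # node id in the graph
    kind: str            # which component object the node corresponds to
    upstream: List[str]  # nodes whose output feeds this node


def interpret(nodes: Dict[str, str],
              edges: List[Tuple[str, str]]) -> List[WorkItem]:
    """Traverse the graph and emit a work list in dependency order."""
    upstream: Dict[str, List[str]] = {n: [] for n in nodes}
    downstream: Dict[str, List[str]] = {n: [] for n in nodes}
    for src, dst in edges:
        upstream[dst].append(src)
        downstream[src].append(dst)

    pending = {n: len(upstream[n]) for n in nodes}
    ready = deque(n for n in nodes if pending[n] == 0)
    work_list: List[WorkItem] = []
    while ready:
        node = ready.popleft()
        work_list.append(WorkItem(node, nodes[node], upstream[node]))
        for nxt in downstream[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(work_list) != len(nodes):
        # Assumption of this sketch: cyclic graphs are rejected.
        raise ValueError("graph contains a cycle")
    return work_list


def schedule(work_list: List[WorkItem]) -> Dict[str, list]:
    """Execute each work item, feeding it the outputs of its upstream nodes."""
    outputs: Dict[str, list] = {}
    for item in work_list:
        component = COMPONENT_OBJECTS[item.kind]
        outputs[item.node] = component([outputs[u] for u in item.upstream])
    return outputs


nodes = {"src": "source", "srt": "sort", "dbl": "double", "dst": "destination"}
edges = [("src", "srt"), ("srt", "dbl"), ("dbl", "dst")]
work_list = interpret(nodes, edges)
outputs = schedule(work_list)
```

Here the node dictionary and edge list play the role of the diagram drawn in the GUI, and looking a component up in the registry is a simplified stand-in for instantiating components from the component objects.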
24 Claims
1. A computer system configured to effectuate data transformation service comprising:
  a computer readable storage medium coupled to a processor, wherein the computer readable storage medium includes instructions for:
    a data retrieval system to receive data from a source;
    a data transformation pipeline comprising:
      a plurality of component objects;
      a graphical user interface by which a user can diagrammatically represent a data transformation as a series of interconnected nodes in a graph, each node depicted as a graphical representation and representing predefined data transformation functionality and corresponding to a component object from among the plurality of components objects, each node interconnected to another node by way of a graphical representation of an edge wherein the edge represent the data flow between nodes;
      an interpreter that traverses the series of interconnected nodes in the graph and translates the graph into a data flow execution plan, said data flow execution plan for obtaining the data, transforming the data, and releasing the data, and at least one work list, said list comprising at least one work item;
      a pipeline engine to build the data flow execution based on the data flow execution plan, said data flow execution comprising a set of components instantiated from the plurality of component objects; and
      a scheduler that executes at least one work item in at least one work list;
    a destination data storage system to store data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A computer readable storage medium where the medium excludes signals, for effectuating a data transformation pipeline comprising instructions for:
  a plurality of component objects;
  a graphical user interface by which a user can diagrammatically represent a data transformation as a series of interconnected nodes in a graph, each node depicted as a graphical representation and representing predefined data transformation functionality and corresponding to a component object from among the plurality of components objects, each node interconnected to another node by way of a graphical representation of an edge wherein the edge represent the data flow between nodes;
  an interpreter that traverses the graph and translates the series of interconnected nodes in the graph into a data flow execution plan and at least one work list, said data flow execution plan for obtaining the data, transforming the data, and releasing the data, said list comprising at least one work item;
  a pipeline engine to build the data flow execution based on the data flow execution plan, said data flow execution comprising a set of components instantiated from the plurality of component objects; and
  a scheduler that executes at least one work item in at least one work list.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
Specification