System and method for inferencing of data transformations through pattern decomposition
First Claim
1. A method for use with a data integration or other computing environment comprising:
- providing, at a computer including a processor, a design-time system for creating software applications that perform data processing, wherein the design-time system includes:
a software development component having a graphical user interface for creation of data flows associated with the software applications, including specification of input hubs and output hubs comprising datasets that are data structures having attributes and associated with one or more of the hubs; and
a system hub that stores metadata associated with processing the data flows associated with the software applications, including functional and business data types;
wherein the software applications are deployed to a run-time system that executes the software applications, and that receives input from the design-time system;
accessing a data flow for each of one or more software applications that:
receive input data from one or more input hub sources of data, and publish output data to one or more output hub destinations, according to the data flow associated with the one or more software applications;
processing the data flow for a first software application of the one or more software applications to generate one or more functional expressions representing the data flow for the first software application, wherein the one or more functional expressions are generated based on a determination of one or more semantic actions or rules identified in the data flow, including wherein:
an application represents a top level data flow transformation; and
an action represents an operator on one or more datasets;
identifying a pattern of transformation in the data flow for the first software application, as determined by the one or more functional expressions that are generated as representing the data flow for the first software application; and
subsequent to identifying the pattern of transformation identified in the data flow for the first software application, providing, as an output, a recommendation of one or more data transformations for incorporation within at least one of a modified data flow for the first software application or a data flow of a second software application;
wherein the pattern is used in displaying, at a graphical user interface, selected ones of the semantic actions enabled for the accessed data, for selection and use with the accessed data, including automatically providing or updating a list of the selected ones of the semantic actions enabled for the accessed data, during the processing of the accessed data.
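As an informal illustration of the claim language (not the patented implementation), the following sketch models an application as a top level data flow transformation made of actions, where each action is an operator on one or more datasets, and renders the flow as a nested functional expression. The class names `Action` and `Application`, the function `to_functional_expression`, and the operator and dataset names are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """Hypothetical model of an action: an operator on one or more datasets."""
    operator: str            # e.g. "filter", "join" (illustrative names)
    inputs: tuple            # names of input datasets or upstream outputs
    output: str              # name given to the result of this action


@dataclass
class Application:
    """Hypothetical model of an application: a top level data flow transformation."""
    name: str
    actions: list = field(default_factory=list)


def to_functional_expression(app: Application) -> str:
    """Render the data flow as one nested expression.

    Each action's inputs are replaced by the expression that produced them,
    so the final string describes the whole flow in functional form.
    """
    expressions = {}   # output name -> expression that produced it
    last = ""
    for action in app.actions:
        args = ", ".join(expressions.get(name, name) for name in action.inputs)
        last = f"{action.operator}({args})"
        expressions[action.output] = last
    return last


# Usage: a flow that filters orders, joins them with customers, then aggregates.
app = Application(
    name="sales_report",
    actions=[
        Action("filter", ("orders",), "recent_orders"),
        Action("join", ("recent_orders", "customers"), "joined"),
        Action("aggregate", ("joined",), "report"),
    ],
)
expression = to_functional_expression(app)
print(expression)
```

Representing the flow as a single expression is one simple way to make structurally similar flows comparable; the claims do not specify this particular encoding.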
Abstract
In accordance with various embodiments, described herein is a system (Data Artificial Intelligence system, Data AI system) for use with a data integration or other computing environment, that leverages machine learning (ML, DataFlow Machine Learning, DFML) for use in managing a flow of data (dataflow, DF) and building complex dataflow software applications (dataflow applications, pipelines). In accordance with an embodiment, the system can provide a service to recommend actions and transformations on input data, based on patterns identified from the functional decomposition of a data flow for a software application, including determining possible transformations of the data flow in subsequent applications. Data flows can be decomposed into a model describing transformations of data, predicates, and business rules applied to the data, and attributes used in the data flows.
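The recommendation idea in the abstract can be sketched in a few lines, under strong simplifying assumptions: each previously decomposed data flow is reduced to an ordered list of operator names, and a "pattern" is simply how often one operator follows another. This frequency count is a stand-in for illustration, not the machine learning approach the abstract refers to, and the function names `learn_patterns` and `recommend` and the operator names are hypothetical.

```python
from collections import Counter, defaultdict


def learn_patterns(flows):
    """Count, for each operator, which operators followed it in prior flows."""
    follows = defaultdict(Counter)
    for operators in flows:
        for current, nxt in zip(operators, operators[1:]):
            follows[current][nxt] += 1
    return follows


def recommend(follows, partial_flow, k=2):
    """Suggest up to k next operators based on the last operator of a partial flow."""
    if not partial_flow:
        return []
    candidates = follows.get(partial_flow[-1], Counter())
    return [operator for operator, _ in candidates.most_common(k)]


# Usage: patterns learned from earlier flows drive suggestions for a new one.
prior_flows = [
    ["filter", "join", "aggregate"],
    ["filter", "join", "sort"],
    ["filter", "join", "aggregate"],
    ["filter", "project"],
]
patterns = learn_patterns(prior_flows)
suggestions = recommend(patterns, ["filter", "join"])
print(suggestions)
```

In this toy setup the suggestions for a flow ending in `join` are ranked by how often each operator followed `join` in the earlier flows, which mirrors the idea of reusing patterns from one application in a modified or subsequent one.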
17 Claims
1. A method for use with a data integration or other computing environment comprising:
- providing, at a computer including a processor, a design-time system for creating software applications that perform data processing, wherein the design-time system includes:
a software development component having a graphical user interface for creation of data flows associated with the software applications, including specification of input hubs and output hubs comprising datasets that are data structures having attributes and associated with one or more of the hubs; and
a system hub that stores metadata associated with processing the data flows associated with the software applications, including functional and business data types;
wherein the software applications are deployed to a run-time system that executes the software applications, and that receives input from the design-time system;
accessing a data flow for each of one or more software applications that:
receive input data from one or more input hub sources of data, and publish output data to one or more output hub destinations, according to the data flow associated with the one or more software applications;
processing the data flow for a first software application of the one or more software applications to generate one or more functional expressions representing the data flow for the first software application, wherein the one or more functional expressions are generated based on a determination of one or more semantic actions or rules identified in the data flow, including wherein:
an application represents a top level data flow transformation; and
an action represents an operator on one or more datasets;
identifying a pattern of transformation in the data flow for the first software application, as determined by the one or more functional expressions that are generated as representing the data flow for the first software application; and
subsequent to identifying the pattern of transformation identified in the data flow for the first software application, providing, as an output, a recommendation of one or more data transformations for incorporation within at least one of a modified data flow for the first software application or a data flow of a second software application;
wherein the pattern is used in displaying, at a graphical user interface, selected ones of the semantic actions enabled for the accessed data, for selection and use with the accessed data, including automatically providing or updating a list of the selected ones of the semantic actions enabled for the accessed data, during the processing of the accessed data.
Dependent claims: 2, 3, 4, 5, 6
7. A system for providing recommendations of actions and transformations on an input data for use with a data integration or other computing environment, comprising:
- a design-time system for creating software applications that perform data processing, wherein the design-time system includes:
a software development component having a graphical user interface for creation of data flows associated with the software applications, including specification of input hubs and output hubs comprising datasets that are data structures having attributes and associated with one or more of the hubs; and
a system hub that stores metadata associated with processing the data flows associated with the software applications, including functional and business data types;
wherein the software applications are deployed to a run-time system that executes the software applications, and that receives input from the design-time system;
one or more processors operable to:
access a data flow for each of one or more software applications that:
receive input data from one or more input hub sources of data, and publish output data to one or more output hub destinations, according to the data flow associated with the one or more software applications;
process the data flow for a first software application of the one or more software applications to generate one or more functional expressions representing the data flow for the first software application, wherein the one or more functional expressions are generated based on a determination of one or more semantic actions or business rules identified in the data flow, including wherein:
an application represents a top level data flow transformation; and
an action represents an operator on one or more datasets;
identify a pattern of transformation in the data flow for the first software application, as determined by the one or more functional expressions that are generated as representing the data flow for the first software application; and
subsequent to identifying the pattern of transformation identified in the data flow for the first software application, provide, as an output, a recommendation of one or more data transformations for incorporation within at least one of a modified data flow for the first software application or a data flow of a second software application;
wherein the pattern is used in displaying, at a graphical user interface, selected ones of the semantic actions enabled for the accessed data, for selection and use with the accessed data, including automatically providing or updating a list of the selected ones of the semantic actions enabled for the accessed data, during the processing of the accessed data.
Dependent claims: 8, 9, 10, 11, 12
13. A non-transitory computer readable storage medium, including instructions stored thereon which when read and executed by one or more computers cause the one or more computers to perform a method comprising:
- providing a design-time system for creating software applications that perform data processing, wherein the design-time system includes:
a software development component having a graphical user interface for creation of data flows associated with the software applications, including specification of input hubs and output hubs comprising datasets that are data structures having attributes and associated with one or more of the hubs; and
a system hub that stores metadata associated with processing the data flows associated with the software applications, including functional and business data types;
wherein the software applications are deployed to a run-time system that executes the software applications, and that receives input from the design-time system;
accessing a data flow for each of one or more software applications that:
receive input data from one or more input hub sources of data, and publish output data to one or more output hub destinations, according to the data flow associated with the one or more software applications;
processing the data flow for a first software application of the one or more software applications to generate one or more functional expressions representing the data flow for the first software application, wherein the one or more functional expressions are generated based on a determination of one or more semantic actions or rules identified in the data flow, including wherein:
an application represents a top level data flow transformation; and
an action represents an operator on one or more datasets;
identifying a pattern of transformation in the data flow for the first software application, as determined by the one or more functional expressions that are generated as representing the data flow for the first software application; and
subsequent to identifying the pattern of transformation identified in the data flow for the first software application, providing, as an output, a recommendation of one or more data transformations for incorporation within at least one of a modified data flow for the first software application or a data flow of a second software application;
wherein the pattern is used in displaying, at a graphical user interface, selected ones of the semantic actions enabled for the accessed data, for selection and use with the accessed data, including automatically providing or updating a list of the selected ones of the semantic actions enabled for the accessed data, during the processing of the accessed data.
Dependent claims: 14, 15, 16, 17
Specification