Computation platform agnostic data classification workflows
First Claim
1. A computer-implemented method, comprising:
defining a classification experiment for executing tasks on a computation platform by at least:
defining an input data space by selecting at least one of data sources interfaced with a classification platform system; and
defining, via a definition user interface of the classification platform system, a workflow configuration of the classification experiment by defining a directed graph (DG) connecting a plurality of transformation blocks to represent an experiment workflow, wherein the DG specifies one or more connections from one or more outputs of each transformation block in the experiment workflow to one or more other transformation blocks;
formatting the workflow configuration and the input data space into a data structure such that the data structure is interpretable by a plurality of different computation platforms to execute the classification experiment, wherein the plurality of different computation platforms have a plurality of different hardware configurations of system components, respectively;
selecting, from the plurality of different computation platforms, a distributed computation platform to execute at least part of the experiment workflow in accordance with the DG;
generating, based on a hardware configuration of system components associated with the distributed computation platform, an execution schedule that specifies which system component executes each part of the classification experiment; and
scheduling the distributed computation platform to execute the classification experiment by the system components under the execution schedule according to the input data space and the workflow configuration, wherein the system components are scheduled to execute their respective parts of the classification experiment as specified by the execution schedule.
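The formatting, selecting, and schedule-generating steps recited above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration and none of it comes from the patent: the `Platform` record, the JSON layout, the "first distributed platform" selection policy, and the round-robin assignment of blocks to system components. The DG is stored here as a mapping from each block to the blocks that feed it, which is the reverse view of the output connections described in the claim.

```python
import json
from dataclasses import dataclass
from graphlib import TopologicalSorter  # Python 3.9+


@dataclass
class Platform:
    """Hypothetical description of a computation platform."""
    name: str
    distributed: bool
    components: list  # hardware configuration, e.g. worker node names


def format_experiment(data_sources, dg):
    """Serialize the input data space and workflow configuration as JSON,
    a format any platform with a JSON parser could interpret."""
    return json.dumps(
        {"input_data_space": sorted(data_sources), "workflow": {"dg": dg}},
        sort_keys=True,
    )


def select_platform(platforms):
    """Pick the first distributed platform (a placeholder policy)."""
    for candidate in platforms:
        if candidate.distributed:
            return candidate
    raise ValueError("no distributed platform available")


def generate_schedule(spec_json, platform):
    """Assign each block to a system component, round-robin,
    visiting blocks in topological order of the DG."""
    if not platform.components:
        raise ValueError("platform has no system components")
    dg = json.loads(spec_json)["workflow"]["dg"]
    order = list(TopologicalSorter(dg).static_order())
    return {
        block: platform.components[i % len(platform.components)]
        for i, block in enumerate(order)
    }


# Toy experiment: block -> list of upstream blocks feeding it.
dg = {"load": [], "clean": ["load"], "train": ["clean"]}
spec = format_experiment({"posts", "clicks"}, dg)

platforms = [
    Platform("laptop", distributed=False, components=["cpu0"]),
    Platform("cluster", distributed=True, components=["node-a", "node-b"]),
]
platform = select_platform(platforms)
schedule = generate_schedule(spec, platform)
print(schedule)
```

A real scheduler would presumably weigh memory, locality, and load rather than rotate through components; round-robin is used only because it keeps the mapping from hardware configuration to schedule easy to follow.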
Abstract
Various embodiments include a classification platform system. A user can define a classification experiment on the classification platform system. For example, the user can define an input data space by selecting at least one of the data sources interfaced with the classification platform system, and can define a workflow configuration that includes a directed graph (DG) connecting a plurality of transformation blocks to represent an experiment workflow. The DG can specify how one or more outputs of each of the transformation blocks are fed into one or more other transformation blocks. The DG can be executed by various types of computation platforms. The classification platform system can schedule the experiment workflow to be executed on a distributed computation platform according to the input data space and the workflow configuration.
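As a rough illustration of the workflow idea in the abstract, the sketch below models transformation blocks as plain Python functions and the DG as a mapping from each block to the blocks whose outputs it consumes. The block names (`tokenize`, `featurize`, `classify`), the `run_workflow` helper, and the toy data are hypothetical and are not taken from the patent. The graph is assumed to be acyclic so that it can be ordered with the standard library's `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter  # Python 3.9+


# Hypothetical transformation blocks; each receives a list of upstream outputs.
def tokenize(inputs):
    # inputs[0] is the raw input data (a list of strings in this toy example)
    return [text.lower().split() for text in inputs[0]]


def featurize(inputs):
    # Use the token count of each document as a single toy feature
    return [len(tokens) for tokens in inputs[0]]


def classify(inputs):
    # Label a document "long" when it has more than 3 tokens
    return ["long" if n > 3 else "short" for n in inputs[0]]


BLOCKS = {"tokenize": tokenize, "featurize": featurize, "classify": classify}

# DG: block -> list of upstream blocks whose outputs feed it.
DG = {"tokenize": [], "featurize": ["tokenize"], "classify": ["featurize"]}


def run_workflow(dg, blocks, input_data):
    """Run every block after its predecessors, passing outputs downstream."""
    outputs = {}
    for name in TopologicalSorter(dg).static_order():
        # Blocks with no upstream blocks read the input data directly.
        upstream = [outputs[p] for p in dg[name]] or [input_data]
        outputs[name] = blocks[name](upstream)
    return outputs


results = run_workflow(DG, BLOCKS, ["A short note", "This one is a longer note"])
print(results["classify"])  # ['short', 'long']
```

Because the graph and the blocks are kept separate, the same `DG` mapping could in principle be handed to a different executor; this sketch simply runs everything in one local process.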
41 Citations
21 Claims
-
20. A computer readable non-transitory data storage memory storing computer-executable instructions that, when executed by a computer system, cause the computer system to perform a computer-implemented method, the computer-executable instructions comprising:
-
instructions for defining a classification experiment for executing tasks on a computation platform by at least:
defining an input data space by selecting at least one of data sources interfaced with a classification platform system; and
defining, via a definition user interface of the classification platform system, a workflow configuration of the classification experiment by defining a directed graph (DG) connecting a plurality of transformation blocks to represent an experiment workflow, wherein the DG specifies one or more connections from one or more outputs of each transformation block in the experiment workflow to one or more other transformation blocks;
instructions for formatting the workflow configuration and the input data space into a data structure such that the data structure is interpretable by a plurality of different computation platforms to execute the classification experiment, wherein the plurality of different computation platforms have a plurality of different hardware configurations of system components, respectively;
instructions for selecting, from the plurality of different computation platforms, a distributed computation platform to execute at least part of the experiment workflow in accordance with the DG;
instructions for generating, based on a hardware configuration of system components associated with the distributed computation platform, an execution schedule that specifies which system component executes each part of the classification experiment; and
instructions for scheduling the distributed computation platform to execute the classification experiment by the system components under the execution schedule according to the input data space and the workflow configuration, wherein the system components are scheduled to execute their respective parts of the classification experiment as specified by the execution schedule.
-
-
21. A classification platform system, comprising:
-
a memory configured to store executable instructions; and
a processor configured by the executable instructions to implement an experiment management engine and a workflow execution engine;
wherein the experiment management engine is configured to define a classification experiment for executing tasks on a computation platform by at least:
defining an input data space by selecting at least one of data sources interfaced with a classification platform system; and
defining, via a definition user interface of the classification platform system, a workflow configuration of the classification experiment by defining a directed graph (DG) connecting a plurality of transformation blocks to represent an experiment workflow, wherein the DG specifies one or more connections from one or more outputs of each transformation block in the experiment workflow to one or more other transformation blocks;
wherein the workflow execution engine is configured to format the workflow configuration and the input data space into a data structure such that the data structure is interpretable by a plurality of different computation platforms to execute the classification experiment, wherein the plurality of different computation platforms have a plurality of different hardware configurations of system components, respectively;
wherein the workflow execution engine is configured to select, from the plurality of different computation platforms, a distributed computation platform to execute at least part of the experiment workflow in accordance with the DG;
wherein the workflow execution engine is configured to generate, based on a hardware configuration of system components associated with the distributed computation platform, an execution schedule that specifies which system component executes each part of the classification experiment; and
wherein the workflow execution engine is configured to schedule the distributed computation platform to execute the classification experiment by the system components under the execution schedule according to the input data space and the workflow configuration, wherein the system components are scheduled to execute their respective parts of the classification experiment as specified by the execution schedule.
-
Specification