Machine learning system flow authoring tool
Abstract
Some embodiments include a workflow authoring tool that accesses a text string representation of a workflow and a text string representation of at least a data processing operator type. The workflow authoring tool enables definition of one or more data processing operator types that can be referenced in defining the machine learning workflow. When scheduling a workflow, the text string representation of the workflow can be parsed and traversed to generate an interdependency graph of one or more data processing operators. The text string representation of the data processing operator type can identify operator attributes associated with the data processing operator type.
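The parse-and-traverse step described in the abstract can be illustrated with a minimal sketch. The source does not specify the workflow definition language, so the one-assignment-per-line syntax, the operator names, and the helper names `parse_workflow` and `schedule` below are all assumptions made for illustration only.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical syntax (not from the patent): "<output> = <OperatorType>(<input>, ...)"
LINE = re.compile(r"^\s*(\w+)\s*=\s*(\w+)\((.*)\)\s*$")

def parse_workflow(text):
    """Parse workflow text into (graph, producers).

    graph maps each operator output to the set of operator outputs it depends on.
    producers maps each operator output to the operator type that produces it.
    """
    producers = {}
    inputs = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        m = LINE.match(line)
        if m is None:
            raise ValueError(f"cannot parse line: {line!r}")
        out, op_type, args = m.groups()
        producers[out] = op_type
        inputs[out] = [a.strip() for a in args.split(",") if a.strip()]
    # An edge is recorded only when an input name matches another operator's
    # output; unmatched names are treated as external workflow inputs.
    graph = {out: {a for a in args if a in producers} for out, args in inputs.items()}
    return graph, producers

def schedule(graph):
    """Return an execution order in which every operator follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())

WORKFLOW_TEXT = """
features = ExtractFeatures(raw_logs)
model = TrainModel(features, labels)
report = Evaluate(model, features)
"""

graph, producers = parse_workflow(WORKFLOW_TEXT)
order = schedule(graph)
```

In this sketch `TopologicalSorter` raises `graphlib.CycleError` if the parsed graph contains a cycle, which is one simple way to enforce that the interdependency graph is acyclic before scheduling.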
18 Claims
1. A computer-implemented method, comprising:
accessing a workflow text string that is a textual representation of a machine learning workflow, the machine learning workflow comprising an execution pipeline in a machine learning system for creating, modifying, evaluating, validating, or utilizing one or more machine learning models, and an operator text string associated with at least a data processing operator type, the textual representation of the machine learning workflow generated using a workflow definition language;
parsing the workflow text string to generate an interdependency graph of one or more data processing operators, the parsing generating an interdependency graph that is an acyclic directed graph of the one or more data processing operators, wherein at least one of the data processing operators is an instance of the data processing operator type, where parsing the workflow text string comprises:
traversing the workflow text string to match an output of a first data processing operator as an input of a second data processing operator, and
updating the interdependency graph to indicate that the second data processing operator depends on the output of the first data processing operator;
parsing the operator text string to identify operator attributes associated with the data processing operator type, wherein the operator attributes comprise an input schema and an output schema, wherein the input schema or the output schema includes a summary generation schema, and wherein the operator text string identifies computational logic that is executable on a single host operating environment as a single unit;
scheduling the machine learning workflow for execution based on the interdependency graph; and
generating a summary of a resulting output or an input parameter to the machine learning workflow based on the summary generation schema.
Dependent claims: 2-16.
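The operator-parsing and summary-generation steps of claim 1 can be sketched as follows. The claim does not say how an operator text string is encoded, so the JSON layout, the field names (`logic`, `input_schema`, `output_schema`, `summary`), and the aggregation names are hypothetical choices for this example, not the patented format.

```python
import json
from statistics import mean

# Hypothetical operator text string; the encoding is an assumption.
OPERATOR_TEXT = """
{
  "type": "Evaluate",
  "logic": "evaluate.py",
  "input_schema": {"model": "blob", "features": "table"},
  "output_schema": {
    "fields": {"auc": "float", "loss": "float"},
    "summary": {"auc": "max", "loss": "mean"}
  }
}
"""

# Aggregations that a summary generation schema may name in this sketch.
AGGREGATORS = {"mean": mean, "max": max, "min": min}

def parse_operator(text):
    """Parse an operator text string and check the attributes the claim lists."""
    spec = json.loads(text)
    for key in ("type", "logic", "input_schema", "output_schema"):
        if key not in spec:
            raise ValueError(f"operator definition is missing {key!r}")
    return spec

def summarize(spec, rows):
    """Summarize output rows according to the summary rules in the output schema."""
    rules = spec["output_schema"].get("summary", {})
    return {
        field: AGGREGATORS[agg]([row[field] for row in rows])
        for field, agg in rules.items()
    }

spec = parse_operator(OPERATOR_TEXT)
rows = [{"auc": 0.5, "loss": 1.0}, {"auc": 0.75, "loss": 3.0}]
summary = summarize(spec, rows)
```

Here the summary generation schema lives inside the output schema; the claim equally allows it to be part of the input schema, in which case the same rules would be applied to an input parameter.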
17. A non-transitory computer readable data memory storing computer-executable instructions that, when executed by a computer system, cause the computer system to perform a computer-implemented method, the instructions comprising:
instructions for receiving an operator definition of an operator type associated with operator attributes that identify computational logic that is executable on a single host operating environment as a single unit, an input schema, and an output schema, wherein the input schema or the output schema includes a summary generation schema;
instructions for receiving a text string representation that is a textual representation of a machine learning workflow including one or more references to one or more data processing operators of one or more operator types including the operator type, the textual representation of the machine learning workflow generated using a workflow definition language, the machine learning workflow comprising an execution pipeline in a machine learning system for creating, modifying, evaluating, validating, or utilizing one or more machine learning models;
instructions for parsing the text string representation to generate an interdependency graph of one or more data processing operators, the parsing generating an interdependency graph that is an acyclic directed graph of the one or more data processing operators, the parsing comprising traversing through the text string representation of the workflow to determine a set of expected promises made between the operator types, wherein the expected promises indicate interdependencies between the operator types, the instructions for traversing comprising instructions for identifying a first data processing operator as being dependent on a second data processing operator by matching an output schema of the second data processing operator to an input schema of the first data processing operator; and
instructions for scheduling execution of the workflow by at least assigning executing instances of the operator types to one or more computing environments and passing data between the computing environments based on the interdependencies.
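The schema-matching step of claim 17 (an operator depends on another when the latter's output schema satisfies part of the former's input schema) can be sketched as below. The dictionary layout, the operator and host names, and the round-robin host assignment are assumptions for illustration; the claim does not prescribe a placement policy.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from itertools import cycle

# Hypothetical operator types with input and output schemas (field name -> type).
OPERATORS = {
    "extract": {"inputs": {}, "outputs": {"features": "table"}},
    "train": {"inputs": {"features": "table"}, "outputs": {"model": "blob"}},
    "evaluate": {
        "inputs": {"model": "blob", "features": "table"},
        "outputs": {"metrics": "table"},
    },
}

def expected_promises(operators):
    """Map each operator to the operators whose output schema feeds its input schema."""
    deps = {name: set() for name in operators}
    for consumer, cspec in operators.items():
        for producer, pspec in operators.items():
            if producer == consumer:
                continue
            # A promise exists when a field name and its type both match.
            if any(
                pspec["outputs"].get(field) == ftype
                for field, ftype in cspec["inputs"].items()
            ):
                deps[consumer].add(producer)
    return deps

def assign(deps, hosts):
    """Pair operators, in dependency order, with hosts in round-robin fashion."""
    order = TopologicalSorter(deps).static_order()
    return list(zip(order, cycle(hosts)))

deps = expected_promises(OPERATORS)
plan = assign(deps, ["host-a", "host-b"])
```

Whenever an edge in `deps` connects operators placed on different hosts, the producer's output has to be passed between those computing environments, which corresponds to the data passing named in the scheduling limitation.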
18. A computer program product comprising a non-transitory computer readable storage medium having instructions encoded therein that, when executed by a processor, cause the processor to:
receive an operator definition of a data processing operator type associated with operator attributes that identify computational logic that is executable on a single host operating environment as a single unit, an input schema, and an output schema, wherein the input schema or the output schema includes a summary generation schema;
access a workflow text string that is a textual representation of a machine learning workflow, the textual representation of the machine learning workflow generated using a workflow definition language, the machine learning workflow comprising an execution pipeline in a machine learning system for creating, modifying, evaluating, validating, or utilizing one or more machine learning models, and an operator text string associated with at least a data processing operator type;
parse the workflow text string to generate an interdependency graph of one or more data processing operators, the parse generating an interdependency graph that is an acyclic directed graph of the one or more data processing operators, wherein at least one of the data processing operators is an instance of the data processing operator type, where parsing the workflow text string comprises:
traversing the workflow text string to match an output of a first data processing operator as an input of a second data processing operator, and
updating the interdependency graph to indicate that the second data processing operator depends on the output of the first data processing operator;
parse the operator text string to identify operator attributes associated with the data processing operator type, wherein the operator attributes comprise an input schema and an output schema, wherein the input schema or the output schema includes a summary generation schema, and wherein the operator text string identifies computational logic that is executable on a single host operating environment as a single unit;
schedule the machine learning workflow for execution based on the interdependency graph; and
generate a summary of a resulting output or an input parameter to the machine learning workflow based on the summary generation schema.
Specification