MACHINE LEARNING SYSTEM FLOW PROCESSING
Abstract
Some embodiments include a method of machine learner workflow processing. For example, a workflow execution engine can receive an interdependency graph of operator instances for a workflow run. The operator instances can be associated with one or more operator types. The workflow execution engine can assign one or more computing environments from a candidate pool to execute the operator instances based on the interdependency graph. The workflow execution engine can generate a schedule plan of one or more execution requests associated with the operator instances. The workflow execution engine can distribute code packages associated with the operator instances to the assigned computing environments. The workflow execution engine can maintain a memoization repository to cache one or more outputs of the operator instances upon completion of the execution requests.
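The engine the abstract describes can be sketched in a few lines: order operator instances by their interdependencies, assign each to a computing environment from a candidate pool, and memoize outputs as instances complete. This is a minimal illustration with hypothetical names (`run_workflow`, round-robin assignment), not the patented implementation.

```python
# Minimal sketch of a workflow execution engine (illustrative names only):
# orders operator instances by dependency, assigns environments from a
# candidate pool, and memoizes each instance's output.
from graphlib import TopologicalSorter

def run_workflow(graph, operators, environments):
    """graph: {instance: set of upstream instances};
    operators: {instance: callable}; environments: candidate pool."""
    order = list(TopologicalSorter(graph).static_order())
    # Round-robin assignment stands in for the engine's
    # resource-aware environment scheduler.
    assignment = {inst: environments[i % len(environments)]
                  for i, inst in enumerate(order)}
    memo = {}  # memoization repository: instance -> cached output
    for inst in order:
        inputs = [memo[dep] for dep in graph.get(inst, ())]
        memo[inst] = operators[inst](*inputs)
    return assignment, memo
```

A two-operator run, `{"a": set(), "b": {"a"}}` with `a` producing 1 and `b` adding 1, yields a memoized output of 2 for `b`.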
20 Claims
1. A computer-implemented method, comprising:
initializing a workflow run in a machine learning system by identifying a text string defining a workflow;
traversing syntax of the text string to determine an interdependency graph of one or more data processing operator instances of the workflow;
generating an execution schedule of the workflow run based on the interdependency graph; and
managing execution of the workflow run in multiple computing environments according to the execution schedule; and
indexing an output of a data processing operator instance from among the data processing operator instances in a memoization repository, wherein the output is indexed as a result of processing an identifiable input through a data processing operator type associated with the data processing operator instance.
Dependent claims: 2–11.
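The indexing step in claim 1 keys a cached output by the operator *type* together with the identifiable input, so a later run of the same operator type on the same input can reuse the result. A minimal sketch, with illustrative names (`memoized_apply`, a SHA-256 digest as the "identifiable input" key):

```python
# Sketch of memoization-repository indexing: output keyed by
# (operator type, digest of the identifiable input). Illustrative only.
import hashlib
import json

_memo_repo = {}  # (operator_type, input_digest) -> cached output

def memoized_apply(operator_type, func, input_value):
    # Canonical JSON serialization makes the input identifiable
    # regardless of dict key order.
    digest = hashlib.sha256(
        json.dumps(input_value, sort_keys=True).encode()).hexdigest()
    key = (operator_type, digest)
    if key not in _memo_repo:
        _memo_repo[key] = func(input_value)  # execute only on a miss
    return _memo_repo[key]
```

Calling the same operator type twice with the same input executes the underlying function once; the second call is served from the repository.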
12. A computer readable data memory storing computer-executable instructions that, when executed by a computer system, cause the computer system to perform a computer-implemented method, the instructions comprising:
instructions for accessing an interdependency graph of data processing operator instances for a workflow run, the data processing operator instances associated with one or more data processing operator types;
instructions for assigning one or more computing environments to execute the data processing operator instances based on the interdependency graph and resource constraints associated with the data processing operator instances;
instructions for generating a schedule plan of one or more execution requests associated with the data processing operator instances to the assigned computing environments; and
instructions for facilitating passing of an output from at least one of the data processing operator instances by caching the output in a memoization repository for access by another data processing operator instance.
Dependent claims: 13–18.
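Claim 12 adds resource constraints to environment assignment: each operator instance declares a requirement, and an environment from the pool must satisfy it. A minimal first-fit sketch under assumed names (`assign_environments`, capacity measured in GB; the actual claim does not fix a unit or policy):

```python
# Sketch of resource-constrained environment assignment (claim 12 style):
# each instance gets the first candidate environment whose capacity
# meets its declared constraint. First-fit is an illustrative policy.
def assign_environments(instances, constraints, environments):
    """constraints: {instance: required_gb};
    environments: [(name, capacity_gb), ...]."""
    assignment = {}
    for inst in instances:
        need = constraints.get(inst, 0)
        for name, capacity in environments:
            if capacity >= need:
                assignment[inst] = name
                break
        else:
            raise ValueError(f"no environment can run {inst}")
    return assignment
```

With a 4 GB and a 32 GB environment, a 16 GB training instance lands on the large environment while a 2 GB ETL instance fits the small one.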
19. A machine learning system, comprising:
an operator repository configured to store one or more operator definitions;
a workflow repository configured to store a workflow definition; and
an execution scheduler engine configured to generate an interdependency graph defining a workflow run by matching input and output types of the operator definitions in the workflow definition, wherein the interdependency graph identifies one or more independent operator instances that are capable of parallel execution;
wherein the execution scheduler engine is configured to generate an execution schedule of the workflow run, wherein the workflow run is scheduled for execution on multiple computing environments, wherein the independent operator instances are scheduled to be executed in parallel; and
wherein the execution scheduler engine is configured to generate the execution schedule by maximizing parallel processing and memoization utilization.
Dependent claims: 20.
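The parallelism in claim 19 rests on finding operator instances with no unfinished upstream dependencies: all such instances can execute concurrently. One common way to surface them is level-by-level topological grouping, sketched below with an illustrative name (`parallel_levels`); the patented scheduler may use any equivalent method.

```python
# Sketch of identifying independent operator instances for parallel
# execution: group the interdependency graph into levels where every
# instance in a level has all its dependencies already completed.
def parallel_levels(graph):
    """graph: {instance: set of upstream instances} -> list of sets;
    every instance within a set can run in parallel."""
    remaining = {n: set(deps) for n, deps in graph.items()}
    done, levels = set(), []
    while remaining:
        # Ready = no dependencies outside the completed set.
        ready = {n for n, deps in remaining.items() if deps <= done}
        if not ready:
            raise ValueError("cycle detected in interdependency graph")
        levels.append(ready)
        done |= ready
        for n in ready:
            del remaining[n]
    return levels
```

For `{"a": set(), "b": set(), "c": {"a", "b"}}`, instances `a` and `b` form the first parallel level and `c` runs after both complete.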
Specification