SELECTING PROCESSING TECHNIQUES FOR A DATA FLOW TASK
Abstract
A method for data flow processing includes determining values for each of a set of parameters associated with a task within a data flow processing job, and applying a set of rules to determine one of a set of processing techniques that will be used to execute the task. The set of rules is determined through a set of benchmark tests for the task using each of the set of processing techniques while varying the set of parameters.
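The selection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the parameter name (rows), the technique names (single_node, distributed), the timing figures, and the nearest-benchmark lookup used as the rule form are all hypothetical and do not come from the source.

```python
from typing import Dict, List, Tuple

Params = Dict[str, float]          # parameter name -> value
Timings = Dict[str, float]         # technique name -> measured seconds
Rule = Tuple[Params, str]          # benchmarked setting -> best technique


def build_rules(benchmarks: List[Tuple[Params, Timings]]) -> List[Rule]:
    """Derive rules from benchmark runs: for every benchmarked parameter
    setting, remember which technique finished fastest."""
    rules: List[Rule] = []
    for params, timings in benchmarks:
        fastest = min(timings, key=timings.get)
        rules.append((params, fastest))
    return rules


def select_technique(rules: List[Rule], values: Params) -> str:
    """Apply the rules to a task's parameter values. Assumed rule form:
    use the technique of the closest benchmarked setting."""
    def squared_distance(benchmarked: Params) -> float:
        return sum((benchmarked[k] - values[k]) ** 2 for k in values)

    _, technique = min(rules, key=lambda rule: squared_distance(rule[0]))
    return technique


# Hypothetical benchmark results, varying one parameter (input row count).
benchmarks = [
    ({"rows": 1e3}, {"single_node": 0.2, "distributed": 1.5}),
    ({"rows": 1e7}, {"single_node": 40.0, "distributed": 12.0}),
]
rules = build_rules(benchmarks)
small_job = select_technique(rules, {"rows": 5e3})
large_job = select_technique(rules, {"rows": 8e6})
```

Here the rules are built once from the benchmark runs and then consulted per task, which mirrors the two steps named in the abstract. A real rule set could take any form, such as thresholds or a decision tree.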
15 Claims
1. A method for selecting processing techniques for a data flow task performed by a physical computing system, the method comprising:
- determining values for each of a set of parameters associated with a task within a data flow processing job; and
- applying a set of rules to said values to determine one of a set of processing techniques that will be used to execute said task;
wherein said set of rules is determined through a set of benchmark tests for said task using each of said set of processing techniques while varying said set of parameters.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method for transferring data from a local file system to a distributed file system for a data flow process, the method comprising:
- on a node of a distributed computing system, changing metadata associated with data stored on a local file system of said node without copying said data to a distributed file system, said changed metadata indicating that said data is associated with said distributed file system; and
- indicating a presence of said data to a management function of said distributed file system.
Dependent claims: 9, 10, 11, 12, 13
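The two steps of claim 8 can be illustrated with a toy in-memory model. It is only a sketch under assumptions: the class names (FileEntry, ManagementFunction, Node), the owner_fs metadata field, and the example path are hypothetical, and no real distributed file system API is used.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FileEntry:
    data: bytes
    owner_fs: str = "local"    # hypothetical metadata field: owning file system


@dataclass
class ManagementFunction:
    """Stand-in for the distributed file system's management function."""
    locations: Dict[str, str] = field(default_factory=dict)   # path -> node id

    def report_presence(self, path: str, node_id: str) -> None:
        self.locations[path] = node_id


@dataclass
class Node:
    node_id: str
    files: Dict[str, FileEntry] = field(default_factory=dict)

    def transfer_in_place(self, path: str, manager: ManagementFunction) -> None:
        entry = self.files[path]
        # Step 1: change only the metadata; the stored bytes are not copied.
        entry.owner_fs = "distributed"
        # Step 2: indicate the data's presence to the management function.
        manager.report_presence(path, self.node_id)


manager = ManagementFunction()
node = Node("node-1")
payload = b"intermediate results"
node.files["/data/part-0"] = FileEntry(payload)
node.transfer_in_place("/data/part-0", manager)
```

After the call, the entry still holds the same bytes object, only its metadata differs, and the management function knows which node holds the data.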
14. A distributed computing system comprising:
- a node comprising:
  - at least one processor; and
  - a memory communicatively coupled to the at least one processor, the memory comprising computer executable code that, when executed by the at least one processor, causes the at least one processor to:
    - change metadata associated with data stored on a local file system of said node without copying said data to a distributed file system, said changed metadata indicating that said data is associated with said distributed file system; and
    - indicate a presence of said data to a management function of said distributed file system;
wherein, a cost of transferring said data from said local file system to said distributed file system is used in part to define a set of rules used to determine which of a set of processing flows is to be used for a task of a data flow process based on parameters associated with said task and said distributed file system.
Dependent claims: 15
Specification