Managing parallel execution of work granules according to their affinity
Abstract
A method and apparatus are provided for managing work granules being executed in parallel. A task is evenly divided between a number of work granules. The number of work granules falls between a threshold minimum and a threshold maximum. The threshold minimum and maximum may be configured to balance a variety of efficiency factors affected by the number of work granules, including workload skew and overhead incurred in managing a larger number of work granules. Work granules are distributed to processes on nodes according to which of the nodes, if any, may execute the work granule efficiently. A variety of factors may be used to determine where a work granule may be performed efficiently, including whether data accessed during the execution of a work granule may be locally accessed by a node.
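The even division described in the abstract can be illustrated with a minimal sketch. It assumes the task is a scan over a contiguous run of blocks and that a caller supplies a desired granule count, which is then clamped between the threshold minimum and maximum. The function name `split_into_granules` and its parameters are hypothetical and do not come from the patent, which does not state how the count is chosen within the thresholds.

```python
def split_into_granules(total_blocks, target, min_granules, max_granules):
    """Evenly divide `total_blocks` into contiguous half-open block ranges.

    Hypothetical helper: `target` is the caller's preferred granule count,
    clamped to [min_granules, max_granules] as the abstract describes.
    """
    if total_blocks <= 0:
        return []
    # Clamp the requested count between the threshold minimum and maximum.
    count = max(min_granules, min(target, max_granules))
    # Never create more granules than there are blocks to scan.
    count = min(count, total_blocks)
    base, extra = divmod(total_blocks, count)
    granules = []
    start = 0
    for i in range(count):
        # The first `extra` granules take one additional block each.
        size = base + (1 if i < extra else 0)
        granules.append((start, start + size))
        start += size
    return granules
```

Raising the maximum tends to reduce workload skew at the cost of more per-granule management overhead, which is the trade-off the abstract attributes to the two thresholds.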
10 Claims
1. A method for managing the assignment of a plurality of work granules to a plurality of processes on a plurality of nodes, the method comprising the steps of:
making a determination that:
a portion of a first work granule of said plurality of work granules has an affinity to a particular node of said plurality of nodes, and another portion of said first work granule has an affinity to another node of said plurality of nodes;
based on said determination, generating first data that establishes said first work granule as having an affinity to a particular node of said plurality of nodes;
assigning each work granule from the plurality of work granules to a process from said plurality of processes based on said first data; and
wherein a portion of each work granule of said plurality of work granules has an affinity for a given node of said plurality of nodes when the portion of said each work granule can be executed more efficiently by said given node relative to another node of said plurality of nodes.
2. The method of claim 1, further including the step of:
generating second data that establishes as not having an affinity for any particular node of said plurality of nodes a second work granule that includes a portion of work which has an affinity for a node from said plurality of nodes, and another portion of work which has an affinity for another node from said plurality of nodes.
3. The method of claim 2, wherein the step of generating second data includes:
adding said first work granule to a first list that includes work granules established as having an affinity for said particular node; and
adding said second work granule to a second list that includes work granules established as having no affinity for any particular node from said plurality of nodes.
4. The method of claim 1, wherein the step of generating first data includes generating first data that establishes as having an affinity to a particular node from said plurality of nodes a particular work granule that specifies a plurality of database table partitions to scan.
5. The method of claim 1, further including the steps of:
determining that a quantity of certain data is accessed during execution of said first work granule;
determining that a majority of said quantity of said certain data resides at a location local to said first node; and
wherein the step of establishing said first work granule as having an affinity to said particular node is performed in response to determining that a majority of said quantity of third data resides at a location local to said first node.
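The majority test in claim 5 can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes each portion of a granule is described by a hypothetical `(node, amount)` pair giving the node where that data is local and how much data is involved, and it treats "majority" as strictly more than half.

```python
from collections import Counter


def granule_affinity(portions):
    """Return the node holding a strict majority of the data, else None.

    `portions` is a hypothetical list of (node, amount) pairs, one per
    portion of the work granule.
    """
    totals = Counter()
    for node, amount in portions:
        totals[node] += amount
    total = sum(totals.values())
    if total == 0:
        return None
    node, amount = totals.most_common(1)[0]
    # Strict majority: more than half of the accessed data is local to `node`.
    return node if amount * 2 > total else None
```

For example, `granule_affinity([("A", 70), ("B", 30)])` yields `"A"`, while an even 50/50 split yields `None`, which corresponds to a granule established as having no affinity for any particular node.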
6. A method of assigning work granules to a plurality of processes on a plurality of nodes, the method comprising the steps of:
dividing a task into a plurality of work granules that includes a first set of work granules that each define work by specifying one or more ranges of blocks to scan, and a second set of work granules that each define work by specifying at least two database table partitions of a database table to scan;
generating first data that specifies, for each work granule of said plurality of work granules, whether said each work granule has an affinity for a particular node of said plurality of nodes;
assigning said plurality of work granules to said plurality of processes based on said first data; and
wherein said database table is partitioned into said at least two database table partitions by values in one or more columns of said database table.
7. The method of claim 6, wherein the step of generating first data includes generating:
at least one list that includes work granules having an affinity for a particular node from said plurality of nodes; and
another list that includes work granules having no particular affinity for a particular node.
8. The method of claim 7, wherein the step of assigning said plurality of work granules includes assigning to a first process on a first node from said plurality of nodes:
a work granule from said at least one list when said at least one list includes work granules that have an affinity for said first node and at least one work granule has not been assigned to a process; and
a work granule from said another list when said at least one list includes no work granule that has not been assigned to a process.
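The list-based assignment in claims 7 and 8 can be sketched roughly as below, under stated assumptions: granules are identified by hypothetical string names, affinity is given as a mapping from granule to node (or `None` for no affinity), and a process asks for its next granule by naming the node it runs on. The helper names `build_lists` and `next_granule` are illustrative only. What happens when both lists are exhausted is not covered by these claims, so the sketch simply returns `None`.

```python
from collections import defaultdict, deque


def build_lists(affinity_by_granule):
    """Split granules into per-node affinity lists and one no-affinity list."""
    by_node = defaultdict(deque)
    no_affinity = deque()
    for granule, node in affinity_by_granule.items():
        if node is None:
            no_affinity.append(granule)
        else:
            by_node[node].append(granule)
    return by_node, no_affinity


def next_granule(node, by_node, no_affinity):
    """Pick the next unassigned granule for a process running on `node`."""
    local = by_node.get(node)
    if local:
        # Prefer granules that have an affinity for this node.
        return local.popleft()
    if no_affinity:
        # Otherwise fall back to granules with no particular affinity.
        return no_affinity.popleft()
    return None
```

A short usage example: with `by_node, rest = build_lists({"g1": "A", "g2": None, "g3": "A", "g4": "B"})`, successive calls to `next_granule("A", by_node, rest)` return `"g1"`, `"g3"`, then `"g2"`, and finally `None`.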
9. A computer-readable medium carrying one or more sequences of one or more instructions for managing assignment of a plurality of work granules to a plurality of processes on a plurality of nodes, the one or more sequences of one or more instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
making a determination that:
a portion of a first work granule of said plurality of work granules has an affinity to a particular node of said plurality of nodes, and another portion of said first work granule has an affinity to another node of said plurality of nodes;
based on said determination, generating first data that establishes said first work granule as having an affinity to a particular node of said plurality of nodes;
assigning each work granule from the plurality of work granules to a process from said plurality of processes based on said first data; and
wherein a portion of each work granule of said plurality of work granules has an affinity for a given node of said plurality of nodes when the portion of said each work granule can be executed more efficiently by said given node relative to another node of said plurality of nodes.
10. A computer-readable medium carrying one or more sequences of one or more instructions for assigning work granules to a plurality of processes on a plurality of nodes, the one or more sequences of one or more instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
dividing a task into a plurality of work granules that includes a first set of work granules that each define work by specifying one or more ranges of blocks to scan, and a second set of work granules that each define work by specifying at least two database table partitions of a database table to scan;
generating first data that specifies, for each work granule of said plurality of work granules, whether said each work granule has an affinity for a particular node of said plurality of nodes;
assigning said plurality of work granules to said plurality of processes based on said first data; and
wherein said database table is partitioned into said at least two database table partitions by values in one or more columns of said database table.
Specification