System and method for balancing a computing load among computing resources in a distributed computing problem
First Claim
1. A method of balancing a computing load among a plurality of computing resources for handling an inter-connected problem in a task, comprising:
- determining computing capabilities of a plurality of computers, wherein the computers are heterogeneous, independent, non-dedicated, and connected in a loosely coupled, asynchronous, distributed communications grid for handling the inter-connected problem;
assigning a computing agent role to at least some of the computers for handling the inter-connected problem;
monitoring available resources for the computers that have been assigned a computing agent role;
reassigning original problem cells among the computers that have been assigned a computing agent role, as necessary to balance the computing load for handling the inter-connected problem;
wherein reassigning the original problem cells among the computers includes;
segmenting the computing load into a plurality of fixed sized collections of original program cells based on the complexity of the computing load;
grouping the fixed sized collections of original program cells into a plurality of variable problem partitions;
identifying computing agents having reduced computing resources available;
determining whether any of the computing agents are non-idle and have increased computing resources available;
reassigning variable problem partitions from computing agents having reduced computing resources available to non-idle computing agents having increased computing resources available responsive to identifying computing agents having reduced computing resources available and determining that a computing agent is non-idle and has increased computing resources available; and
dividing the monitored available resources among the respective execution agents, in order to improve computing load manageability.
1 Assignment
0 Petitions
Accused Products
Abstract
A distributed parallel computing system actively takes advantage of problem partitioning to balance the computing load among computing resources continually during processing. Variable problem partitions (VPPs) are initially defined as groups of original problem cells (OPCs). VPPs may be redefined and redistributed during execution, if necessary, to optimize performance based on the actual computing agent parameters and costs observed or reported through self-tests. For example, a good rule for efficient execution of a computing problem may be that the time required to perform a computation sequence (iteration) of all OPCs in a VPP should be comparable to the time required to share results via edge OPCs at the VPP collection perimeters. The rules that yield cost-efficient execution may be saved and re-used to generate initial partitionings for subsequent computing problem execution runs.
-
Citations
27 Claims
-
1. A method of balancing a computing load among a plurality of computing resources for handling an inter-connected problem in a task, comprising:
-
determining computing capabilities of a plurality of computers, wherein the computers are heterogeneous, independent, non-dedicated, and connected in a loosely coupled, asynchronous, distributed communications grid for handling the inter-connected problem; assigning a computing agent role to at least some of the computers for handling the inter-connected problem; monitoring available resources for the computers that have been assigned a computing agent role; reassigning original problem cells among the computers that have been assigned a computing agent role, as necessary to balance the computing load for handling the inter-connected problem; wherein reassigning the original problem cells among the computers includes; segmenting the computing load into a plurality of fixed sized collections of original program cells based on the complexity of the computing load; grouping the fixed sized collections of original program cells into a plurality of variable problem partitions; identifying computing agents having reduced computing resources available; determining whether any of the computing agents are non-idle and have increased computing resources available; reassigning variable problem partitions from computing agents having reduced computing resources available to non-idle computing agents having increased computing resources available responsive to identifying computing agents having reduced computing resources available and determining that a computing agent is non-idle and has increased computing resources available; and dividing the monitored available resources among the respective execution agents, in order to improve computing load manageability. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. A computer system for balancing a computing load among a plurality of computing resources for handling an inter-connected problem in a task comprising:
-
means for determining computing capabilities of a plurality of computers, wherein the computers are heterogeneous, independent, non-dedicated, and connected in a loosely coupled, asynchronous, distributed communications grid for handling the inter-connected problem; means for assigning a computing agent role to at least some of the computers for handling the inter-connected problem; means for monitoring available resources for the computers that have been assigned a computing agent role; means for reassigning original problem cells among the computers that have been assigned a computing agent role, as necessary to balance the computing load for handling the inter-connected problem; wherein the means for reassigning the original problem cells among the computers includes; a device for segmenting the computing load into a plurality of fixed sized collections of original program cells based on the complexity of the computing load; a device for grouping the fixed sized collections of original program cells into a plurality of variable problem partitions; a device for identifying computing agents having reduced computing resources available; a device for determining whether any of the computing agents are non-idle and have increased computing resources available; a device for reassigning variable problem partitions from computing agents having reduced computing resources available to non-idle computing agents having increased computing resources available responsive to identifying computing agents having reduced computing resources available and determining that a computing agent is non-idle and has increased computing resources available; and a device for dividing the monitored available resources among the respective execution agents, in order to improve computing load manageability. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18)
-
-
19. A computer program product having instruction codes stored on a computer readable medium, for balancing a computing load among a plurality of computing resources for handling an inter-connected problem in a task, comprising:
-
a first set of instruction codes for determining computing capabilities of a plurality of computers, wherein the computers are heterogeneous, independent, non-dedicated, and connected in a loosely coupled, asynchronous, distributed communications grid for handling the inter-connected problem; a second set of instruction codes for assigning a computing agent role to at least some of the computers for handling the inter-connected problem; a third set of instruction codes for monitoring available resources for the computers that have been assigned a computing agent role; and a fourth set of instruction codes for reassigning original problem cells among the computers that have been assigned a computing agent role, as necessary to balance the computing load for handling the inter-connected problem; wherein the fourth set of instruction codes for reassigning the original problem cells among the computers includes; a set of instruction codes for segmenting the computing load into a plurality of fixed sized collections of original program cells based on the complexity of the computing load; a set of instruction codes for grouping the fixed sized collections of original program cells into a plurality of variable problem partitions; a set of instruction codes for identifying computing agents having reduced computing resources available; a set of instruction codes for determining whether any of the computing agents are non-idle and have increased computing resources available; a set of instruction codes for reassigning variable problem partitions from computing agents having reduced computing resources available to non-idle computing agents having increased computing resources available responsive to identifying computing agents having reduced computing resources available and determining that a computing agent is non-idle and has increased computing resources available; and a set of instruction codes for dividing the monitored available resources among the respective execution agents, in order to improve computing load manageability. - View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27)
-
Specification