METHODS AND APPARATUS FOR RESOURCE MANAGEMENT IN CLUSTER COMPUTING
Abstract
Embodiments of an event-driven resource management technique may enable the management of cluster resources at a sub-computer level (e.g., at the thread level) and the decomposition of jobs at an atomic (task) level. A job queue may request a resource for a job from a resource manager, which may locate a resource in a resource list and grant the resource to the job queue. After the resource is granted, the job queue sends the job to the resource, on which the job may be partitioned into tasks and from which additional resources may be requested from the resource manager. The resource manager may locate additional resources in the list and grant the resources to the resource. The resource sends the tasks to the granted resources for execution. As resources complete their tasks, the resource manager is informed so that the status of the resources in the list can be updated.
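The request/grant/notify cycle described in the abstract can be illustrated with a minimal sketch. It is not taken from the specification: the names `Resource`, `ResourceManager`, `run_job`, `request`, and `notify_done` are hypothetical, the job is assumed to be a list of numbers whose tasks each square one item, and everything runs synchronously in one process, where a real cluster would exchange these messages as events between machines.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """One schedulable unit, e.g. a single thread on a node (hypothetical model)."""
    name: str
    busy: bool = False
    completed: list = field(default_factory=list)

    def run(self, task):
        # Stand-in for real work: a "task" here is just a callable.
        self.completed.append(task())


class ResourceManager:
    """Holds the resource list and grants idle resources on request."""

    def __init__(self, resources):
        self.resources = list(resources)

    def request(self, count=1):
        # Locate up to `count` idle resources in the list and mark them granted.
        granted = [r for r in self.resources if not r.busy][:max(0, count)]
        for r in granted:
            r.busy = True
        return granted

    def notify_done(self, resource):
        # Completion event: update the resource's status in the list.
        resource.busy = False


def run_job(manager, job_items):
    """Job-queue side: obtain a lead resource, partition the job, fan tasks out."""
    granted = manager.request(1)
    if not granted:
        return None  # nothing idle; a real queue would wait for a completion event
    lead = granted[0]
    # On the lead resource, the job is partitioned into atomic tasks.
    tasks = [lambda x=x: x * x for x in job_items]
    # The lead resource asks the manager for additional resources.
    workers = [lead] + manager.request(len(tasks) - 1)
    results = []
    for i, task in enumerate(tasks):
        worker = workers[i % len(workers)]  # round-robin over granted resources
        worker.run(task)
        results.append(worker.completed[-1])
    # As resources finish, the manager is informed so the list can be updated.
    for worker in workers:
        manager.notify_done(worker)
    return results


manager = ResourceManager(Resource(f"thread-{i}") for i in range(3))
results = run_job(manager, [1, 2, 3, 4])
```

In this sketch the manager never sees the tasks themselves; it only tracks which entries in its list are granted or idle, which mirrors the division of labor the abstract describes.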
40 Claims
1-20. (canceled)
21. A method for tracking jobs performed by computing nodes of a cluster computing system, the method comprising:
monitoring, by a management system, a plurality of computing nodes and an availability of resources in the cluster computing system;
identifying, by the management system, that at least one computing node of the plurality of computing nodes is available for performing a job submitted to a job queue;
generating a job state object for tracking a job status of the job submitted to the job queue; and
updating the job state object by a system independent of the management system upon completion of at least one task of the job state object.
(Dependent claims: 22-27)
28. A system for tracking jobs performed by computing nodes of a cluster computing system, the system comprising:
a management system configured for:
  monitoring a plurality of computing nodes and an availability of resources in the cluster computing system, and
  identifying that at least one computing node of the plurality of computing nodes is available for performing a job submitted to a job queue; and
at least one computing system in communication with and independent of the management system, the at least one computing system configured for:
  generating a job state object for tracking a job status of the job submitted to the job queue, and
  updating the job state object by a system independent of the management system upon completion of at least one task of the job state object.
(Dependent claims: 29-34)
35. A non-transitory computer-readable medium having program code stored thereon that is executable by a processor for tracking jobs performed by computing nodes of a cluster computing system, the program code comprising:
program code for monitoring, by a management system, a plurality of computing nodes and an availability of resources in the cluster computing system;
program code for identifying, by the management system, that at least one computing node of the plurality of computing nodes is available for performing a job submitted to a job queue;
program code for generating a job state object for tracking a job status of the job submitted to the job queue; and
program code for updating the job state object by a system independent of the management system upon completion of at least one task of the job state object.
(Dependent claims: 36-40)
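The separation recited in the independent claims, where the management system monitors node availability while a different system creates and updates the job state object, might be pictured as follows. This is only an illustrative sketch under stated assumptions: `JobStatus`, `JobState`, `ManagementSystem`, and `JobTracker` are hypothetical names, and the three-value status model is invented for the example rather than drawn from the claims.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"


@dataclass
class JobState:
    """Job state object: tracks which tasks of one job have completed."""
    job_id: str
    total_tasks: int
    finished: set = field(default_factory=set)

    @property
    def status(self):
        if not self.finished:
            return JobStatus.PENDING
        if len(self.finished) < self.total_tasks:
            return JobStatus.RUNNING
        return JobStatus.DONE


class ManagementSystem:
    """Monitors nodes and their availability; never touches a JobState."""

    def __init__(self, nodes):
        self.available = {node: True for node in nodes}

    def find_available_node(self):
        # Identify a node that is free to perform a queued job, if any.
        return next((n for n, free in self.available.items() if free), None)


class JobTracker:
    """Separate system that generates and updates job state objects."""

    def __init__(self):
        self.jobs = {}

    def create(self, job_id, total_tasks):
        self.jobs[job_id] = JobState(job_id, total_tasks)
        return self.jobs[job_id]

    def on_task_complete(self, job_id, task_id):
        # Update happens here, independent of the management system.
        self.jobs[job_id].finished.add(task_id)


management = ManagementSystem(["node-a", "node-b"])
tracker = JobTracker()

node = management.find_available_node()
state = tracker.create("job-1", total_tasks=2)
tracker.on_task_complete("job-1", 0)
status_midway = state.status
tracker.on_task_complete("job-1", 1)
status_final = state.status
```

Keeping `JobTracker` free of any reference to `ManagementSystem` is the design point of the sketch: job progress can be recorded without routing every task completion through the component that monitors resources.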
Specification