EVENT-BASED DYNAMIC RESOURCE PROVISIONING
Abstract
Disclosed are a method, a system and a computer program product for automatically allocating and de-allocating resources for jobs executed or processed by one or more supercomputer systems. In one or more embodiments, a supercomputing system can process multiple jobs with respective supercomputing resources. A global resource manager can automatically allocate additional resources to a first job and de-allocate resources from a second job. In one or more embodiments, the global resource manager can provide the de-allocated resources to the first job as additional supercomputing resources. In one or more embodiments, the first job can use the additional supercomputing resources to perform data analysis at a higher resolution, and the additional resources can compensate for an amount of time the higher resolution analysis would take using originally allocated supercomputing resources.
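The reallocation described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Job` and `GlobalResourceManager` classes, the `min_nodes` floor, and the `rebalance` method are hypothetical names introduced here, and resources are modelled simply as a count of compute nodes.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """A supercomputing job holding some number of compute nodes (assumed model)."""
    name: str
    nodes: int
    min_nodes: int = 1  # hypothetical floor below which a job is never shrunk


class GlobalResourceManager:
    """Toy manager that moves nodes between jobs through a shared free pool."""

    def __init__(self, free_nodes: int = 0) -> None:
        self.free_nodes = free_nodes

    def deallocate(self, job: Job, count: int) -> int:
        """Release up to `count` nodes from `job`, respecting its floor."""
        released = max(0, min(count, job.nodes - job.min_nodes))
        job.nodes -= released
        self.free_nodes += released
        return released

    def allocate(self, job: Job, count: int) -> int:
        """Grant up to `count` nodes from the free pool to `job`."""
        granted = min(count, self.free_nodes)
        self.free_nodes -= granted
        job.nodes += granted
        return granted

    def rebalance(self, first: Job, second: Job, count: int) -> int:
        """Give `first` up to `count` extra nodes, taking from `second` if the pool is short."""
        shortfall = max(0, count - self.free_nodes)
        if shortfall:
            self.deallocate(second, shortfall)
        return self.allocate(first, count)


first = Job("first", nodes=8)
second = Job("second", nodes=6, min_nodes=2)
manager = GlobalResourceManager(free_nodes=2)
granted = manager.rebalance(first, second, count=5)
```

In this run the pool holds two free nodes, so three more are de-allocated from the second job before all five are handed to the first job.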
20 Claims
1. A computer-implemented method for operating a supercomputing system, comprising:
processing a first supercomputing job with a first amount of resources of the supercomputing system;
determining that a first event occurred while processing a data set of the first supercomputing job;
in response to determining that the first event occurred;
determining a first amount of additional resources of the supercomputing system based on a first resolution, a second resolution, a size of the data set, and a target completion time for the first supercomputing job;
allocating the first amount of additional resources of the supercomputing system;
distributing at least a portion of the data set to the first additional computing resources; and
processing the first supercomputing job with the first amount of resources of the supercomputing system and the first amount of additional resources of the supercomputing system.
Dependent claims: 2, 3, 4, 5, 6, 7
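The determining and distributing steps of claim 1 can be illustrated with a short sketch. The claim states only that the amount of additional resources is based on the two resolutions, the data set size, and the target completion time; it does not give a formula. The cost model below (work grows with the resolution ratio raised to a power `dims`, and each node processes `node_throughput` units per second) is an assumption made purely for illustration, as are all function and parameter names.

```python
import math


def additional_nodes(current_nodes, first_resolution, second_resolution,
                     data_set_size, target_seconds, node_throughput, dims=1):
    """Estimate extra nodes needed to meet the target time at the second resolution.

    Assumed cost model (not from the claim):
        work = data_set_size * (second_resolution / first_resolution) ** dims
    """
    if min(first_resolution, second_resolution, target_seconds, node_throughput) <= 0:
        raise ValueError("resolutions, target time and throughput must be positive")
    scale = (second_resolution / first_resolution) ** dims
    work = data_set_size * scale
    nodes_needed = math.ceil(work / (node_throughput * target_seconds))
    return max(0, nodes_needed - current_nodes)


def split_for_new_nodes(items, current_nodes, extra_nodes):
    """Split the data set so the new nodes receive a share proportional to their count.

    Assumes current_nodes > 0. Returns (kept_by_original_nodes, sent_to_new_nodes).
    """
    total = current_nodes + extra_nodes
    cut = len(items) * current_nodes // total
    return items[:cut], items[cut:]


# An event triggers analysis at twice the original resolution.
extra = additional_nodes(current_nodes=8, first_resolution=1.0, second_resolution=2.0,
                         data_set_size=8000, target_seconds=100, node_throughput=10)
kept, moved = split_for_new_nodes(list(range(16)), current_nodes=8, extra_nodes=extra)
```

Under these assumed numbers, doubling the resolution doubles the work, so the job needs eight additional nodes to finish by the same target time, and half of the data set is distributed to them.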
8. A supercomputing system, comprising:
a plurality of compute nodes, wherein each of the plurality of compute nodes is coupled to another of the plurality of compute nodes;
a first memory medium coupled to at least a first compute node of the plurality of compute nodes, wherein the first memory medium includes instructions that when executed on the first compute node provides logic for performing the functions of
processing a first supercomputing job with a first amount of resources of the supercomputing system;
determining that a first event occurred while processing a data set of the first supercomputing job;
in response to determining that the first event occurred;
determining a first amount of additional resources of the supercomputing system based on a first resolution, a second resolution, a size of the data set, and a target completion time for the first supercomputing job;
allocating the first amount of additional resources of the supercomputing system;
distributing at least a portion of the data set to the first additional computing resources; and
processing the first supercomputing job with the first amount of resources of the supercomputing system and the first amount of additional resources of the supercomputing system.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer readable memory medium comprising instructions, which when executed on a processing system of a supercomputing system, cause the supercomputing system to perform:
processing a first supercomputing job with a first amount of resources of the supercomputing system;
determining that a first event occurred while processing a data set of the first supercomputing job;
in response to determining that the first event occurred;
determining a first amount of additional resources of the supercomputing system based on a first resolution, a second resolution, a size of the data set, and a target completion time for the first supercomputing job;
allocating the first amount of additional resources of the supercomputing system;
distributing at least a portion of the data set to the first additional computing resources; and
processing the first supercomputing job with the first amount of resources of the supercomputing system and the first amount of additional resources of the supercomputing system.
Dependent claims: 16, 17, 18, 19, 20
Specification