Support of non-trivial scheduling policies along with topological properties
First Claim
1. A computer system having a scheduling system to schedule a job having resource mapping requirements to resources in a computing architecture arranged at least in part on node boards in host computers, each node board having at least one central processor unit (CPU) and shared memory, said node boards being interconnected into groups of node boards providing access between the central processing units (CPUs) and shared memory on different node boards, said computer system comprising:
a processor for executing computing instructions;
memory for storing said computing instructions;
a scheduling system associated with the processor and the memory, and comprising:
a scheduling unit for scheduling jobs to at least some of said resources, said scheduling unit generating a candidate host list representing the resources available to execute the job to be scheduled based on resource requirements of the job to be scheduled;
a topology library unit comprising a machine map M of the computer system, said machine map M indicative of the interconnections of the resources to which the scheduling system can schedule the jobs;
a topology monitoring unit for monitoring a status of the resources and generating status information signals indicative of a status of the resources;
wherein the topology library unit receives the status information signals and the candidate host list and determines a free map F of resources to execute the job to be scheduled, said free map F indicative of the interconnection of the resources to which the job in a current scheduling cycle can be scheduled based on the status information signals, the candidate host list and the machine map M;
wherein the topology monitoring unit dispatches a job to the resources in the free map F which match the resource mapping requirements of the job;
wherein the jobs to be scheduled each have a priority rating and wherein the scheduling unit includes priority status information signals indicative of the priority of the job which is being executed by the jobs that have been scheduled in previous scheduling cycles but have not yet been completed; and
wherein the scheduling unit includes in the candidate host list any resources which are executing jobs having a lower priority than the priority of the job being scheduled in the current scheduling cycle and excludes from the candidate host list any resources which are executing a job having a priority that is higher than the priority of the job being scheduled in the current scheduling cycle.
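The priority rule in the last two clauses can be illustrated with a minimal sketch. The `Resource` record, the `candidate_host_list` function, and the convention that a larger number means a higher priority are assumptions made for illustration and do not appear in the claim. The claim does not say how resources running a job of equal priority are treated; this sketch excludes them, and it also assumes idle resources are always eligible.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    # Priority of the job currently running here; None means the resource is idle.
    running_priority: Optional[int] = None


def candidate_host_list(resources: List[Resource], job_priority: int) -> List[str]:
    """Return names of resources that may be offered to a job of this priority."""
    candidates = []
    for r in resources:
        if r.running_priority is None:
            candidates.append(r.name)  # idle resource (assumed eligible)
        elif r.running_priority < job_priority:
            candidates.append(r.name)  # running a lower-priority job
        # Equal or higher priority: excluded (the equal case is an assumption).
    return candidates


resources = [
    Resource("board0"),
    Resource("board1", running_priority=1),
    Resource("board2", running_priority=9),
]
print(candidate_host_list(resources, job_priority=5))  # ['board0', 'board1']
```

A job of priority 5 is offered the idle board and the board running a priority 1 job, while the board running a priority 9 job is left out.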
Abstract
A system and method for scheduling jobs in a multiprocessor machine is disclosed. The status of resources, including CPUs on node boards and associated shared memory in the multiprocessor machine, is periodically determined. The status can indicate the resources available to execute jobs. This information is accumulated by the topology monitoring unit and provided to the topology library unit. The topology library unit also receives a candidate host list from the scheduling unit, which lists all of the resources available to execute the job being scheduled based on non-trivial scheduling policies. The topology library unit then uses this information to generate a free map F indicative of the interconnection of the resources available to execute the job. The topology monitoring unit then matches the jobs to the resources available to execute the jobs, based on resource requirements including shape requirements indicative of interconnections of resources required to execute the job. The topology monitoring unit dispatches the job to the portion of the free map F which matches the shape requirements of the job. If the topology library unit determines that no resources are available to execute the job, the topology library unit will return the job to the scheduling unit, and the scheduling unit will wait until the resources become available. The free map F may include resources which have been suspended or reserved in previous scheduling cycles, or which are executing lower-priority jobs, provided the job to be scheduled satisfies the predetermined criteria for execution of the job on those suspended, lower-priority, or reserved resources.
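As a rough illustration of how a free map F could be derived from the machine map M, the sketch below models M as an adjacency mapping between node boards and keeps only the boards that are both reported available and present in the candidate host list. The `Graph` type, the `free_map` function, the set-based inputs, and the four-board example are all hypothetical; the abstract does not specify a data structure.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node board -> directly interconnected node boards


def free_map(machine_map: Graph, available: Set[str], candidates: Set[str]) -> Graph:
    """Restrict machine map M to boards that are both available and candidates."""
    usable = available & candidates & set(machine_map)
    # Keep only the interconnections whose two endpoints are both usable.
    return {board: machine_map[board] & usable for board in usable}


# Hypothetical four-board machine connected in a line: b0 - b1 - b2 - b3
M: Graph = {
    "b0": {"b1"},
    "b1": {"b0", "b2"},
    "b2": {"b1", "b3"},
    "b3": {"b2"},
}
status_available = {"b0", "b1", "b3"}       # assumed form of the status information
candidate_hosts = {"b0", "b1", "b2", "b3"}  # assumed form of the candidate host list

F = free_map(M, status_available, candidate_hosts)
if not F:
    print("no resources: return the job to the scheduling unit")
else:
    print(sorted(F))  # ['b0', 'b1', 'b3']
```

In this example board b2 is busy, so F keeps the b0 to b1 interconnection and leaves b3 isolated. An empty F corresponds to the case where the job is returned to the scheduling unit.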
13 Claims
Claim 1 is reproduced in full under First Claim above. Claims 2 through 12 depend on claim 1.
13. In a computer system comprising resources arranged at least in part on node boards, each node board having at least one central processor unit (CPU) and shared memory, said node boards being interconnected to provide access between the central processing units (CPUs) and shared memory on different boards, a method of scheduling a job to said resources comprising:
(a) determining a machine map M of the computer system indicative of all of the interconnections of all of the resources in the computer system to which a scheduling system can schedule jobs and storing the machine map M in a topology library unit, wherein the jobs each have a priority rating;
(b) periodically assessing a status of the resources and sending status information signals indicative of the status of the resources to the topology library unit;
(c) assessing at the topology monitoring unit a free map F of resources indicative of the interconnection of all of the resources to which the scheduling unit can schedule a job in a current scheduling cycle;
(d) assessing priority status information signals indicative of the priority of jobs being executed by the jobs that have been scheduled in previous scheduling cycles but have not yet been completed;
(e) including in the free map F any resources which are executing jobs having a lower priority than the priority of the job currently being scheduled and excluding from the free map F any resources which are executing a job having a priority that is higher than the priority of the job being scheduled;
(f) matching resource requirements, including topological requirements specifying at least one interconnection of the resources required to execute the job currently being scheduled, to resources in the free map F which match the resource requirements of the job; and
(g) dispatching the job to the matched resources.
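Steps (f) and (g) can be sketched under a simplifying assumption: the topological requirement is reduced to "a given number of node boards that form one interconnected group". The claim only requires that at least one interconnection be specified, so this is one possible reading and not the claimed method itself. The `match_connected` function and the example free map are hypothetical. The search is a plain breadth-first traversal: the first n boards reached from any starting board always form a connected group.

```python
from collections import deque
from typing import Dict, List, Optional, Set

Graph = Dict[str, Set[str]]  # node board -> directly interconnected node boards


def match_connected(free: Graph, boards_needed: int) -> Optional[List[str]]:
    """Return boards_needed interconnected boards from the free map, or None."""
    if boards_needed <= 0:
        return None
    for start in sorted(free):
        group = [start]
        visited = {start}
        queue = deque([start])
        while queue and len(group) < boards_needed:
            board = queue.popleft()
            for neighbour in sorted(free[board]):
                if neighbour in free and neighbour not in visited:
                    visited.add(neighbour)
                    group.append(neighbour)
                    queue.append(neighbour)
                    if len(group) == boards_needed:
                        break
        if len(group) == boards_needed:
            return group
    return None


# Hypothetical free map: b0 and b1 are interconnected, b3 is isolated.
free: Graph = {"b0": {"b1"}, "b1": {"b0"}, "b3": set()}

for needed in (2, 3):
    match = match_connected(free, needed)
    if match is None:
        print(f"{needed} boards: no match, job waits for a later cycle")
    else:
        print(f"{needed} boards: dispatch to {match}")
```

With this free map, a job needing two interconnected boards is dispatched to b0 and b1, while a job needing three finds no match and waits.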
Specification