Techniques for dynamically assigning jobs to processors in a cluster based on broadcast information
Abstract
A technique for operating a high performance computing (HPC) cluster having multiple nodes, each of which includes multiple processors, includes periodically broadcasting information, related to processor utilization and network utilization at each of the multiple nodes, from each of the multiple nodes to the remaining ones of the multiple nodes. Respective local job tables maintained in each of the multiple nodes are updated based on the broadcast information. One or more threads are then moved, based on the broadcast information in the respective local job tables, from one or more of the multiple processors to a different one of the multiple processors.
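The three steps in the abstract (broadcast, table update, thread move) can be illustrated with a small in-memory simulation. This is a minimal sketch under stated assumptions, not the patented implementation. The names `Report`, `Node`, `broadcast_round`, `pick_move`, and `rebalance` are hypothetical. The per-processor thread count stands in for processor utilization, and the `net_limit` threshold and the "differ by at least two threads" rule are illustrative policy choices that the source does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """One broadcast message: a node's utilization snapshot (hypothetical layout)."""
    node_id: int
    threads_per_cpu: list   # assumption: thread count stands in for processor utilization
    net_util: float         # fraction of link bandwidth in use, 0.0 to 1.0


@dataclass
class Node:
    node_id: int
    cpus: list              # cpus[i] is the list of thread names on processor i
    net_util: float = 0.0
    job_table: dict = field(default_factory=dict)  # node_id -> latest Report

    def make_report(self):
        return Report(self.node_id, [len(c) for c in self.cpus], self.net_util)


def broadcast_round(nodes):
    """Each node sends its report to all nodes; every receiver updates its local table.

    The sender also records its own entry so that all tables hold the same data.
    """
    for sender in nodes:
        report = sender.make_report()
        for receiver in nodes:
            receiver.job_table[report.node_id] = report


def pick_move(job_table, net_limit=0.8):
    """Choose a (source, destination) processor pair using only job-table data.

    Nodes whose network utilization is at or above net_limit are skipped
    (assumed policy). Returns ((node, cpu), (node, cpu)) or None.
    """
    slots = [(count, nid, cpu)
             for nid, rep in job_table.items() if rep.net_util < net_limit
             for cpu, count in enumerate(rep.threads_per_cpu)]
    if not slots:
        return None
    busiest, idlest = max(slots), min(slots)
    if busiest[0] - idlest[0] < 2:   # moving a thread would not improve balance
        return None
    return busiest[1:], idlest[1:]


def rebalance(nodes):
    """Move one thread from the busiest to the least busy processor, if warranted."""
    by_id = {n.node_id: n for n in nodes}
    move = pick_move(nodes[0].job_table)  # tables are identical after a full round
    if move is None:
        return None
    (src_node, src_cpu), (dst_node, dst_cpu) = move
    thread = by_id[src_node].cpus[src_cpu].pop()
    by_id[dst_node].cpus[dst_cpu].append(thread)
    return thread
```

Because every node holds the same table after a full broadcast round, each node can reach the same decision locally without a central scheduler; the sketch reads one table for brevity. In a real cluster the broadcast would travel over the interconnect and the move would involve migrating execution state, neither of which is modeled here.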
20 Claims
1. A method of operating a high performance computing cluster that includes multiple nodes each including multiple processors, comprising:
- periodically broadcasting information, related to processor utilization and network utilization at each of the multiple nodes, from each of the multiple nodes to remaining ones of the multiple nodes;
- updating respective local job tables maintained in each of the multiple nodes based on the broadcast information; and
- moving, based on the broadcast information in the respective local job tables, one or more threads from one or more of the multiple processors to a different one of the multiple processors.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A high performance computing cluster, comprising:
- multiple nodes each including multiple processors, wherein each of the multiple nodes is configured to periodically broadcast information, related to processor utilization and network utilization at each of the multiple nodes, from each of the multiple nodes to remaining ones of the multiple nodes; and
- monitoring hardware included in each of the multiple nodes, wherein the monitoring hardware is configured to update respective local job tables maintained in each of the multiple nodes based on the broadcast information, wherein the high performance computing cluster is configured to move, based on the broadcast information in the respective local job tables, one or more threads from one or more of the multiple processors to a different one of the multiple processors.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A method of operating a high performance computing cluster that includes multiple nodes each including multiple processors, comprising:
- periodically broadcasting information, related to processor utilization and network utilization at each of the multiple nodes, from each of the multiple nodes to remaining ones of the multiple nodes;
- updating respective local job tables maintained in each of the multiple nodes based on the broadcast information; and
- moving, based on the broadcast information in the respective local job tables, one or more threads from one or more of the multiple processors to a different one of the multiple processors, wherein the information is broadcast using a message passing interface, and wherein at least some of the multiple processors are included within different multiple chip-level multiprocessors that are coupled together via host channel adapters and switch fabrics each including one or more switches.
Specification