Holistic global performance and power management
Abstract
Methods and apparatus to provide holistic global performance and power management are described. In an embodiment, logic (e.g., coupled to each compute node of a plurality of compute nodes) causes determination of a policy for power and performance management across the plurality of compute nodes. The policy is coordinated across the plurality of compute nodes to manage a job to one or more objective functions, where the job includes a plurality of tasks that are to run concurrently on the plurality of compute nodes. Other embodiments are also disclosed and claimed.
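To make the abstract's wording concrete, the following is a minimal sketch of what determining a policy against one or more objective functions and transmitting it to the nodes could look like. Every name in it (`Policy`, `determine_policy`, `transmit`, `job_runtime`, `WORK`) is hypothetical and does not come from the patent. The runtime model, in which a task takes work divided by power cap and the job finishes with its slowest task, is an assumption made only for illustration.

```python
from typing import Callable, Dict, List

Policy = Dict[str, float]              # hypothetical: node id -> power cap in watts
Objective = Callable[[Policy], float]  # lower score is better in this sketch


def determine_policy(candidates: List[Policy], objectives: List[Objective]) -> Policy:
    """Return the candidate policy with the lowest summed objective score."""
    return min(candidates, key=lambda p: sum(obj(p) for obj in objectives))


def transmit(policy: Policy, send: Callable[[str, float], None]) -> None:
    """Hand each node its part of the policy through a caller-supplied channel."""
    for node_id, cap_watts in policy.items():
        send(node_id, cap_watts)


# Toy objective (assumption): a task's runtime is work / cap, and the job
# finishes only when its slowest concurrent task does.
WORK = {"n0": 800.0, "n1": 1200.0}


def job_runtime(policy: Policy) -> float:
    return max(WORK[node] / cap for node, cap in policy.items())


candidates = [
    {"n0": 100.0, "n1": 100.0},  # same cap on every node
    {"n0": 80.0, "n1": 120.0},   # same total power, shifted toward the busier node
]
chosen = determine_policy(candidates, [job_runtime])

# Stand-in for a real transport: record what each node would receive.
received: Dict[str, float] = {}
transmit(chosen, received.__setitem__)
```

Under this toy model the second candidate wins, because moving power to the node with more work shortens the slowest task while the total power stays the same.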
23 Claims
1. An apparatus comprising:
- logic, at least a portion of which is in hardware, wherein the logic is coupled to each node of a plurality of nodes to cause determination of a policy for power and performance management to transmit to the plurality of nodes, wherein the policy is to cause coordination of power and performance management across the plurality of nodes, wherein the policy is to manage a job to one or more objective functions, wherein the job is to comprise a plurality of tasks that are to run concurrently on the plurality of nodes, wherein the one or more objective functions are to comprise reduction of performance differences between the plurality of nodes while meeting a power cap, wherein the logic is to cause application of non-uniform power caps for each of plurality of nodes, wherein the logic is to operate in accordance with hierarchical machine learning operations, wherein the plurality of nodes is to form a cabinet, wherein the policy is decomposed hierarchically among one or more cabinets and then among the plurality of nodes in the job and across software and hardware abstraction layers.
Dependent claims: 2-13.
14. A method comprising:
- causing, at logic, determination of a policy for power and performance management for each node of a plurality of nodes; and
- transmitting the policy to the plurality of nodes, wherein the policy causes coordination of power and performance management across the plurality of nodes, wherein the policy manages a job to one or more objective functions, wherein the job comprises a plurality of tasks that are to run concurrently on the plurality of nodes, wherein the one or more objective functions comprises reduction of performance differences between the plurality of nodes while meeting a power cap, wherein non-uniform power caps are applied for each of plurality of nodes, wherein the logic operates in accordance with hierarchical machine learning operations, wherein the plurality of nodes form a cabinet, wherein the policy is decomposed hierarchically among one or more cabinets and then among the plurality of nodes in the job and across software and hardware abstraction layers.
Dependent claims: 15-20.
21. A non-transitory computer-readable medium comprising one or more instructions that when executed on a processor configure the processor to perform one or more operations to:
- cause, at logic, determination of a policy for power and performance management for each node of a plurality of nodes; and
- transmit the policy to the plurality of nodes, wherein the policy causes coordination of power and performance management across the plurality of nodes, wherein the policy manages a job to one or more objective functions, wherein the job comprises a plurality of tasks that are to run concurrently on the plurality of nodes, wherein the one or more objective functions comprises reduction of performance differences between the plurality of nodes while meeting a power cap, wherein non-uniform power caps are applied for each of plurality of nodes, wherein the logic is to operate in accordance with hierarchical machine learning operations, wherein the plurality of nodes is to form a cabinet, wherein the policy is decomposed hierarchically among one or more cabinets and then among the plurality of nodes in the job and across software and hardware abstraction layers.
Dependent claims: 22-23.
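The independent claims share two ideas that a short example may help to visualize: non-uniform per-node power caps that reduce performance differences under an overall cap, and a policy decomposed first among cabinets and then among the nodes in each cabinet. The sketch below is one possible reading under stated assumptions, not the patented method. It assumes a linear model in which a node's performance equals its efficiency times its power cap, and the names `split`, `decompose`, and the efficiency table are hypothetical. The hierarchical machine learning operations recited in the claims are not modeled here.

```python
from typing import Dict

# Hypothetical input: cabinet -> node -> efficiency (performance units per watt).
Efficiencies = Dict[str, Dict[str, float]]


def split(budget_watts: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Divide a power budget in proportion to the given weights."""
    total = sum(weights.values())
    return {name: budget_watts * w / total for name, w in weights.items()}


def decompose(system_cap_watts: float, eff: Efficiencies) -> Dict[str, Dict[str, float]]:
    """Split the system cap among cabinets, then among the nodes of each cabinet."""
    # Under the assumed linear model a node needs 1/efficiency watts per unit
    # of performance, so that value serves as its weight.
    node_weights = {
        cab: {node: 1.0 / e for node, e in nodes.items()}
        for cab, nodes in eff.items()
    }
    # A cabinet's weight is the sum of the weights of its nodes.
    cabinet_weights = {cab: sum(w.values()) for cab, w in node_weights.items()}
    cabinet_caps = split(system_cap_watts, cabinet_weights)
    return {cab: split(cabinet_caps[cab], node_weights[cab]) for cab in eff}


eff = {"cab0": {"n0": 1.0, "n1": 0.5}, "cab1": {"n2": 0.25}}
caps = decompose(700.0, eff)

# Predicted performance per node under the assumed linear model.
perf = {
    node: eff[cab][node] * cap
    for cab, nodes in caps.items()
    for node, cap in nodes.items()
}
```

With these toy numbers the less efficient nodes receive larger caps, the caps sum to the 700 W system cap, and the predicted performance is the same on every node. A real system would have to measure or learn the relationship between power and performance rather than assume it.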
Specification