Topology-aware fabric-based offloading of collective functions
Abstract
A computing method includes accepting a notification of a computing task for execution by a group of compute nodes interconnected by a communication network, which has a given interconnection topology and includes network switching elements. A set of preferred paths, which connect the compute nodes in the group via at least a subset of the network switching elements to one or more root switching elements, are identified in the communication network based on the given interconnection topology and on a criterion derived from the computing task. The network switching elements in the subset are configured to forward node-level results of the computing task produced by the compute nodes in the group to the root switching elements over the preferred paths, so as to cause the root switching elements to calculate and output an end result of the computing task based on the node-level results.
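The flow in the abstract can be illustrated with a minimal sketch. It assumes a tree-shaped topology in which every compute node and switch has exactly one parent, a single root switch, and summation as the reduction. The topology table `PARENT` and the functions `select_subset` and `reduce_in_network` are hypothetical names chosen for illustration and do not come from the patent.

```python
from collections import defaultdict

# Hypothetical topology, child -> parent. "n*" are compute nodes,
# "s*" are switches, and "root" is the single root switch (no parent).
PARENT = {
    "n0": "s0", "n1": "s0", "n2": "s1", "n3": "s1",
    "n4": "s2", "n5": "s2",
    "s0": "root", "s1": "root", "s2": "root",
}


def select_subset(group, parent=PARENT):
    """Return the switches on the upward paths from the group's nodes to the root."""
    subset = set()
    for node in group:
        hop = parent[node]
        while hop is not None:
            subset.add(hop)
            hop = parent.get(hop)  # None once the root is reached
    return subset


def depth(element, parent=PARENT):
    """Number of hops from an element up to the root."""
    hops = 0
    while element in parent:
        element = parent[element]
        hops += 1
    return hops


def reduce_in_network(node_results, parent=PARENT):
    """Sum node-level results inside the fabric and return the root's end result."""
    subset = select_subset(node_results, parent)
    partial = defaultdict(int)
    # Each compute node sends its node-level result to its first-hop switch.
    for node, value in node_results.items():
        partial[parent[node]] += value
    # Switches forward their partial sums upward, deepest switches first.
    for switch in sorted(subset, key=lambda s: depth(s, parent), reverse=True):
        upstream = parent.get(switch)
        if upstream is not None:
            partial[upstream] += partial[switch]
    root = next(s for s in subset if s not in parent)
    return partial[root]


group_results = {"n0": 1, "n1": 2, "n4": 3}
print(sorted(select_subset(group_results)))
print(reduce_in_network(group_results))
```

In this sketch the switch `s1` is left out of the subset because none of its compute nodes belongs to the group, which mirrors the idea of configuring only the switches that the designated group needs.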
21 Claims
1. A computing method, comprising:
providing a computing system that includes a plurality of compute nodes interconnected by a communication network, formed of network switching elements;
wherein the switching elements are separate from the compute nodes;
accepting, by a processor serving as an offload manager, a notification of a computing task for execution by the computing system, the notification specifying a designated partial group of the compute nodes assigned by a job scheduler, separate from the offload manager, to execute the computing task;
identifying, by the processor, based on the given interconnection topology and on a criterion derived from the computing task, a subset of the network switching elements to be configured to connect the compute nodes in the designated partial group to one or more root switching elements; and
configuring the network switching elements in the subset, by the processor, to forward node-level results of the computing task produced by the compute nodes in the designated partial group to the one or more root switching elements through the subset, so as to cause the one or more root switching elements to calculate and output an end result of the computing task based on the node-level results.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 18, 19
11. A computing apparatus, comprising:
an interface coupled to communicate with a communication network, which includes network switching elements that interconnect a plurality of compute nodes according to a given interconnection topology;
wherein the switching elements are separate from the compute nodes; and
a processor, serving as an offload manager, which is configured to accept a notification of a computing task for execution, the notification specifying a designated partial group of the compute nodes assigned by a job scheduler, separate from the offload manager, to execute the computing task,
to identify, based on the given interconnection topology and on a criterion derived from the computing task, a subset of the network switching elements to be configured to connect the compute nodes in the designated partial group to one or more root switching elements, and
to configure the network switching elements in the subset to forward node-level results of the computing task produced by the compute nodes in the designated partial group to the one or more root switching elements through the subset, so as to cause the one or more root switching elements to calculate and output an end result of the computing task based on the node-level results.
Dependent claims: 12, 13, 14, 15, 16, 17
20. A computing method, comprising:
providing a computing system that includes a plurality of compute nodes interconnected by a communication network, which has a given interconnection topology and includes network switching elements;
wherein the switching elements are separate from the compute nodes;
accepting, by a processor serving as an offload manager, from the network switching elements, information on the interconnection topology of the computing system;
accepting, by the processor, a notification of a reduction function for execution by a designated partial group of the compute nodes assigned by a job scheduler, separate from the offload manager, to execute the computing task;
identifying, by the processor, based on the given interconnection topology and on a criterion derived from the computing task, a subset of the network switching elements to be configured to connect the compute nodes in the designated partial group to one or more root switching elements; and
configuring the network switching elements in the subset, by the processor, to forward node-level results of the computing task produced by the compute nodes in the designated partial group to the one or more root switching elements through the subset, so as to cause the root switching elements to calculate and output an end result of the computing task based on the node-level results.
Dependent claims: 21
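The steps recited in claim 20 (collecting topology information from the switches, accepting a notification for a node group, and deriving per-switch forwarding entries) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the class `OffloadManager`, its methods `accept_topology` and `accept_notification`, and the fewest-hops criterion are all hypothetical, and compute nodes are assumed to be leaves with a single link.

```python
from collections import deque


class OffloadManager:
    """Hypothetical offload manager that learns the fabric from switch reports."""

    def __init__(self):
        self.links = {}  # element name -> set of directly connected elements

    def accept_topology(self, switch, neighbours):
        """Record one switch's report of its directly attached elements."""
        self.links.setdefault(switch, set()).update(neighbours)
        for other in neighbours:
            self.links.setdefault(other, set()).add(switch)

    def accept_notification(self, group, root):
        """Return {switch: children to aggregate} for the job's node group.

        Assumed criterion: each element forwards along a fewest-hops path
        to the root, found with a breadth-first search from the root.
        """
        next_hop = {root: None}
        queue = deque([root])
        while queue:
            current = queue.popleft()
            for neighbour in sorted(self.links.get(current, ())):
                if neighbour not in next_hop:
                    next_hop[neighbour] = current
                    queue.append(neighbour)

        config = {}
        for node in group:
            child, hop = node, next_hop[node]
            while hop is not None:
                # The switch `hop` must aggregate what `child` sends upward.
                config.setdefault(hop, set()).add(child)
                child, hop = hop, next_hop[hop]
        return config


manager = OffloadManager()
manager.accept_topology("root", ["s0", "s1"])
manager.accept_topology("s0", ["n0", "n1"])
manager.accept_topology("s1", ["n2", "n3"])
config = manager.accept_notification(["n0", "n2", "n3"], root="root")
for switch in sorted(config):
    print(switch, sorted(config[switch]))
```

The returned mapping only contains switches that lie on a path from the designated group to the root, so a real manager would have entries to push to those switches alone. How the job scheduler delivers the notification is outside the scope of this sketch.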
Specification