STREAM-BASED ACCELERATOR PROCESSING OF COMPUTATIONAL GRAPHS
First Claim
1. A method comprising:
receiving, by a computational graph system, a request to process a computational graph;
obtaining data representing a subgraph of the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, the subgraph assigned to a first device by a placer in the computational graph system;
determining that the first device comprises a hardware accelerator having a plurality of streams;
in response to determining that the first device comprises a hardware accelerator having a plurality of streams, generating instructions that when executed by the first device cause the first device to:
assign the operation represented by each node in the subgraph to a respective stream in the plurality of streams of the hardware accelerator; and
perform the operations represented by the nodes in the subgraph in accordance with the assignment; and
providing the instructions and the data to the first device.
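The assignment step recited above can be illustrated with a short sketch. The claim does not prescribe any particular policy for choosing a stream, so the policy here is an assumption made only for illustration: a node reuses the stream of one predecessor that has not yet handed its stream to another successor, and otherwise takes the next stream in round-robin order. The names topological_order and assign_streams, and the four-node example graph, are hypothetical.

```python
from collections import defaultdict, deque


def topological_order(nodes, edges):
    """Return the nodes so that every edge (u, v) has u before v."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for s in successors[n]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order


def assign_streams(nodes, edges, num_streams):
    """Map each node to a stream index (illustrative policy, not the patent's)."""
    if num_streams < 1:
        raise ValueError("num_streams must be at least 1")
    predecessors = defaultdict(list)
    for u, v in edges:
        predecessors[v].append(u)
    assignment = {}
    inherited = set()  # nodes whose stream was already passed to a successor
    next_stream = 0
    for n in topological_order(nodes, edges):
        parent = next((p for p in predecessors[n] if p not in inherited), None)
        if parent is not None:
            # Continue the chain on the predecessor's stream.
            assignment[n] = assignment[parent]
            inherited.add(parent)
        else:
            # Start a new chain on the next stream, wrapping around.
            assignment[n] = next_stream % num_streams
            next_stream += 1
    return assignment


# Hypothetical diamond-shaped subgraph: a feeds b and c, which both feed d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
assignment = assign_streams(nodes, edges, num_streams=2)
print(assignment)
```

With two streams, the chain a, b, d stays on one stream while the independent branch c is placed on the other, so b and c are free to run concurrently.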
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving, by a computational graph system, a request to process a computational graph; obtaining data representing a subgraph of the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node, the subgraph assigned to a first device by a placer in the computational graph system; determining that the first device comprises a hardware accelerator having a plurality of streams; in response to determining, generating instructions that when executed by the first device cause the first device to: assign the operation represented by each node in the subgraph to a respective stream; and perform the operations represented by the nodes in the subgraph in accordance with the assignment.
40 Claims
11. A system comprising:
one or more computers; and
computer-readable medium coupled to the one or more computers and having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to, for each of the neural network layers, perform operations comprising:
receiving, by a computational graph system, a request to process a computational graph;
obtaining data representing a subgraph of the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, the subgraph assigned to a first device by a placer in the computational graph system;
determining that the first device comprises a hardware accelerator having a plurality of streams;
in response to determining that the first device comprises a hardware accelerator having a plurality of streams, generating instructions that when executed by the first device cause the first device to:
assign the operation represented by each node in the subgraph to a respective stream in the plurality of streams of the hardware accelerator; and
perform the operations represented by the nodes in the subgraph in accordance with the assignment; and
providing the instructions and the data to the first device.
17. A computer program product encoded on one or more non-transitory computer storage media, the computer program product comprising instructions that when executed by a hardware accelerator having a plurality of streams cause the hardware accelerator to perform operations comprising:
receiving, by a computational graph system, a request to process a computational graph;
obtaining data representing a subgraph of the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, the subgraph assigned to a first device by a placer in the computational graph system;
determining that the first device comprises a hardware accelerator having a plurality of streams;
in response to determining that the first device comprises a hardware accelerator having a plurality of streams, generating instructions that when executed by the first device cause the first device to:
assign the operation represented by each node in the subgraph to a respective stream in the plurality of streams of the hardware accelerator; and
perform the operations represented by the nodes in the subgraph in accordance with the assignment; and
providing the instructions and the data to the first device.
23. A method comprising:
receiving, by a hardware accelerator having a plurality of streams, data representing a subgraph of a computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, the subgraph assigned to a hardware accelerator by a placer in a computational graph system;
assigning, by the hardware accelerator, the operation represented by each node in the subgraph to a respective stream in the plurality of streams of the hardware accelerator; and
performing, by the hardware accelerator, the operations represented by the nodes in the subgraph in accordance with the assignment.
29. A computer program product encoded on one or more non-transitory computer storage media, the computer program product comprising instructions that, when executed by a computer comprising a hardware accelerator having a plurality of streams, cause the computer to perform operations comprising:
receiving data representing a subgraph of a computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, the subgraph assigned to a hardware accelerator by a placer in a computational graph system;
assigning the operation represented by each node in the subgraph to a respective stream in the plurality of streams of the hardware accelerator; and
performing the operations represented by the nodes in the subgraph in accordance with the assignment.
35. A system comprising a hardware accelerator, wherein the hardware accelerator comprises a plurality of streams, and wherein the system is configured to perform operations comprising:
receiving data representing a subgraph of a computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, the subgraph assigned to a hardware accelerator by a placer in a computational graph system;
assigning the operation represented by each node in the subgraph to a respective stream in the plurality of streams of the hardware accelerator; and
performing the operations represented by the nodes in the subgraph in accordance with the assignment.
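The "performing ... in accordance with the assignment" step of the accelerator-side claims can be pictured with a small host-side simulation. This is only a sketch under stated assumptions: each stream is modeled as a first-in, first-out queue, an operation at the head of its stream runs once all of its inputs have been produced (possibly on other streams), and no real accelerator API is used. The function run_on_streams, the example operations, and the fixed assignment are hypothetical.

```python
from collections import defaultdict, deque


def run_on_streams(ops, edges, assignment, issue_order):
    """Simulate in-order streams; return (results, execution trace)."""
    predecessors = defaultdict(list)
    for u, v in edges:
        predecessors[v].append(u)

    # Enqueue every operation on its assigned stream, in issue order.
    queues = defaultdict(deque)
    for name in issue_order:
        queues[assignment[name]].append(name)

    results = {}
    trace = []  # (stream, operation) pairs in the order they ran
    while any(queues.values()):
        progressed = False
        for stream in sorted(queues):
            q = queues[stream]
            # Only the head of a stream may run, and only when its inputs exist.
            if q and all(p in results for p in predecessors[q[0]]):
                name = q.popleft()
                inputs = [results[p] for p in predecessors[name]]
                results[name] = ops[name](inputs)
                trace.append((stream, name))
                progressed = True
        if not progressed:
            raise RuntimeError("no stream can make progress; check issue order")
    return results, trace


# Hypothetical subgraph: a feeds b and c, which both feed d.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
ops = {
    "a": lambda inputs: 2,
    "b": lambda inputs: inputs[0] + 1,
    "c": lambda inputs: inputs[0] * 10,
    "d": lambda inputs: sum(inputs),
}
assignment = {"a": 0, "b": 0, "c": 1, "d": 0}  # assumed stream assignment
results, trace = run_on_streams(ops, edges, assignment, ["a", "b", "c", "d"])
print(results["d"], trace)
```

In this simulation, operations on the same stream run in the order they were enqueued, while c on the second stream only waits for a. On real hardware the cross-stream wait would typically be expressed with a synchronization primitive rather than a polling loop.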
Specification