Method of implementing a neural network on a digital computer
First Claim
1. A method for implementing a neural network on a digital computer, wherein the neural network comprises a plurality of nodes organized into at least two ordered layers, with each node not on the lowest-ordered layer performing a calculation modulating an output by a weighting value from each of a subset of nodes from the immediately lower-ordered layer to which said node is connected, the result of said calculation being referred to as an output value;
and the digital computer comprises several substantially identical parallel processing elements, each coupled to a global memory, whereby the global memory is located on a memory board separate from a processing board containing the processing elements; said method comprising the steps of:
A. broadcasting from the global memory into a first local memory block of each of the processing elements first layer output values from a first layer of the neural network;
B. causing said processing elements to calculate second layer output values for a set of nodes from a second layer immediately higher-ordered than said first layer, based on values of weights (stored by said processing elements) associated with connectivities between said nodes on said second layer and nodes on said first layer;
C. broadcasting, from each processing element, said second layer output values to the global memory;
D. each processing element substantially simultaneously monitoring the second layer output values of broadcasting Step C and storing said values in a local memory block;
wherein:
E. for every layer j except for the first layer, output values from the (j-1)st layer are broadcast from the global memory to a first local memory block of each of the processing elements;
F. the processing elements calculate output values for the jth layer based on stored weights associated with connectivities between nodes of the jth layer and nodes of the (j-1)st layer and further based on the output values stored in the first local memory block;
G. substantially simultaneously with calculating Step F, additional output values from the (j-1)st layer are broadcast from the global memory into a second local memory block of each of the processing elements;
H. the processing elements then calculate additional output values for the jth layer, based on stored weights associated with connectivities between nodes of the jth layer and nodes of the (j-1)st layer and further based on the output values stored in the second local memory block;
I. each processing element broadcasts, to the global memory and to all the other processing elements, output values of nodes for the jth layer for which said processing element has performed calculations; and
J. each processing element substantially simultaneously monitors the output values of Step I and stores the output values in a local memory block.
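The layered broadcast-and-compute loop of Steps A through J (without the second-buffer overlap of Steps G and H) can be sketched as a single-process simulation. Everything here is illustrative, not from the patent: the `ProcessingElement` class, the round-robin partition of nodes across processing elements, and the layer sizes are assumptions, and the node calculation is taken as a plain weighted sum, since the claim specifies only that outputs are modulated by weighting values.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [8, 6, 4]          # nodes per ordered layer (illustrative)
num_pes = 2                      # substantially identical processing elements

class ProcessingElement:
    """Holds only the weight rows for the nodes this element owns."""
    def __init__(self, node_slices, weights):
        self.node_slices = node_slices   # which jth-layer nodes this PE owns
        self.weights = weights           # stored weights for those nodes
        self.local_block = None          # first local memory block

    def receive_broadcast(self, prev_outputs):   # Steps A / E
        self.local_block = prev_outputs.copy()

    def compute(self, layer):                    # Steps B / F
        # Modulate the lower layer's outputs by this PE's stored weights.
        return self.weights[layer] @ self.local_block

# Full weight matrices, then a round-robin partition: each PE keeps only
# the rows for its own nodes, matching "weights stored by said processing
# elements" in the claim.
weights_full = [rng.standard_normal((layer_sizes[j], layer_sizes[j - 1]))
                for j in range(1, len(layer_sizes))]
pes = []
for p in range(num_pes):
    slices = [np.arange(p, layer_sizes[j + 1], num_pes)
              for j in range(len(weights_full))]
    pes.append(ProcessingElement(
        slices, [weights_full[j][s] for j, s in enumerate(slices)]))

first_layer = rng.standard_normal(layer_sizes[0])  # first-layer output values
global_memory = first_layer.copy()

for j in range(len(weights_full)):
    for pe in pes:                                 # broadcast (Steps A / E)
        pe.receive_broadcast(global_memory)
    next_outputs = np.empty(layer_sizes[j + 1])
    for pe in pes:                                 # compute + broadcast back
        next_outputs[pe.node_slices[j]] = pe.compute(j)   # (Steps B-C / F-I)
    global_memory = next_outputs                   # all PEs now hold it (D / J)

print(global_memory.shape)   # → (4,): outputs of the highest-ordered layer
```

Because each processing element stores only its own nodes' weight rows, the only data crossing the (simulated) bus per layer is the previous layer's output vector, which is the point of the broadcast scheme in the claim.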
Abstract
A digital computer architecture specifically tailored for implementing a neural network. Several simultaneously operable processors (10) each have their own local memory (17) for storing weight and connectivity information corresponding to nodes of the neural network whose output values will be calculated by said processor (10). A global memory (55,56) is coupled to each of the processors (10) via a common data bus (30). Output values corresponding to a first layer of the neural network are broadcast from the global memory (55,56) into each of the processors (10). The processors (10) calculate output values for a set of nodes of the next higher-ordered layer of the neural network. Said newly-calculated output values are broadcast from each processor (10) to the global memory (55,56) and to all the other processors (10), which use the output values as a head start in calculating a new set of output values corresponding to the next layer of the neural network.
Claims
2. A method for implementing a neural network on a digital computer, wherein the neural network comprises a plurality of nodes organized into at least two ordered layers, with each node not on the lowest-ordered layer performing a calculation modulating an output by a weighting value from each of a subset of nodes from the immediately lower-ordered layer to which said node is connected, the result of said calculation being referred to as an output value;
and the digital computer comprises several substantially identical parallel processing elements, each coupled to a global memory, whereby the global memory is located on a memory board separate from a processing board containing the processing elements; said method comprising the steps of:
A. broadcasting from the global memory into a first local memory block of each of the processing elements first layer output values from a first layer of the neural network;
B. causing said processing elements to calculate second layer output values for a set of nodes from a second layer immediately higher-ordered than said first layer, based on values of weights (stored by said processing elements) associated with connectivities between said nodes on said second layer and nodes on said first layer;
C. broadcasting, from each processing element, said second layer output values to the global memory;
D. each processing element substantially simultaneously monitoring the second layer output values of broadcasting Step C and storing said values in a local memory block;
wherein the output values that are broadcast from the global memory to the processing elements are sufficiently voluminous that said output values cannot be processed by the processing elements in one step;
in a first substep of the first broadcasting step, a first portion of said output values is fed into a first working memory within each processing element; and
in a second substep of the first broadcasting step, additional portions of said output values are fed into a second working memory within each processing element, while the processing element simultaneously performs calculations on said first portion of said output values.
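The two-substep broadcast of claim 2 is a double-buffering (ping-pong) scheme: while the processing element computes on the portion already in one working memory, the next portion streams into the other. A minimal sketch follows, assuming a single node's weight row for brevity; the chunk size, buffer layout, and the `threading.Thread` standing in for the bus transfer are all illustrative, not from the patent.

```python
import threading

import numpy as np

rng = np.random.default_rng(1)
prev_outputs = rng.standard_normal(1000)  # "voluminous" (j-1)st-layer outputs
weights = rng.standard_normal(1000)       # one node's weight row (illustrative)
chunk = 250                               # arbitrary portion size

buffers = [None, None]                    # first and second working memories

def broadcast(portion, buf_idx):
    """Stands in for the global-memory bus feeding a working memory."""
    buffers[buf_idx] = portion.copy()

acc = 0.0
broadcast(prev_outputs[:chunk], 0)        # first substep: fill first buffer
for start in range(0, len(prev_outputs), chunk):
    cur = (start // chunk) % 2            # buffer holding the current portion
    nxt = start + chunk
    t = None
    if nxt < len(prev_outputs):           # second substep: fill the other
        t = threading.Thread(target=broadcast,
                             args=(prev_outputs[nxt:nxt + chunk], 1 - cur))
        t.start()                         # overlaps with the computation below
    acc += float(weights[start:nxt] @ buffers[cur])  # compute on current buffer
    if t is not None:
        t.join()                          # next portion is in place

print(np.isclose(acc, float(weights @ prev_outputs)))  # → True
```

The compute step only ever reads the buffer the transfer thread is not writing, so no synchronization beyond the final `join` is needed; the chunked result matches the single-step weighted sum the buffering is meant to replace.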
Specification