ROTATING DATA FOR NEURAL NETWORK COMPUTATIONS
First Claim
1. A method for computing a layer output for a convolutional neural network layer from a layer input for the convolutional neural network layer using a two-dimensional systolic array, the convolutional neural network layer having a plurality of kernels, each kernel having a respective matrix structure of weights, the method comprising:
receiving a plurality of activation inputs, the plurality of activation inputs represented as a multi-dimensional matrix;
forming a plurality of vector inputs from the plurality of activation inputs, each vector input comprising values from a distinct region within the multi-dimensional matrix;
sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array;
generating a plurality of rotated kernel structures from each of the plurality of kernels, where generating a particular rotated kernel structure comprises shifting elements in the respective matrix structure for the kernel along one dimension;
sending each kernel structure and each rotated kernel structure to one or more cells along a second dimension of the systolic array;
causing the systolic array to generate an accumulated output based on the plurality of value inputs and the plurality of kernels; and
generating the layer output from the accumulated output.
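The kernel-rotation step recited above can be illustrated with a short sketch. It assumes that "shifting elements in the respective matrix structure for the kernel along one dimension" means a cyclic shift of each row; the claim does not state whether the shift wraps around, so this is one possible reading and not the patented implementation. The names `rotate_kernel` and `rotated_kernel_structures` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def rotate_kernel(kernel: Matrix, shift: int) -> Matrix:
    """Shift the elements of every row by `shift` positions, wrapping around.

    Assumption: the shift is cyclic and is applied along the column
    dimension. The claim only says elements are shifted along one dimension.
    """
    rotated = []
    for row in kernel:
        k = shift % len(row)
        # Move the last k elements to the front; k == 0 leaves the row as-is.
        rotated.append(row[-k:] + row[:-k] if k else list(row))
    return rotated


def rotated_kernel_structures(kernel: Matrix) -> List[Matrix]:
    """Return every non-trivial rotation of the kernel (shifts 1..width-1)."""
    width = len(kernel[0])
    return [rotate_kernel(kernel, s) for s in range(1, width)]


kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
rotations = rotated_kernel_structures(kernel)
for r in rotations:
    print(r)
```

For a kernel of width 3 this yields two rotated structures in addition to the original; shifting by the full width reproduces the original kernel.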
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing a layer output for a convolutional neural network layer, the method comprising: receiving a plurality of activation inputs; forming a plurality of vector inputs from the plurality of activation inputs, each vector input comprising values from a distinct region within the multi-dimensional matrix; sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array; generating a plurality of rotated kernel structures from each of the plurality of kernels; sending each kernel structure and each rotated kernel structure to one or more cells along a second dimension of the systolic array; causing the systolic array to generate an accumulated output based on the plurality of value inputs and the plurality of kernels; and generating the layer output from the accumulated output.
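As a rough illustration of the data flow the abstract describes, the sketch below forms one vector per region of an activation matrix and accumulates products against each kernel. It rests on several assumptions that the text does not confirm: the regions are taken to be non-overlapping square tiles, each kernel has the same shape as a tile, and the systolic array is modeled only by its arithmetic result (a matrix product), not by its cells or timing. The function names `form_vector_inputs` and `accumulate` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def form_vector_inputs(activations: Matrix, region: int) -> List[List[float]]:
    """Flatten each non-overlapping region x region tile into one vector.

    Assumption: the matrix height and width are multiples of `region`.
    """
    rows, cols = len(activations), len(activations[0])
    vectors = []
    for r in range(0, rows, region):
        for c in range(0, cols, region):
            vectors.append([activations[r + i][c + j]
                            for i in range(region)
                            for j in range(region)])
    return vectors


def accumulate(vectors: List[List[float]], kernels: List[Matrix]) -> Matrix:
    """Sum of products of every vector input with every flattened kernel.

    Row i, column j holds the accumulated output for vector i and kernel j,
    which is the value a multiply-accumulate array would produce.
    """
    flat_kernels = [[w for row in k for w in row] for k in kernels]
    return [[sum(a * w for a, w in zip(v, fk)) for fk in flat_kernels]
            for v in vectors]


activations = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
kernels = [[[1, 0], [0, 1]],   # original kernel
           [[0, 1], [1, 0]]]   # the same kernel with each row shifted by one
vectors = form_vector_inputs(activations, 2)
outputs = accumulate(vectors, kernels)
print(vectors)
print(outputs)
```

In this model the vector inputs play the role of data sent along the first dimension of the array and the kernel structures play the role of data sent along the second; the layer output would then be assembled from `outputs`.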
21 Claims
1. A method for computing a layer output for a convolutional neural network layer from a layer input for the convolutional neural network layer using a two-dimensional systolic array, the convolutional neural network layer having a plurality of kernels, each kernel having a respective matrix structure of weights, the method comprising:
receiving a plurality of activation inputs, the plurality of activation inputs represented as a multi-dimensional matrix;
forming a plurality of vector inputs from the plurality of activation inputs, each vector input comprising values from a distinct region within the multi-dimensional matrix;
sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array;
generating a plurality of rotated kernel structures from each of the plurality of kernels, where generating a particular rotated kernel structure comprises shifting elements in the respective matrix structure for the kernel along one dimension;
sending each kernel structure and each rotated kernel structure to one or more cells along a second dimension of the systolic array;
causing the systolic array to generate an accumulated output based on the plurality of value inputs and the plurality of kernels; and
generating the layer output from the accumulated output.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system for computing a layer output for a convolutional neural network layer from a layer input for the convolutional neural network layer using a two-dimensional systolic array, the convolutional neural network layer having a plurality of kernels, each kernel having a respective matrix structure of weights, the system comprising:
one or more computers; and
computer-readable medium coupled to the one or more computers and having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
receiving a plurality of activation inputs, the plurality of activation inputs represented as a multi-dimensional matrix;
forming a plurality of vector inputs from the plurality of activation inputs, each vector input comprising values from a distinct region within the multi-dimensional matrix;
sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array;
generating a plurality of rotated kernel structures from each of the plurality of kernels, where generating a particular rotated kernel structure comprises shifting elements in the respective matrix structure for the kernel along one dimension;
sending each kernel structure and each rotated kernel structure to one or more cells along a second dimension of the systolic array;
causing the systolic array to generate an accumulated output based on the plurality of value inputs and the plurality of kernels; and
generating the layer output from the accumulated output.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-readable medium having instructions stored thereon, which, when executed by one or more computers, cause the one or more computers to perform operations for computing a layer output for a convolutional neural network layer from a layer input for the convolutional neural network layer using a two-dimensional systolic array, the convolutional neural network layer having a plurality of kernels, each kernel having a respective matrix structure of weights, the operations comprising:
receiving a plurality of activation inputs, the plurality of activation inputs represented as a multi-dimensional matrix;
forming a plurality of vector inputs from the plurality of activation inputs, each vector input comprising values from a distinct region within the multi-dimensional matrix;
sending the plurality of vector inputs to one or more cells along a first dimension of the systolic array;
generating a plurality of rotated kernel structures from each of the plurality of kernels, where generating a particular rotated kernel structure comprises shifting elements in the respective matrix structure for the kernel along one dimension;
sending each kernel structure and each rotated kernel structure to one or more cells along a second dimension of the systolic array;
causing the systolic array to generate an accumulated output based on the plurality of value inputs and the plurality of kernels; and
generating the layer output from the accumulated output.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification