Implementation of ResNet in a CNN based digital integrated circuit
Abstract
Operations of a combination of first and second original convolutional layers followed by a short path are replaced by operations of a set of three particular convolutional layers. The first layer contains 2N×N filter kernels, formed by placing the N×N filter kernels of the first original convolutional layer on the left side and the N×N filter kernels of an identity-value convolutional layer on the right side. The second layer contains 2N×2N filter kernels, formed by placing the N×N filter kernels of the second original convolutional layer in the upper left corner, the N×N filter kernels of an identity-value convolutional layer in the lower right corner, and the N×N filter kernels of two zero-value convolutional layers in the two off-diagonal corners. The third layer contains N×2N filter kernels, formed by placing the N×N filter kernels of a first identity-value convolutional layer and the N×N filter kernels of a second identity-value convolutional layer in a vertical stack. Each filter kernel contains 3×3 filter coefficients.
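The replacement described in the abstract can be checked numerically. The following is a minimal NumPy sketch, not the circuit's implementation. It assumes the weight layout `(output maps, input maps, 3, 3)`, zero padding, no bias, and no activation between layers, and it takes an "identity-value" kernel to mean a 3×3 kernel with a 1 at the center on matching input/output maps and zeros elsewhere. The helper names `conv3x3`, `identity_kernels`, and `build_three_layers` are hypothetical, and mapping the abstract's "left/right" and "upper/lower" placement onto array axes is an interpretation.

```python
import numpy as np

def conv3x3(x, w):
    """Zero-padded 3x3 convolution (cross-correlation, as in most CNN frameworks).
    x: (c_in, H, W) feature maps; w: (c_out, c_in, 3, 3) filter kernels."""
    c_out = w.shape[0]
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # pad spatial dims with zeros
    out = np.zeros((c_out, h, wd))
    for dy in range(3):
        for dx in range(3):
            # Mix input maps with the (c_out, c_in) weights at this kernel offset.
            out += np.einsum("oi,ihw->ohw", w[:, :, dy, dx],
                             xp[:, dy:dy + h, dx:dx + wd])
    return out

def identity_kernels(n):
    """Assumed identity-value layer: output map i copies input map i."""
    w = np.zeros((n, n, 3, 3))
    idx = np.arange(n)
    w[idx, idx, 1, 1] = 1.0  # center tap only
    return w

def build_three_layers(w1, w2):
    """Build the three replacement layers from the two original N x N layers."""
    n = w1.shape[0]
    ident = identity_kernels(n)
    zeros = np.zeros((n, n, 3, 3))
    # Layer 1 (N maps in, 2N out): [conv1(x), x]
    layer1 = np.concatenate([w1, ident], axis=0)
    # Layer 2 (2N in, 2N out): block-diagonal [[w2, 0], [0, I]]
    top = np.concatenate([w2, zeros], axis=1)
    bottom = np.concatenate([zeros, ident], axis=1)
    layer2 = np.concatenate([top, bottom], axis=0)
    # Layer 3 (2N in, N out): adds the two halves element-wise
    layer3 = np.concatenate([ident, ident], axis=1)
    return layer1, layer2, layer3

rng = np.random.default_rng(0)
n, h, wd = 4, 6, 5
x = rng.standard_normal((n, h, wd))
w1 = rng.standard_normal((n, n, 3, 3))
w2 = rng.standard_normal((n, n, 3, 3))

# Original form: two convolutions followed by the short path (element-wise add).
reference = conv3x3(conv3x3(x, w1), w2) + x

# Replacement form: three plain convolutions, no separate add step.
layer1, layer2, layer3 = build_three_layers(w1, w2)
fused = conv3x3(conv3x3(conv3x3(x, layer1), layer2), layer3)

print(np.allclose(fused, reference))
```

The sketch leaves out nonlinearities on purpose. If an activation such as ReLU were applied after each layer, the copied input maps would pass through unchanged only when they are non-negative, which is a condition this example does not check.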
16 Claims
1. A digital integrated circuit for feature extraction comprising:
- a plurality of cellular neural networks (CNN) processing engines operatively coupled to at least one input/output data bus, the plurality of CNN processing engines being connected in a loop with a clock-skew circuit, each CNN processing engine comprising:
a CNN processing block configured for simultaneously obtaining convolution operations results using input data and pre-trained filter coefficients of a plurality of convolutional layers including at least one set of three particular convolutional layers for performing equivalent operations of a combination of first and second original convolutional layers followed by a short path, the equivalent operations containing convolutional operations of the first and the second original convolutional layers followed by element-wise add operations with an input that contains N feature maps and an output also contains N feature maps, each of the first and the second original convolutional layers contains N×N of 3×3 filter kernels, where N is a positive integer;
a first set of memory buffers operatively coupling to the CNN processing block for storing the input data; and
a second set of memory buffers operative coupling to the CNN processing block for storing the pre-trained filter coefficients;
wherein first of the three particular convolutional layers contains 2N×N of 3×3 filter kernels formed by placing said N×N of 3×3 filter kernels of the first original convolutional layer in left side and N×N of 3×3 filter kernels of an identity-value convolutional layer in right side.
Dependent claims: 2, 11, 12, 13
3. A digital integrated circuit for feature extraction comprising:
- a plurality of cellular neural networks (CNN) processing engines operatively coupled to at least one input/output data bus, the plurality of CNN processing engines being connected in a loop with a clock-skew circuit, each CNN processing engine comprising:
a CNN processing block configured for simultaneously obtaining convolution operations results using input data and pre-trained filter coefficients of a plurality of convolutional layers including at least one set of three particular convolutional layers for performing equivalent operations of a combination of first and second original convolutional layers followed by a short path, the equivalent operations containing convolutional operations of the first and the second original convolutional layers followed by element-wise add operations with an input that contains N feature maps and an output also contains N feature maps, each of the first and the second original convolutional layers contains N×N of 3×3 filter kernels, where N is a positive integer;
a first set of memory buffers operatively coupling to the CNN processing block for storing the input data; and
a second set of memory buffers operative coupling to the CNN processing block for storing the pre-trained filter coefficients;
wherein second of the three particular convolutional layers contains 2N×2N of 3×3 filter kernels formed by placing said N×N of 3×3 filter kernels of the second original convolutional layer in upper left corner, N×N of 3×3 filter kernels of an identity-value convolutional layer in lower right corner, and N×N of 3×3 filter kernels of two zero-value convolutional layers in either off diagonal corner.
Dependent claims: 4, 5, 8, 9, 10, 14, 15, 16
6. A digital integrated circuit for feature extraction comprising:
- a plurality of cellular neural networks (CNN) processing engines operatively coupled to at least one input/output data bus, the plurality of CNN processing engines being connected in a loop with a clock-skew circuit, each CNN processing engine comprising:
a CNN processing block configured for simultaneously obtaining convolution operations results using input data and pre-trained filter coefficients of a plurality of convolutional layers including at least one set of three particular convolutional layers for performing equivalent operations of a combination of first and second original convolutional layers followed by a short path, the equivalent operations containing convolutional operations of the first and the second original convolutional layers followed by element-wise add operations with an input that contains N feature maps and an output also contains N feature maps, each of the first and the second original convolutional layers contains N×N of 3×3 filter kernels, where N is a positive integer;
a first set of memory buffers operatively coupling to the CNN processing block for storing the input data; and
a second set of memory buffers operative coupling to the CNN processing block for storing the pre-trained filter coefficients;
wherein third of the three particular convolutional layers contains N×2N of 3×3 filter kernels formed by placing N×N of 3×3 filter kernels of a first identity value convolutional layer and N×N of 3×3 filter kernels of a second identity value convolutional layer in a vertical stack.
Dependent claims: 7
Specification