Approximating fully-connected layers with multiple arrays of 3x3 convolutional filter kernels in a CNN based integrated circuit

  • US 10,366,328 B2
  • Filed: 03/14/2018
  • Issued: 07/30/2019
  • Est. Priority Date: 09/19/2017
  • Status: Active Grant
First Claim

1. A digital integrated circuit comprising:

  • a plurality of cellular neural networks (CNN) processing engines operatively coupled to at least one input/output data bus, the plurality of CNN processing engines being connected in a loop with a clock-skew circuit, each CNN processing engine comprising;

    a CNN processing block configured for simultaneously performing convolutional operations using input data and pre-trained filter coefficients of a plurality of ordered convolutional layers, and further configured for classifying the input data using a plurality of 3×3 filter kernels to approximate operations of fully-connected (FC) layers, wherein output of the plurality of ordered convolutional layers has P feature maps with F×F pixels of data per feature map and the plurality of 3×3 filter kernels comprises L layers with each of the L layers organized in an array of R×Q of 3×3 filter kernels, wherein Q and R are respective numbers of input and output feature maps of a particular layer of the L layers, wherein L is equal to (F−1)/2 when F is an odd number, and wherein P, F, Q and R are positive integers;

    a first set of memory buffers operatively coupling to the CNN processing block for storing the input data; and

    a second set of memory buffers operative coupling to the CNN processing block for storing the pre-trained filter coefficients.
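
The relation L = (F−1)/2 in the claim can be checked with a short sketch. It assumes that each 3×3 layer is applied without padding and with stride 1, so each layer trims the feature map side by 2 pixels. The claim text shown here does not state that assumption. The function names are hypothetical and are not taken from the patent.

```python
def fc_approx_layers(f):
    """Return L = (F - 1) / 2, the layer count the claim gives for odd F."""
    if f < 1 or f % 2 == 0:
        raise ValueError("this sketch only covers positive odd F")
    return (f - 1) // 2


def shrink_trace(f):
    """Feature map side length before and after each 3x3 layer.

    Assumption (not stated in the claim): unpadded, stride-1 convolution,
    so every 3x3 layer reduces the side length by 2.
    """
    sizes = [f]
    for _ in range(fc_approx_layers(f)):
        sizes.append(sizes[-1] - 2)
    return sizes


def kernels_in_layer(q, r):
    """A layer with Q input and R output feature maps holds R x Q kernels."""
    return r * q


if __name__ == "__main__":
    # With F = 7, three layers take a 7x7 map to 5x5, 3x3, then 1x1.
    print(fc_approx_layers(7))
    print(shrink_trace(7))
    print(kernels_in_layer(q=4, r=8))
```

Under that assumption, the final layer leaves one value per output feature map, which is the shape a fully-connected classifier output would have.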
