HARDWARE-EFFICIENT DEEP CONVOLUTIONAL NEURAL NETWORKS
First Claim
1. A convolutional neural network system, comprising:
- one or more processors;
a memory configured to store a sparse, frequency domain representation of a convolutional weighting kernel;
a time-domain-to-frequency-domain converter configured to, by the one or more processors, generate a frequency domain representation of an input image;
a feature extractor configured to, by the one or more processors;
access the memory, andextract a plurality of features based at least in part on the sparse, frequency domain representation of the convolutional weighting kernel and the frequency domain representation of the input image; and
a classifier configured to, by the one or more processors, determine, based on the plurality of extracted features, whether the input image contains an object of interest.
1 Assignment
0 Petitions
Accused Products
Abstract
Systems, methods, and computer media for implementing convolutional neural networks efficiently in hardware are disclosed herein. A memory is configured to store a sparse, frequency domain representation of a convolutional weighting kernel. A time-domain-to-frequency-domain converter is configured to generate a frequency domain representation of an input image. A feature extractor is configured to access the memory and, by a processor, extract features based on the sparse, frequency domain representation of the convolutional weighting kernel and the frequency domain representation of the input image. The feature extractor includes convolutional layers and fully connected layers. A classifier is configured to determine, based on extracted features, whether the input image contains an object of interest. Various types of memory can be used to store different information, allowing information-dense data to be stored in faster (e.g., faster access time) memory and sparse data to be stored in slower memory.
-
Citations
20 Claims
-
1. A convolutional neural network system, comprising:
-
one or more processors; a memory configured to store a sparse, frequency domain representation of a convolutional weighting kernel; a time-domain-to-frequency-domain converter configured to, by the one or more processors, generate a frequency domain representation of an input image; a feature extractor configured to, by the one or more processors; access the memory, and extract a plurality of features based at least in part on the sparse, frequency domain representation of the convolutional weighting kernel and the frequency domain representation of the input image; and a classifier configured to, by the one or more processors, determine, based on the plurality of extracted features, whether the input image contains an object of interest. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. A method, comprising:
-
receiving an input image; generating a frequency domain representation of the input image; in a convolutional neural network, extracting a plurality of features based at least in part on the frequency domain representation of the input image and a sparse, frequency domain representation of a convolutional weighting kernel, wherein the sparse, frequency domain representation of the convolutional weighting kernel comprises a dense matrix and one or more sparse matrices; classifying the input image based on the plurality of extracted features; and based on the classifying, identifying the input image as containing an object of interest. - View Dependent Claims (13, 14, 15, 16, 17, 19, 20)
-
-
18. One or more computer-readable storage media storing computer-executable instructions for recognizing images, the recognizing comprising:
-
receiving an input image; generating a frequency domain representation of the input image; determining a sparse, frequency domain representation of a convolutional weighting kernel, the sparse, frequency domain representation comprising one or more sparse matrices and a dense matrix; in a plurality of convolutional layers of a deep convolutional neural network, processing the input image based on the frequency domain representation of the input image, the one or more sparse matrices, and a frequency domain nonlinear function; in a plurality of fully connected layers of the deep convolutional neural network, processing the input image based on an output of the plurality of convolutional layers; determining a plurality of extracted features based on an output of the plurality of fully connected layers; classifying the input image based on the plurality of extracted features; and based on the classification, identifying the input image as containing an object of interest.
-
Specification