HARDWARE-EFFICIENT DEEP CONVOLUTIONAL NEURAL NETWORKS

US 20170132496A1
Filed: 11/05/2015
Published: 05/11/2017
Est. Priority Date: 11/05/2015
Status: Active Grant



Abstract

Systems, methods, and computer media for implementing convolutional neural networks efficiently in hardware are disclosed herein. A memory is configured to store a sparse, frequency domain representation of a convolutional weighting kernel. A time-domain-to-frequency-domain converter is configured to generate a frequency domain representation of an input image. A feature extractor is configured to access the memory and, by a processor, extract features based on the sparse, frequency domain representation of the convolutional weighting kernel and the frequency domain representation of the input image. The feature extractor includes convolutional layers and fully connected layers. A classifier is configured to determine, based on extracted features, whether the input image contains an object of interest. Various types of memory can be used to store different information, allowing information-dense data to be stored in faster (e.g., faster access time) memory and sparse data to be stored in slower memory.
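The abstract relies on the fact that convolution in the spatial domain corresponds to element-wise multiplication in the frequency domain. The following is a minimal NumPy sketch of that relationship, not the patented implementation. The function names and the top-k pruning rule in `sparsify` are assumptions made for illustration; the abstract does not say how the sparse kernel representation is obtained.

```python
import numpy as np

def circular_conv2d(image, kernel):
    """Direct circular convolution in the spatial domain (reference only)."""
    out = np.zeros(image.shape)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            # np.roll shifts the image so that out[y, x] picks up image[y-dy, x-dx]
            out += kernel[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out

def freq_conv2d(image_freq, kernel_freq):
    """Element-wise product in the frequency domain, then back to the spatial domain."""
    return np.real(np.fft.ifft2(image_freq * kernel_freq))

def sparsify(kernel_freq, keep):
    """Hypothetical pruning rule: keep the `keep` largest-magnitude coefficients.

    Ties in magnitude (common for conjugate pairs) may keep slightly more.
    """
    threshold = np.sort(np.abs(kernel_freq).ravel())[-keep]
    return np.where(np.abs(kernel_freq) >= threshold, kernel_freq, 0)

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))

image_freq = np.fft.fft2(image)                    # frequency domain input image
kernel_freq = np.fft.fft2(kernel, s=image.shape)   # zero-padded kernel spectrum

exact = freq_conv2d(image_freq, kernel_freq)
sparse_kernel_freq = sparsify(kernel_freq, keep=16)
approx = freq_conv2d(image_freq, sparse_kernel_freq)

print("matches spatial convolution:", np.allclose(exact, circular_conv2d(image, kernel)))
print("nonzero kernel coefficients kept:", np.count_nonzero(sparse_kernel_freq))
print("max error from pruning:", np.max(np.abs(exact - approx)))
```

Pruning trades accuracy for storage: the fewer coefficients kept, the smaller the stored kernel and the larger the deviation from the exact convolution.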


20 Claims

1. A convolutional neural network system, comprising:
- one or more processors;
  
  a memory configured to store a sparse, frequency domain representation of a convolutional weighting kernel;
  
  a time-domain-to-frequency-domain converter configured to, by the one or more processors, generate a frequency domain representation of an input image;
  
  a feature extractor configured to, by the one or more processors:
  
  access the memory, and extract a plurality of features based at least in part on the sparse, frequency domain representation of the convolutional weighting kernel and the frequency domain representation of the input image; and
  
  a classifier configured to, by the one or more processors, determine, based on the plurality of extracted features, whether the input image contains an object of interest.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The system of claim 1, wherein the feature extractor comprises a plurality of convolutional layers and a plurality of fully connected layers.
  - 3. The system of claim 2, wherein the memory is a first memory of a first memory type, and further comprising a second memory configured to store coefficients for the plurality of fully connected layers, wherein the second memory is of a second memory type, and wherein the first memory type has a slower access time or lower energy consumption than an access time or energy consumption of the second memory type.
  - 4. The system of claim 3, wherein the first memory type is DRAM, and wherein the second memory type is SRAM.
  - 5. The system of claim 3, further comprising a third memory configured to store input image coefficients, wherein the third memory is of a third memory type and has an access time or energy consumption between the access time or energy consumption of the first memory type and the access time or energy consumption of the second memory type.
  - 6. The system of claim 2, wherein the sparse, frequency domain representation of the convolutional weighting kernel comprises a dense matrix and one or more sparse matrices, and wherein a first convolutional layer of the plurality of convolutional layers is configured to:
    - multiply the frequency domain representation of the input image by the one or more sparse matrices and apply a nonlinear function to a result of the multiplication.
  - 7. The system of claim 6, wherein the nonlinear function is a frequency domain function.
  - 8. The system of claim 6, wherein a second convolutional layer of the plurality of convolutional layers is configured to:
    - multiply a frequency domain output of the first convolutional layer by the one or more sparse matrices and apply a nonlinear function to a result of the multiplication.
  - 9. The system of claim 2, wherein the sparse, frequency domain representation of the convolutional weighting kernel comprises a dense matrix and one or more sparse matrices, and wherein, prior to generation, by the feature extractor, of a feature vector of the plurality of extracted features, an output of a last convolutional layer is multiplied by the dense matrix.
  - 10. The system of claim 1, further comprising a camera configured to capture video, and wherein the input image is a video frame captured by the camera.
  - 11. The system of claim 10, wherein the system is part of a virtual reality or augmented reality system.
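
Claims 6 through 9 describe a data flow in which each convolutional layer multiplies a frequency domain signal by sparse matrices and applies a nonlinear function, and the output of the last convolutional layer is multiplied by the dense matrix. The sketch below illustrates that flow under several assumptions that are not in the claims: one sparse matrix per layer stored in coordinate (COO) form, random placeholder weights, and a placeholder nonlinearity (`tanh` applied to the real and imaginary parts). All function names are hypothetical.

```python
import numpy as np

def coo_matvec(rows, cols, vals, x, n_out):
    """Multiply a sparse matrix in coordinate (COO) form by a vector."""
    y = np.zeros(n_out, dtype=complex)
    np.add.at(y, rows, vals * x[cols])  # touch only the stored nonzero entries
    return y

def freq_nonlinearity(z):
    """Placeholder frequency domain nonlinearity (assumption, not from the claims)."""
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

def random_coo(n_out, n_in, nnz, rng):
    """Random sparse matrix as (rows, cols, vals); stands in for trained weights."""
    rows = rng.integers(0, n_out, size=nnz)
    cols = rng.integers(0, n_in, size=nnz)
    vals = rng.standard_normal(nnz) + 1j * rng.standard_normal(nnz)
    return rows, cols, vals

def extract_features(image, sparse_layers, dense):
    """Sparse multiply plus nonlinearity per layer, then one dense multiply."""
    x = np.fft.fft2(image).ravel()          # frequency domain input image
    for rows, cols, vals in sparse_layers:  # convolutional layers
        x = freq_nonlinearity(coo_matvec(rows, cols, vals, x, x.size))
    return dense @ x                        # last layer output times dense matrix

rng = np.random.default_rng(1)
image = rng.standard_normal((4, 4))
n = image.size
sparse_layers = [random_coo(n, n, nnz=24, rng=rng) for _ in range(2)]
dense = rng.standard_normal((5, n)) + 1j * rng.standard_normal((5, n))

features = extract_features(image, sparse_layers, dense)
print(features.shape)
```

Keeping the per-layer work to stored nonzeros is what makes the sparse representation cheaper than a dense multiply of the same dimensions; the single dense multiply is deferred to the end.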

12. A method, comprising:
- receiving an input image;
  
  generating a frequency domain representation of the input image;
  
  in a convolutional neural network, extracting a plurality of features based at least in part on the frequency domain representation of the input image and a sparse, frequency domain representation of a convolutional weighting kernel, wherein the sparse, frequency domain representation of the convolutional weighting kernel comprises a dense matrix and one or more sparse matrices;
  
  classifying the input image based on the plurality of extracted features; and
  
  based on the classifying, identifying the input image as containing an object of interest.
- Dependent claims (13, 14, 15, 16, 17, 19, 20)
  - 13. The method of claim 12, wherein extracting the plurality of features comprises:
    - performing convolutional processing in a convolutional portion of the convolutional neural network; and
      
      based on an output of the convolutional processing, performing fully connected processing in a fully connected portion of the convolutional neural network, wherein an output of the fully connected processing comprises the extracted features.
  - 14. The method of claim 12, wherein the convolutional neural network is a deep convolutional neural network comprising a plurality of convolutional layers and at least one fully connected layer.
  - 15. The method of claim 14, wherein extracting the plurality of features comprises, in a first convolutional layer of the plurality of convolutional layers, multiplying the frequency domain representation of the input image by the one or more sparse matrices and applying a nonlinear function to a result of the multiplying.
  - 16. The method of claim 14, wherein values for the convolutional weighting kernel are determined through training, wherein the one or more sparse matrices are stored in a first memory of a first memory type, wherein the dense matrix is stored in a second memory of a second memory type, and wherein the first memory type has a slower access time than the second memory type.
  - 17. The method of claim 16, wherein the first memory type has lower energy consumption than the second memory type.
  - 19. The one or more computer-readable storage media of claim 17, wherein prior to determining the plurality of extracted features, an output of a last convolutional layer is multiplied by the dense matrix.
  - 20. The one or more computer-readable storage media of claim 17, wherein the one or more sparse matrices are stored in a first memory of a first memory type, wherein the dense matrix is stored in a second memory of a second memory type, and wherein the first memory type has a slower access time than the second memory type.

18. One or more computer-readable storage media storing computer-executable instructions for recognizing images, the recognizing comprising:
- receiving an input image;
  
  generating a frequency domain representation of the input image;
  
  determining a sparse, frequency domain representation of a convolutional weighting kernel, the sparse, frequency domain representation comprising one or more sparse matrices and a dense matrix;
  
  in a plurality of convolutional layers of a deep convolutional neural network, processing the input image based on the frequency domain representation of the input image, the one or more sparse matrices, and a frequency domain nonlinear function;
  
  in a plurality of fully connected layers of the deep convolutional neural network, processing the input image based on an output of the plurality of convolutional layers;
  
  determining a plurality of extracted features based on an output of the plurality of fully connected layers;
  
  classifying the input image based on the plurality of extracted features; and
  
  based on the classification, identifying the input image as containing an object of interest.
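
Claims 4, 16, and 20 assign the sparse matrices to a slower memory type (DRAM in claim 4) and the dense matrix to a faster one (SRAM in claim 4). A rough sketch of such a placement decision follows. The density threshold, the `Placement` record, and the tier labels are hypothetical and chosen only to illustrate the idea; the claims do not define a selection rule.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class Placement:
    name: str
    density: float
    tier: str

def density(matrix):
    """Fraction of entries that are nonzero."""
    return np.count_nonzero(matrix) / matrix.size

def place(name, matrix, dense_threshold=0.5):
    """Assign a matrix to a memory tier by density (threshold is an assumption)."""
    d = density(matrix)
    tier = "fast (e.g., SRAM)" if d >= dense_threshold else "slow (e.g., DRAM)"
    return Placement(name, d, tier)

dense_matrix = np.ones((10, 10))   # every entry carries information
sparse_matrix = np.eye(10)         # only 10 percent of entries are nonzero

placements = [
    place("dense matrix", dense_matrix),
    place("sparse matrix", sparse_matrix),
]
for p in placements:
    print(f"{p.name}: density={p.density:.2f} -> {p.tier}")
```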

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors
Shoaib, Mohammed; Liu, Jie

Granted Patent

US 9,904,874 B2
CPC Class Codes

A01M 1/026   combined with devices for m...

G06F 18/241   relating to the classificat...

G06N 20/10   using kernel methods, e.g. ...

G06N 3/0464   Convolutional networks [CNN...

G06N 3/048   Activation functions

G06N 3/063   using electronic means
