Training convolutional neural networks on graphics processing units

US 7,747,070 B2
Filed: 08/31/2005
Issued: 06/29/2010
Est. Priority Date: 08/31/2005
Status: Active Grant

Abstract

A convolutional neural network is implemented on a graphics processing unit. The network is then trained through a series of forward and backward passes, with convolutional kernels and bias matrices modified on each backward pass according to a gradient of an error function. The implementation takes advantage of parallel processing capabilities of pixel shader units on a GPU, and utilizes a set of start-to-finish formulas to program the computations on the pixel shaders. Input and output to the program is done through textures, and a multi-pass summation process is used when sums are needed across pixel shader unit registers.
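The multi-pass summation mentioned in the abstract can be illustrated with a small CPU-side sketch. It assumes that one pass can only add a fixed number of inputs per output element (the hypothetical `fan_in` parameter), so a long sum is reduced over several passes. The function names and the choice of `fan_in=4` are illustrative assumptions, not details taken from the patent.

```python
def reduce_pass(values, fan_in):
    """One pass: each output element sums up to `fan_in` adjacent inputs."""
    return [sum(values[i:i + fan_in]) for i in range(0, len(values), fan_in)]


def multi_pass_sum(values, fan_in=4):
    """Sum `values` by repeating reduce_pass until a single element remains.

    Returns the total and the number of passes that were needed.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    current = list(values)
    passes = 0
    while len(current) > 1:
        current = reduce_pass(current, fan_in)
        passes += 1
    total = current[0] if current else 0
    return total, passes


# 100 inputs shrink to 25, then 7, then 2, then 1 element.
total, passes = multi_pass_sum(range(1, 101), fan_in=4)
print(total, passes)
```

On a GPU each pass would read one texture and write a smaller one; here plain lists stand in for textures.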

18 Claims

1. A computer-implemented method for training a convolutional neural network to identify images using graphics data which can be read by a graphics processing unit (“GPU”), and one or more GPU-executable programs, the method comprising:
- receiving the graphics data representing a state of the convolutional neural network and comprising one or more textures representing one or more neural network variables, wherein the one or more textures comprises a texture with two-dimensional addressing, and at least one or more of the textures represents a neural network variable with addressing of more than two dimensions which has been flattened into two dimensional addressing, the convolutional neural network comprising at least one layer comprising a plurality of patches;
  
  executing one or more of the GPU-executable programs on the GPU in order to perform a forward pass in the convolutional neural network, the executing including performing convolution operations on the patches;
  
  executing one or more of the GPU-executable programs on the GPU in order to perform a backward pass in the convolutional neural network, the executing including performing convolution operations on the patches;
  
  executing one or more of the GPU-executable programs on the GPU in order to modify the patches in the convolutional neural network by changing the graphics data based on results of the backward pass; and
  
  repeating executing one or more of the GPU-executable programs to perform forward passes, backward passes, and to modify the graphics data until the convolutional neural network is trained.
- - 2. The method of claim 1, wherein addresses of values represented in the texture with two-dimensional addressing are determined through linear combinations of x and y offset coordinates from the upper-left corner of an output texture.
  - 3. The method of claim 1, wherein:
    - the GPU-executable programs are written in the form of one or more pixel shader programs.
  - 4. The method of claim 1, wherein the graphics data is adjusted using gradient descent.
  - 5. The method of claim 4, wherein:
    - the one or more programs utilize formulas to compute partial derivatives to determine a gradient; and
      
      the formulas are combined and simplified algebraically in order to reduce pixel shader program invocations.
  - 6. The method of claim 1, wherein:
    - the neural network comprises one or more fully-connected layers; and
      
      the one or more GPU-executable programs comprise one or more GPU-executable programs specific to the one or more fully-connected layers which utilize separate formulas for fully-connected layers.
  - 7. The method of claim 1, wherein:
    - the neural network comprises one or more transitional layers; and
      
      the one or more GPU-executable programs comprise one or more GPU-executable programs specific to the one or more transitional layers which utilize separate formulas for transitional layers.
  - 8. The method of claim 1, wherein the graphics data describes a single triangle covering a viewport.
  - 9. The method of claim 1, wherein:
    - the one or more GPU-executable programs comprises one or more summations; and
      
      each of the one or more summations is broken up into multiple passes.
  - 10. The method of claim 1, wherein the convolutional neural network performs handwriting recognition.
  - 11. The method of claim 1, further comprising producing one or more computer-readable media containing data describing a convolutional network trained by the preceding process.
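The flattening recited in claims 1 and 2 can be sketched as follows. The example assumes a hypothetical four-dimensional kernel variable indexed by `(out_plane, in_plane, ky, kx)` and lays the square patches out as tiles in a two-dimensional grid, so that each texture address is a linear combination of the indices. The layout and every name below are assumptions made for illustration; the claims do not fix a particular layout.

```python
def flatten_address(out_plane, in_plane, ky, kx, k):
    """Map a 4-D kernel index to a 2-D texture address (x, y).

    Assumed layout: patches are k-by-k tiles, one column of tiles per
    input plane and one row of tiles per output plane.
    """
    x = in_plane * k + kx
    y = out_plane * k + ky
    return x, y


def unflatten_address(x, y, k):
    """Inverse of flatten_address: recover the 4-D index from (x, y)."""
    in_plane, kx = divmod(x, k)
    out_plane, ky = divmod(y, k)
    return out_plane, in_plane, ky, kx


def flatten_kernels(kernels):
    """Copy a nested 4-D list kernels[o][i][ky][kx] into a 2-D 'texture'."""
    n_out, n_in, k = len(kernels), len(kernels[0]), len(kernels[0][0])
    texture = [[0.0] * (n_in * k) for _ in range(n_out * k)]
    for o in range(n_out):
        for i in range(n_in):
            for ky in range(k):
                for kx in range(k):
                    x, y = flatten_address(o, i, ky, kx, k)
                    texture[y][x] = kernels[o][i][ky][kx]
    return texture


# Two output planes, three input planes, 2x2 patches, every value unique.
kernels = [[[[1000 * o + 100 * i + 10 * ky + kx for kx in range(2)]
             for ky in range(2)] for i in range(3)] for o in range(2)]
texture = flatten_kernels(kernels)
print(len(texture), len(texture[0]))
```

Because the mapping is a pair of linear expressions, a shader-style program can locate any patch element from x and y offsets alone, without extra lookup tables.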

12. One or more computer-readable storage media storing instructions which, when executed on a graphics card, cause the graphics card to perform a method for training a convolutional neural network, the method comprising:
- receiving a plurality of textures, the plurality of textures at least in part representing square convolutional kernels for the neural network, wherein at least some of the textures have two-dimensional addressing, and represent square convolutional kernels with addressing of more than two dimensions which have been flattened into two dimensional addressing;
  
  computing a plurality of forward passes of the neural network on a plurality of input data including convoluting and subsampling the square convolutional kernels;
  
  for each of the plurality of forward passes, computing a backward pass of the neural network using a gradient function; and
  
  for each backward pass, based on the results of the gradient function, changing information contained in the square convolutional kernels from the plurality of textures to affect a training of the neural network.
- - 13. The computer-readable storage media of claim 12, wherein:
    - the neural network is being trained to recognize handwritten characters;
      
      the plurality of textures at least in part represents convolutional kernels; and
      
      the convolutional kernels operate on input data representing handwritten characters.
  - 14. The computer-readable storage media of claim 12, wherein the plurality of textures at least in part represents a fully-connected neural network level and a transitional level.
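The forward pass, backward pass, and kernel update recited in claims 1 and 12 can be illustrated with a minimal CPU-side sketch. It assumes a single square kernel, a squared-error function, and plain gradient descent; subsampling, bias terms, and multiple layers are omitted. The names, learning rate, and data are hypothetical and chosen only for the example.

```python
def convolve_valid(image, kernel):
    """Forward pass: slide a square kernel over the image (no padding).

    As is usual for neural networks, the kernel is not flipped.
    """
    k = len(kernel)
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[r + a][c + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for c in range(out_w)] for r in range(out_h)]


def loss(output, target):
    """Squared-error function: half the sum of squared differences."""
    return 0.5 * sum((o - t) ** 2
                     for row_o, row_t in zip(output, target)
                     for o, t in zip(row_o, row_t))


def kernel_gradient(image, output, target, k):
    """Backward pass: partial derivative of the loss for each kernel entry."""
    grad = [[0.0] * k for _ in range(k)]
    for r, (row_o, row_t) in enumerate(zip(output, target)):
        for c, (o, t) in enumerate(zip(row_o, row_t)):
            err = o - t
            for a in range(k):
                for b in range(k):
                    grad[a][b] += err * image[r + a][c + b]
    return grad


def train(image, target, kernel, rate=0.01, steps=200):
    """Repeat forward pass, backward pass, and update for `steps` rounds."""
    k = len(kernel)
    kernel = [row[:] for row in kernel]  # leave the caller's kernel untouched
    for _ in range(steps):
        output = convolve_valid(image, kernel)
        grad = kernel_gradient(image, output, target, k)
        for a in range(k):
            for b in range(k):
                kernel[a][b] -= rate * grad[a][b]
    return kernel


image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 3.0, 1.0],
         [2.0, 0.0, 1.0, 0.0],
         [1.0, 1.0, 0.0, 2.0]]
true_kernel = [[1.0, 0.0], [0.0, -1.0]]
target = convolve_valid(image, true_kernel)
start = [[0.0, 0.0], [0.0, 0.0]]
trained = train(image, target, start)
initial_loss = loss(convolve_valid(image, start), target)
final_loss = loss(convolve_valid(image, trained), target)
print(initial_loss, final_loss)
```

In the claimed arrangement the kernel values would live in textures and each of these loops would run as a GPU-executable program; the nested Python loops only show the arithmetic.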

15. A graphics processing unit configured to perform a method for training a handwriting-recognition convolutional neural network, wherein the convolutional neural network comprises one or more layers, at least some of the layers each comprising a plurality of square convolutional kernel patches, and wherein the graphics processing unit comprises:
- data storage, configured to store one or more graphics textures, the graphics textures describing the square convolutional kernel patches of the handwriting-recognition neural network, wherein at least some of the graphical textures have two-dimensional addressing and represent square convolutional kernel patches having addressing of more than two dimensions which have been flattened into two dimensional addressing;
  
  a plurality of pixel shader units configured via pixel shader programming:
  
  to perform repeated forward passes and backward passes of the neural network on handwriting input data, the passes including performing convolutional operations on the square convolutional kernel patches;
  
  to store results in the plurality of graphics textures; and
  
  to modify the square convolutional kernel patches of the plurality of textures based on results of the forward and backward passes in order to train the neural network.
- - 16. The graphics processing unit of claim 15, wherein the handwriting-recognition neural network at least in part comprises one convolutional level and one fully-connected level.
  - 17. The graphics processing unit of claim 15, wherein the one or more graphics textures are configured to describe a simplified triangle image, such that all processing performed by the graphics processing unit only requires computation on the part of the pixel shader units.
  - 18. The graphics processing unit of claim 15, wherein the pixel shader units are configured such that summations in the forward passes and backward passes of the neural network are broken up into multiple smaller summations.
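Claims 8 and 17 refer to a single triangle covering the viewport, which leaves the pixel shader units to do the real work. A common way to achieve this in graphics programming, shown here as a sketch, is one oversized triangle whose interior contains the whole viewport in normalized coordinates from -1 to 1. The vertex positions and helper names are assumptions for illustration; the claims do not specify them.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area term: positive when P lies to the left of edge A->B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covers(triangle, px, py):
    """True when point P is inside the triangle or on its boundary."""
    (ax, ay), (bx, by), (cx, cy) = triangle
    e0 = edge(ax, ay, bx, by, px, py)
    e1 = edge(bx, by, cx, cy, px, py)
    e2 = edge(cx, cy, ax, ay, px, py)
    all_non_negative = e0 >= 0 and e1 >= 0 and e2 >= 0
    all_non_positive = e0 <= 0 and e1 <= 0 and e2 <= 0
    return all_non_negative or all_non_positive


# Assumed vertices: one triangle that contains the square [-1, 1] x [-1, 1].
FULL_VIEWPORT_TRIANGLE = [(-1.0, -1.0), (3.0, -1.0), (-1.0, 3.0)]


def pixel_centers(width, height):
    """Yield the centre of every pixel in normalized coordinates."""
    for row in range(height):
        for col in range(width):
            yield (2.0 * (col + 0.5) / width - 1.0,
                   2.0 * (row + 0.5) / height - 1.0)


all_covered = all(covers(FULL_VIEWPORT_TRIANGLE, x, y)
                  for x, y in pixel_centers(8, 8))
print(all_covered)
```

Since every pixel centre falls inside the triangle, drawing it invokes the pixel shader program once per output texel, which is the behaviour the training computations rely on.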

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Puri, Siddhartha
Primary Examiner(s): Ahmed; Samir A
Assistant Examiner(s): Liu; Li
Application Number: US11/217,711
Publication Number: US 20070047802A1
Time in Patent Office: 1,763 Days
Field of Search: 382/157, 382/158, 706/12, 706/15
US Class Current: 382/157
CPC Class Codes:
G06N 3/045   Combinations of networks
G06N 3/06   Physical realisation, i.e. ...
G06N 3/063   using electronic means
G06N 3/084   Backpropagation, e.g. using...
G06V 30/10   Character recognition
G06V 30/18057   Integrating the filters int...
