Batch normalization layers

  • US 10,417,562 B2
  • Filed: 01/28/2016
  • Issued: 09/17/2019
  • Est. Priority Date: 01/28/2015
  • Status: Active Grant
First Claim

1. A neural network system implemented by one or more computers, the neural network system comprising:

  • instructions for implementing a batch normalization layer between a first neural network layer and a second neural network layer in a neural network, wherein the first neural network layer generates first layer outputs having a plurality of components, and wherein the instructions cause the one or more computers to perform operations comprising:

    during training of the neural network on a plurality of batches of training data, each batch comprising a respective plurality of training examples and for each of the batches:

    receiving a respective first layer output for each of the plurality of training examples in the batch;

    computing a plurality of normalization statistics for the batch from the first layer outputs, comprising:

    determining, for each of a plurality of subsets of the plurality of the components of the first layer outputs, a mean of the components of the first layer outputs for each of the plurality of training examples in the batch that are in the respective subset, and

    determining, for each of the plurality of subsets of the plurality of the components of the first layer outputs, a standard deviation of the components of the first layer outputs for each of the plurality of training examples in the batch that are in the respective subset;

    normalizing each of the plurality of the components of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch, comprising:

    for each first layer output and for each of the plurality of subsets, normalizing the components of the first layer output that are in the respective subset using the mean for the respective subset and the standard deviation for the respective subset;

    generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and

    providing the batch normalization layer output as an input to the second neural network layer.
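
The training-time operations recited above can be illustrated with a minimal sketch in plain Python. This is an illustration under stated assumptions, not the patented implementation: the function names `batch_normalize` and `batch_norm_layer`, the small constant `eps` added for numerical stability, and the per-component scale and shift parameters `gamma` and `beta` are all hypothetical and are not recited in this claim. The claim only requires that the layer output be generated from the normalized outputs; a learned scale and shift is one common way to do that.

```python
import math


def batch_normalize(batch, subsets, eps=1e-5):
    """Normalize first layer outputs using per-subset batch statistics.

    batch:   list of first layer outputs, one list of floats per training example.
    subsets: list of lists of component indices (assumed here to partition
             the components; e.g. one subset per feature map).
    eps:     assumption, not in the claim; avoids division by zero.
    """
    normalized = [list(output) for output in batch]
    for subset in subsets:
        # Pool the subset's components across every training example in the batch.
        values = [output[i] for output in batch for i in subset]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(variance + eps)
        # Normalize each component in the subset with that subset's statistics.
        for n, output in enumerate(batch):
            for i in subset:
                normalized[n][i] = (output[i] - mean) / std
    return normalized


def batch_norm_layer(batch, subsets, gamma, beta, eps=1e-5):
    """Generate layer outputs from the normalized outputs.

    gamma and beta (per-component scale and shift) are assumptions;
    the claim does not specify how the layer output is generated.
    """
    normalized = batch_normalize(batch, subsets, eps)
    return [[gamma[i] * x + beta[i] for i, x in enumerate(row)]
            for row in normalized]


if __name__ == "__main__":
    # Two training examples with three components; components 0 and 1
    # share statistics, component 2 forms its own subset.
    batch = [[1.0, 2.0, 10.0], [3.0, 4.0, 30.0]]
    subsets = [[0, 1], [2]]
    outputs = batch_norm_layer(batch, subsets, [1.0] * 3, [0.0] * 3)
    print(outputs)  # these values would be passed to the second layer
```

Grouping components into subsets lets several components share one mean and one standard deviation, for example all spatial positions of a single convolutional feature map. Placing every component in its own subset gives per-component normalization.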
