Exploiting sparseness in training deep neural networks

  • US 8,700,552 B2
  • Filed: 11/28/2011
  • Issued: 04/15/2014
  • Est. Priority Date: 11/28/2011
  • Status: Expired due to Fees
First Claim

1. A computer-implemented process for training a deep neural network (DNN), comprising:

  • using a computer to perform the following process actions;

    (a) initially training a fully interconnected DNN comprising an input layer into which training data is input, an output layer from which an output is generated, and a plurality of hidden layers, wherein said training comprises,

      (i) accessing a set of training data entries,

      (ii) inputting each data entry of said set one by one into the input layer until all the data entries have been input once to produce an interimly trained DNN, such that after the inputting of each data entry, a value of each weight associated with each interconnection of each hidden layer are set via an error back-propagation procedure so that the output from the output layer matches a label assigned to the training data entry,

      (iii) repeating actions (i) and (ii) a number of times to establish an initially trained DNN;

    (b) identifying each interconnection associated with each layer of the initially trained DNN whose interconnection weight value does not exceed a first weight threshold;

    (c) setting the value of each of identified interconnection to zero;

    (d) inputting each data entry of said set one by one into the input layer until all the data entries have been input once to produce a current refined DNN, such that after the inputting of each data entry, the values of the weights associated with the interconnections of each hidden layer are set via an error back-propagation procedure so that the output from the output layer matches the label assigned to the training data entry;

    (e) identifying those interconnections associated with each hidden layer of the last produced refined DNN whose interconnection weight value does not exceed a second weight threshold;

    (f) setting the value of each of the identified interconnections whose interconnection weight value does not exceed the second weight threshold to zero; and

    (g) repeating actions (d) through (f) a number of times to produce said trained DNN.
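
The actions (a) through (g) above can be read as a train, prune, retrain loop. The following is a minimal sketch of that loop, not the patented implementation. It assumes a tiny tanh network with a squared-error loss, a fixed learning rate, and that "does not exceed a threshold" refers to the magnitude of a weight; none of these details are stated in the claim. All function and parameter names (`train_one_pass`, `prune`, `train_sparse`, `first_threshold`, `second_threshold`, and so on) are hypothetical. For simplicity the sketch prunes every weight matrix in both pruning steps, and it does not hold pruned weights at zero during retraining, since claim 1 does not require that.

```python
import numpy as np


def init_weights(sizes, rng):
    """Fully interconnected layers: one dense weight matrix per pair of adjacent layers."""
    return [rng.normal(0.0, 0.5, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]


def forward(weights, x):
    """Return the activations of every layer, input included (tanh units, an assumption)."""
    acts = [x]
    for W in weights:
        acts.append(np.tanh(acts[-1] @ W))
    return acts


def train_one_pass(weights, data, labels, lr=0.1):
    """Input each entry one by one; update all weights by back-propagation after each entry."""
    for x, y in zip(data, labels):
        acts = forward(weights, x)
        # Error signal at the output layer for a squared-error loss.
        delta = (acts[-1] - y) * (1.0 - acts[-1] ** 2)
        for i in reversed(range(len(weights))):
            grad = np.outer(acts[i], delta)
            if i > 0:
                # Propagate the error backwards before this layer's weights change.
                delta_prev = (weights[i] @ delta) * (1.0 - acts[i] ** 2)
            weights[i] -= lr * grad
            if i > 0:
                delta = delta_prev


def prune(weights, threshold):
    """Set to zero every weight whose magnitude does not exceed the threshold."""
    for W in weights:
        W[np.abs(W) <= threshold] = 0.0


def train_sparse(data, labels, sizes, initial_passes, first_threshold,
                 second_threshold, refine_rounds, seed=0):
    rng = np.random.default_rng(seed)
    weights = init_weights(sizes, rng)
    for _ in range(initial_passes):          # action (a): initial dense training
        train_one_pass(weights, data, labels)
    prune(weights, first_threshold)          # actions (b) and (c)
    for _ in range(refine_rounds):           # action (g): repeat (d) through (f)
        train_one_pass(weights, data, labels)    # action (d)
        prune(weights, second_threshold)         # actions (e) and (f)
    return weights
```

Calling `train_sparse` with a small labelled data set, a layer-size list such as `[2, 4, 4, 1]`, and two thresholds returns a list of weight matrices in which every remaining nonzero weight has a magnitude above `second_threshold`, because the last action performed is always a pruning step.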
