Method for operating an optimal weight pruning apparatus for designing artificial neural networks
First Claim
1. A method for operating a design system for designing a minimal connection neural network from a given trained neural network design by iteratively pruning, by removing synaptic weights, and by adjusting any remaining synaptic weights so that the resulting neural network design performance satisfies a prescribed error budget, the design system including
a processor control unit for overall control of the design system, arithmetic processing, and for providing external input/output data ports,
a data memory for storage of neural network input/output data, and neural network design data,
a synaptic weight pruning unit for producing a reduced connection neural network design from a given trained neural network design,
a neural network modelling unit for modelling a neural network from a set of neural network design data that includes a topological network description, a set of synaptic weights, and activation function descriptions,
the method for operating the design system comprising:
(a) storing the given trained neural network design data that includes a topological network description, activation function descriptions, and synaptic weight values;
(b) storing a set of exemplar input pruning vectors and corresponding response vectors for use in the neural network pruning module;
(c) initializing the neural network modelling unit using the set of trained neural network design data;
(d) operating the neural network modelling unit using the set of exemplar input pruning vectors as input data and storing each response vector in data memory;
(e) initializing the neural network pruning unit with initializing data that includes the stored trained neural network design data together with the set of exemplar pruning response vectors and the corresponding response vectors from step (d);
(f) operating the synaptic weight pruning unit for producing an iterated set of pruned neural network design data, the operating step including
  (i) computing a Hessian matrix of the trained neural network using the initializing data from step (d),
  (ii) computing an inverse Hessian matrix of the Hessian matrix of step (f)(i),
  (iii) computing a saliency value of each synaptic weight using the inverse Hessian matrix and the stored trained synaptic weights,
  (iv) selecting a synaptic weight with the smallest salient value as a selected pruning candidate weight,
  (v) computing a total error value that would result from pruning the selected pruning candidate weight,
  (vi) comparing the total error value with a specified error budget value and proceeding to step (g) if the total error value is less, otherwise terminating the method because the given trained neural network design is the minimal connection neural network design;
(g) operating the synaptic weight pruning unit for pruning and post pruning synaptic weight correction by
  (i) pruning the candidate weight by removing the candidate weight from the given trained neural network design data,
  (ii) modifying the topological network description by eliminating the pruning candidate weight branch,
  (iii) computing a weight correction vector, with one vector element for each remaining weight of the given trained neural network design data, that minimizes the total error value caused by pruning the pruning candidate weight, and
  (iv) adjusting the synaptic weights by applying the weight correction vector elements to the corresponding synaptic weights; and
(h) performing another iteration by returning to step (c) and using the modified topological description and the adjusted synaptic weights of step (g) as the given trained neural network design data topological description and synaptic weights.
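Steps (f)(iii) and (g)(iii) name a saliency value and a weight correction vector without writing them out. As a reading aid only, the block below gives the second-order expressions commonly used for Hessian-based pruning (the "Optimal Brain Surgeon" formulation). Treating these as the quantities the claim has in mind is an assumption, as are all symbols: w_q is the candidate weight, H the Hessian of the error with respect to the weights, and e_q the unit vector selecting weight q.

```latex
% Quadratic model of the error increase around a trained minimum (assumed):
\delta E \approx \tfrac{1}{2}\,\delta\mathbf{w}^{\mathsf T}\mathbf{H}\,\delta\mathbf{w}
\quad\text{subject to}\quad \mathbf{e}_q^{\mathsf T}\delta\mathbf{w} + w_q = 0 .

% Minimizing with a Lagrange multiplier gives the correction vector and the saliency:
\delta\mathbf{w} = -\frac{w_q}{[\mathbf{H}^{-1}]_{qq}}\,\mathbf{H}^{-1}\mathbf{e}_q ,
\qquad
L_q = \frac{w_q^{2}}{2\,[\mathbf{H}^{-1}]_{qq}} .
```

Under this reading, the q-th element of the correction equals -w_q, so applying it removes the candidate weight and adjusts every remaining weight in the same step.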
Abstract
A method and apparatus for designing a multilayer feed forward neural network that produces a design having a minimum number of connecting weights is based on a novel iterative procedure for inverting the full Hessian matrix of the neural network. The inversion of the full Hessian matrix results in a practical strategy for pruning weights of a trained neural network. The error caused by pruning is minimized by a correction that is applied to the remaining (un-pruned) weights, thus reducing the need for retraining. However, retraining may be applied to the network, possibly leading to further simplification of the network design.
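The abstract refers to an iterative procedure for inverting the full Hessian but does not give it here. One way such an inversion can be done is sketched below, under the assumption (not stated in this section) that the Hessian is approximated as a small multiple of the identity plus an average of outer products of per-exemplar gradient vectors. Each exemplar then contributes a rank-one update to the inverse (the Sherman-Morrison identity). The function name, the `alpha` regularizer, and the gradient inputs are all hypothetical.

```python
def recursive_inverse_hessian(gradients, alpha=1e-3):
    """Return H^-1 for H = alpha*I + (1/P) * sum_k g_k g_k^T.

    gradients: list of P gradient vectors (lists of floats), one per exemplar.
    alpha: small positive constant that makes the starting matrix invertible
           (an assumption of this sketch).
    """
    p = len(gradients)
    n = len(gradients[0])
    # Start from the inverse of alpha * I.
    h_inv = [[(1.0 / alpha if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    for g in gradients:
        # u = H^-1 g; H^-1 stays symmetric, so u^T is also g^T H^-1.
        u = [sum(h_inv[i][j] * g[j] for j in range(n)) for i in range(n)]
        denom = p + sum(g[i] * u[i] for i in range(n))
        # Rank-one (Sherman-Morrison) update for adding (1/P) * g g^T to H.
        for i in range(n):
            for j in range(n):
                h_inv[i][j] -= u[i] * u[j] / denom
    return h_inv


# Hypothetical gradients for a three-exemplar, two-weight problem.
example_gradients = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
example_h_inv = recursive_inverse_hessian(example_gradients)
```

The appeal of this form is that no general matrix inversion is needed: each exemplar costs one matrix-vector product and one outer-product update.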
44 Citations
12 Claims
1. A method for operating a design system for designing a minimal connection neural network from a given trained neural network design by iteratively pruning, by removing synaptic weights, and by adjusting any remaining synaptic weights so that the resulting neural network design performance satisfies a prescribed error budget, the design system including
a processor control unit for overall control of the design system, arithmetic processing, and for providing external input/output data ports,
a data memory for storage of neural network input/output data, and neural network design data,
a synaptic weight pruning unit for producing a reduced connection neural network design from a given trained neural network design,
a neural network modelling unit for modelling a neural network from a set of neural network design data that includes a topological network description, a set of synaptic weights, and activation function descriptions,
the method for operating the design system comprising:
(a) storing the given trained neural network design data that includes a topological network description, activation function descriptions, and synaptic weight values;
(b) storing a set of exemplar input pruning vectors and corresponding response vectors for use in the neural network pruning module;
(c) initializing the neural network modelling unit using the set of trained neural network design data;
(d) operating the neural network modelling unit using the set of exemplar input pruning vectors as input data and storing each response vector in data memory;
(e) initializing the neural network pruning unit with initializing data that includes the stored trained neural network design data together with the set of exemplar pruning response vectors and the corresponding response vectors from step (d);
(f) operating the synaptic weight pruning unit for producing an iterated set of pruned neural network design data, the operating step including
  (i) computing a Hessian matrix of the trained neural network using the initializing data from step (d),
  (ii) computing an inverse Hessian matrix of the Hessian matrix of step (f)(i),
  (iii) computing a saliency value of each synaptic weight using the inverse Hessian matrix and the stored trained synaptic weights,
  (iv) selecting a synaptic weight with the smallest salient value as a selected pruning candidate weight,
  (v) computing a total error value that would result from pruning the selected pruning candidate weight,
  (vi) comparing the total error value with a specified error budget value and proceeding to step (g) if the total error value is less, otherwise terminating the method because the given trained neural network design is the minimal connection neural network design;
(g) operating the synaptic weight pruning unit for pruning and post pruning synaptic weight correction by
  (i) pruning the candidate weight by removing the candidate weight from the given trained neural network design data,
  (ii) modifying the topological network description by eliminating the pruning candidate weight branch,
  (iii) computing a weight correction vector, with one vector element for each remaining weight of the given trained neural network design data, that minimizes the total error value caused by pruning the pruning candidate weight, and
  (iv) adjusting the synaptic weights by applying the weight correction vector elements to the corresponding synaptic weights; and
(h) performing another iteration by returning to step (c) and using the modified topological description and the adjusted synaptic weights of step (g) as the given trained neural network design data topological description and synaptic weights.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
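A single pass through steps (f)(iii) to (g)(iv) of claim 1 might look like the sketch below. It rests on several assumptions that the claim does not fix: the saliency is taken as w_q^2 / (2 [H^-1]_qq), the "total error value" is taken as the error accumulated so far plus that saliency, and the inverse Hessian is supplied as a plain nested list covering all currently connected weights. The function name and its parameters are hypothetical.

```python
def prune_one_weight(weights, h_inv, error_budget, error_so_far=0.0):
    """Attempt one pruning pass.

    Returns (weights, pruned_index, total_error). pruned_index is None when
    pruning the best candidate would meet or exceed the error budget.
    """
    n = len(weights)
    # Step (f)(iii): saliency of each weight (assumed second-order form).
    saliency = [weights[q] ** 2 / (2.0 * h_inv[q][q]) for q in range(n)]
    # Step (f)(iv): the candidate is the weight with the smallest saliency.
    q = min(range(n), key=lambda i: saliency[i])
    # Step (f)(v): predicted total error if the candidate is removed.
    total_error = error_so_far + saliency[q]
    # Step (f)(vi): stop when the budget would not be satisfied.
    if total_error >= error_budget:
        return list(weights), None, error_so_far
    # Step (g)(iii): correction vector -(w_q / [H^-1]_qq) * H^-1 e_q.
    scale = weights[q] / h_inv[q][q]
    new_weights = [weights[i] - scale * h_inv[i][q] for i in range(n)]
    # Step (g)(i): the pruned weight is exactly zero after the correction.
    new_weights[q] = 0.0
    return new_weights, q, total_error


# Hypothetical two-weight example; h_inv is the inverse of [[2, 1], [1, 2]].
h_inv_example = [[2.0 / 3.0, -1.0 / 3.0], [-1.0 / 3.0, 2.0 / 3.0]]
result = prune_one_weight([1.0, 0.2], h_inv_example, error_budget=0.05)
```

Step (h) would then rebuild the Hessian for the reduced topology before the next pass; this sketch deliberately stops after one candidate.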
10. A method for operating a design system for designing a minimal connection neural network from a given untrained neural network design by training the untrained network using a set of exemplar input training vectors and a set of exemplar response vectors, then operating on the resulting trained neural network design by iteratively pruning, by removing synaptic weights, and by adjusting any remaining synaptic weights so that the resulting neural network design performance satisfies a prescribed error budget, the design system including
a control unit for overall control of the design system and for providing external input/output data ports,
a data memory for storage of neural network input/output data, and neural network design data,
a neural network training unit for training of an untrained neural network and for producing a trained neural network design by using a set of exemplar training input vectors and corresponding exemplar response vectors,
a synaptic weight pruning unit for producing a reduced connection neural cell design from a given trained neural network design,
a neural network modelling unit for modelling a neural network from a set of neural network design data that includes a topological network description, a set of synaptic weights, and activation function descriptions,
the method for operating the design system comprising:
(a) storing the untrained neural network design that includes a topological network description, activation function descriptions, and synaptic weight values,
(b) storing a set of exemplar input and output training vectors;
(c) initializing the neural network modelling unit with a set of untrained neural network design data that includes a topological network description, a set of synaptic weights, and activation function descriptions,
(d) operating the neural network training unit for controlling the neural network modelling module for generating a response to a set of exemplar input training vectors, comparing each response vector to a corresponding exemplar response vector, and adjusting the untrained neural network set of synaptic weights in accordance with a known training procedure, for generating a description of a trained neural network design data,
(e) storing the trained neural network design data that includes a topological network description, activation function descriptions, and synaptic weight values;
(f) storing of a set of exemplar input pruning vectors and corresponding response vectors for use in the neural network pruning unit;
(g) initializing the neural network modelling unit using the trained neural network design data;
(h) operating the neural network modelling unit using the set of exemplar input pruning vectors as input data and storing each response vector in data memory;
(j) initializing the neural network pruning unit using the stored trained neural network design data together with the set of exemplar pruning response vectors and the corresponding response vectors from step (h);
(k) operating the neural network pruning unit for producing an iterated set of pruned neural network design data, the operating step including
  (i) computing a Hessian matrix of the trained neural network using the data from step (h),
  (ii) computing an inverse Hessian matrix of the Hessian matrix of step (h)(i),
  (iii) computing a saliency value of each synaptic weight using the inverse Hessian matrix and the stored trained synaptic weights,
  (iv) selecting a synaptic weight with the smallest salient value as a selected pruning candidate weight,
  (v) computing a total error value that would result from pruning the selected pruning candidate weight,
  (vi) comparing the total error value with a specified error budget value and proceeding to step (l) if the total error value is less, otherwise terminating the method;
(l) operating the synaptic weight pruning module for pruning and post pruning synaptic weight correction by
  (i) pruning the candidate weight by removing the candidate weight from the trained neural network design data,
  (ii) modifying the topological network description by eliminating the pruning candidate weight branch,
  (iii) computing a weight correction vector, with one vector element for each remaining weight of the trained neural network design data, that minimizes the total error value caused by pruning the pruning candidate weight, and
  (iv) adjusting the synaptic weights by applying the weight correction vector elements to the corresponding synaptic weights; and
(m) performing another iteration by returning to step (g) and using the modified topological description and the adjusted synaptic weights of step (l) as the trained neural network design data topological description and synaptic weights.
Dependent claims: 11, 12
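Claim 10 differs from claim 1 mainly in its training stage, step (d), which leaves the "known training procedure" unspecified. Purely as a stand-in, the sketch below trains a single linear unit with per-exemplar gradient descent (the delta rule): generate a response, compare it with the exemplar response, and adjust the weights. The choice of procedure, the single-unit network, and every name and default value here are assumptions made for illustration.

```python
def train_linear_unit(inputs, targets, weights, learning_rate=0.1, epochs=200):
    """Delta-rule training of one linear unit (illustrative stand-in only).

    inputs: list of exemplar input vectors.
    targets: list of exemplar scalar responses, one per input vector.
    weights: initial (untrained) synaptic weights.
    """
    weights = list(weights)  # leave the caller's untrained design untouched
    for _ in range(epochs):
        for x, target in zip(inputs, targets):
            # Generate the unit's response to this exemplar input.
            response = sum(w * xi for w, xi in zip(weights, x))
            # Compare with the exemplar response.
            error = target - response
            # Adjust each weight along the negative error gradient.
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
    return weights


# Hypothetical exemplars generated by the weights [2.0, -1.0].
exemplar_inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
exemplar_targets = [2.0, -1.0, 1.0]
trained = train_linear_unit(exemplar_inputs, exemplar_targets, [0.0, 0.0])
```

The trained weights returned here would play the role of the design data stored in step (e) and handed to the pruning stage.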
Specification