Method and apparatus for training a neural network using evolutionary programming
First Claim
1. A method of training a neural network to evaluate data, comprising the steps of:
(a) configuring a neural network having a plurality of interconnected nodes including an input layer and an output layer, said neural network being capable of receiving training patterns at said input layer and operative in a plurality of different weighted configurations, each defined by a different set of weight values;
(b) selecting a set of weight values and adjusting the network to operate in a weighted configuration defined by said set of weight values, and inputting each of a plurality of training patterns to said input layer to generate respective evaluations of the training patterns as outputs of the network at said output layer;
(c) comparing each evaluation of a respective training pattern to a desired output of the network to obtain a corresponding error;
(d) determining from all of the errors obtained in said step (c) an overall error value corresponding to the set of weight values;
(e) repeating said steps (b), (c) and (d) a plurality of times, each time with a different weighted configuration defined by a respective different set of weight values, to obtain a plurality of overall error values;
(f) for each of said sets of weight values, determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) generating copies of the sets of weight values selected in said step (g);
(i) for each of the copies, perturbing the weight values thereof generated in said step (h) by adding random numbers to the weight values to create a new set of weight values, the random numbers being obtained randomly from a continuous random distribution of numbers having a mean of zero and a variance which is a function of the overall error value determined for the set of weight values from which the copy was generated;
(j) incrementing a counter each time said steps (b) through (i) are performed, wherein said steps (b) through (e) are performed with at least the weighted configurations defined by the new sets of weight values created in the immediately preceding said step (i), and said steps (f) through (i) are performed with the sets of weight values selected in the immediately preceding said step (g) and with the new sets of weight values created in the immediately preceding said step (i) until the counter reaches a maximum count value;
(k) selecting, once the counter reaches the maximum count value, the set of weight values having a final best score as determined in step (g); and
(l) configuring the neural network to have the plurality of nodes interconnected in accordance with the set of weight values having the final best score.
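Steps (a) through (l) describe one complete evolutionary programming training loop. The following is a minimal sketch of that loop in Python, assuming a one-hidden-layer tanh network, a sum-of-squared-errors overall error, and a free scale factor relating error to mutation variance; all function names and parameter values here are illustrative choices, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(weights, x):
    """Evaluate a one-hidden-layer tanh network; `weights` is (W1, W2)."""
    W1, W2 = weights
    h = np.tanh(x @ W1)           # hidden-layer activations
    return np.tanh(h @ W2)        # outputs at the output layer

def overall_error(weights, X, Y):
    """Steps (c)-(d): sum of squared errors over all training patterns."""
    return float(np.sum((forward(weights, X) - Y) ** 2))

def score(err, all_errs, n_opponents=10):
    """Step (f): draw error comparison values from the population's own
    error distribution and count how many comparisons this set wins."""
    opponents = rng.choice(all_errs, size=n_opponents)
    return int(np.sum(err <= opponents))

def mutate(weights, err, scale=0.05):
    """Step (i): add zero-mean Gaussian noise whose variance is a
    function (here, a fixed multiple) of the parent's overall error."""
    std = np.sqrt(scale * err)
    return tuple(W + rng.normal(0.0, std, W.shape) for W in weights)

def train(X, Y, pop_size=20, max_count=200):
    """Steps (b)-(l): evolve sets of weight values until the counter
    reaches its maximum count, then return the best-scoring set."""
    n_in, n_hid, n_out = X.shape[1], 5, Y.shape[1]
    pop = [(rng.normal(0, 0.5, (n_in, n_hid)),
            rng.normal(0, 0.5, (n_hid, n_out))) for _ in range(pop_size)]
    for _ in range(max_count):                      # step (j): the counter
        errs = np.array([overall_error(w, X, Y) for w in pop])
        scores = [score(e, errs) for e in errs]
        order = np.argsort([-s for s in scores])    # step (g): best scores
        parents = [pop[i] for i in order[:pop_size // 2]]
        parent_errs = [errs[i] for i in order[:pop_size // 2]]
        # steps (h)-(i): copy the surviving sets and perturb the copies
        progeny = [mutate(w, e) for w, e in zip(parents, parent_errs)]
        pop = parents + progeny
    errs = [overall_error(w, X, Y) for w in pop]
    return pop[int(np.argmin(errs))]                # steps (k)-(l)
```

Run on the XOR problem (targets scaled to the tanh range), the returned set of weight values defines the final configuration of step (l).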
Abstract
A method and apparatus for training neural networks using evolutionary programming. A network is adjusted to operate in a weighted configuration defined by a set of weight values and a plurality of training patterns are input to the network to generate evaluations of the training patterns as network outputs. Each evaluation is compared to a desired output to obtain a corresponding error. From all of the errors, an overall error value corresponding to the set of weight values is determined. The above steps are repeated with different weighted configurations to obtain a plurality of overall error values. Then, for each set of weight values, a score is determined by selecting error comparison values from a predetermined variable probability distribution and comparing them to the corresponding overall error value. A predetermined number of the sets of weight values determined to have the best scores are selected and copies are made. The copies are mutated by adding random numbers to their weights and the above steps are repeated with the best sets and the mutated copies defining the weighted configurations. This procedure is repeated until the overall error values diminish to below an acceptable threshold. The random numbers added to the weight values of copies are obtained from a continuous random distribution of numbers having zero mean and variance determined such that it would be expected to converge to zero as the different sets of weight values in successive iterations converge toward sets of weight values yielding the desired neural network performance.
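The abstract's final sentence states the key convergence property: because the mutation variance is a function of the parent's overall error, perturbations shrink toward zero as the population approaches the desired performance. A hedged sketch of just that relationship (the proportionality constant `scale` is an assumed free parameter; the patent requires only that the variance be a function of the overall error):

```python
import math
import random

def mutation_std(overall_error, scale=0.05):
    """Standard deviation of the zero-mean mutation noise; the variance
    (scale * overall_error) converges to zero as the error does."""
    return math.sqrt(scale * overall_error)

def mutate_copy(weights, overall_error, rnd=random.Random(0)):
    """Perturb a copied set of weight values with zero-mean Gaussian
    noise whose spread is tied to the parent's overall error."""
    std = mutation_std(overall_error)
    return [w + rnd.gauss(0.0, std) for w in weights]
```

A well-trained parent (small error) thus produces near-identical copies, while a poor parent is perturbed aggressively.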
37 Claims
1. A method of training a neural network to evaluate data, comprising the steps of:
(a) configuring a neural network having a plurality of interconnected nodes including an input layer and an output layer, said neural network being capable of receiving training patterns at said input layer and operative in a plurality of different weighted configurations, each defined by a different set of weight values;
(b) selecting a set of weight values and adjusting the network to operate in a weighted configuration defined by said set of weight values, and inputting each of a plurality of training patterns to said input layer to generate respective evaluations of the training patterns as outputs of the network at said output layer;
(c) comparing each evaluation of a respective training pattern to a desired output of the network to obtain a corresponding error;
(d) determining from all of the errors obtained in said step (c) an overall error value corresponding to the set of weight values;
(e) repeating said steps (b), (c) and (d) a plurality of times, each time with a different weighted configuration defined by a respective different set of weight values, to obtain a plurality of overall error values;
(f) for each of said sets of weight values, determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) generating copies of the sets of weight values selected in said step (g);
(i) for each of the copies, perturbing the weight values thereof generated in said step (h) by adding random numbers to the weight values to create a new set of weight values, the random numbers being obtained randomly from a continuous random distribution of numbers having a mean of zero and a variance which is a function of the overall error value determined for the set of weight values from which the copy was generated;
(j) incrementing a counter each time said steps (b) through (i) are performed, wherein said steps (b) through (e) are performed with at least the weighted configurations defined by the new sets of weight values created in the immediately preceding said step (i), and said steps (f) through (i) are performed with the sets of weight values selected in the immediately preceding said step (g) and with the new sets of weight values created in the immediately preceding said step (i) until the counter reaches a maximum count value;
(k) selecting, once the counter reaches the maximum count value, the set of weight values having a final best score as determined in step (g); and
(l) configuring the neural network to have the plurality of nodes interconnected in accordance with the set of weight values having the final best score.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
19. A method of training a neural network, comprising the steps of:
(a) configuring a neural network having a plurality of interconnected nodes including an input layer and an output layer, said neural network being capable of receiving data and operative in a plurality of different weighted configurations, each defined by a different set of weight values;
(b) adjusting the network to operate in a weighted configuration defined by a set of weight values, and inputting each of a plurality of training patterns to the network to generate evaluations of the respective training patterns;
(c) comparing each evaluation of a respective training pattern to a desired output to obtain a corresponding error;
(d) determining from all of the errors obtained in said step (c) an overall error value corresponding to the set of weight values;
(e) repeating said steps (b), (c) and (d) a plurality of times, each time with a different weighted configuration defined by a respective different set of weight values, to obtain a plurality of overall error values;
(f) for each of said sets of weight values, determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) generating copies of the sets of weight values selected in said step (g);
(i) for each of the copies, (1) generating random numbers from respective continuous random number distributions having a mean of zero and having finite variances and (2) mutating the weight values of the copies by adding the random numbers to the weight values, thereby creating new sets of weight values forming progeny of the sets of weight values selected in said step (g);
(j) incrementing a counter each time said steps (b) through (i) are performed until the counter reaches a maximum count value, wherein said steps (b) through (e) are performed with at least the weighted configurations defined by the new sets of weight values created in the immediately preceding said step (i), and said steps (f) through (i) are performed with the sets of weight values selected in the immediately preceding said step (g) and with the new sets of weight values created in the immediately preceding said step (i), each said repetition of said step (i1) including the step of selecting, by a stochastic process which is independent of the weight values obtained in any preceding performance of said step (i), the variances of the continuous random number distributions from which the random numbers added to weight values of the sets of weight values to create said progeny in said step (i2) are selected;
(k) selecting, once the counter reaches the maximum count value, the set of weight values having a final best score as determined in step (g); and
(l) configuring the neural network to have the plurality of nodes interconnected in accordance with the set of weight values having the final best score.
- View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35)
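Claim 19 differs from claim 1 in its mutation step (i): each weight's mutation variance is itself drawn by a stochastic process that is independent of any previously obtained weight values, rather than being computed from the parent's overall error. A sketch of that step alone, where the exponential distribution and `base_var` are illustrative assumptions; the claim requires only an independent stochastic process yielding finite variances:

```python
import random

def mutate_progeny(weights, base_var=0.25, rnd=random.Random(1)):
    """Claim 19, step (i): for each weight value of a copy,
    (i1) draw a variance by a stochastic process independent of the
         weight values (here, an exponential draw with mean base_var), then
    (i2) add a zero-mean Gaussian random number with that variance."""
    progeny = []
    for w in weights:
        var = rnd.expovariate(1.0 / base_var)           # (i1): finite, weight-independent variance
        progeny.append(w + rnd.gauss(0.0, var ** 0.5))  # (i2): zero-mean perturbation
    return progeny
```

Because the variance draw never consults earlier weight values, the mutation schedule cannot collapse prematurely when the population stalls at a high-error configuration.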
36. A neural network training apparatus, comprising:
(a) a plurality of neural networks each capable of receiving data and operative in a plurality of different weighted configurations, each configuration defined by a different set of weight values;
(b) means for adjusting each network to operate in different weighted configurations defined by a corresponding different set of weight values;
(c) means, responsive to an application of a plurality of training patterns to each of the networks, for generating respective evaluations of the training patterns from each of the networks as outputs of the networks;
(d) means for comparing the evaluations of the training patterns to corresponding desired outputs of the networks to obtain corresponding errors;
(e) means for determining from all of the errors obtained from said comparing means overall error values corresponding to the sets of weight values;
(f) for each of said sets of weight values, means for determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) means for selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) means for generating copies of the sets of weight values selected by said selecting means;
(i) means for generating, for each weight value of the copies generated by said means for generating copies, a corresponding random number from a continuous random distribution of numbers having a mean of zero and a variance which is a function of the overall error value determined by said overall error generating means for the set of weight values from which the copy which includes said each weight value was generated; and
(j) means for mutating the weight values of the copies by adding the corresponding random numbers thereto to create new sets of weight values, said means for adjusting including means for replacing some of the weighted configurations of said network with new weighted configurations based on said new sets of weight values.
37. An apparatus for training a neural network by adjusting weight values through repetitive application of training patterns, comprising:
(a) a plurality of neural networks each capable of receiving data and operative in a plurality of different weighted configurations, each configuration defined by a different set of weight values;
(b) means for adjusting each network to operate in different weighted configurations defined by a corresponding different set of weight values;
(c) first generating means, responsive to an application of a plurality of training patterns to each of the networks, for generating respective evaluations of the training patterns from each of the networks as outputs of the networks;
(d) means for comparing the evaluations of the training patterns to corresponding desired outputs of the networks to obtain corresponding errors;
(e) means for determining from all of the errors obtained from said comparing means overall error values corresponding to the sets of weight values;
(f) for each of said sets of weight values, means for determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) means for selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) second generating means for generating copies of the sets of weight values selected by said selecting means;
(i) third generating means for generating, by a stochastic process which is independent of weight values defining weighted configurations of the neural network, a corresponding variance value for each weight value of each copy generated by said means for generating copies;
(j) fourth generating means for generating, for each weight value of each copy generated by said means for generating copies, a corresponding random number from a continuous random distribution of numbers having a mean of zero and a variance generated by said third generating means; and
(k) means for mutating the weight values of the copies by adding the corresponding random numbers thereto to create new sets of weight values, said means for adjusting including means for replacing the weighted configurations of some of said networks with new weighted configurations based on said new sets of weight values.
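In both apparatus claims, the adjusting means ends the cycle by replacing the weighted configurations of some of the networks with configurations based on the new sets of weight values. A sketch of one such replacement policy, where keeping the lower-error half and overwriting the rest with progeny is an assumed concrete choice; the claims say only that "some" configurations are replaced:

```python
def replace_some(configurations, errors, progeny):
    """Retain the configurations with the lowest overall error values
    and replace the remainder with the new (mutated) sets of weight
    values, preserving the population size."""
    keep = len(configurations) - len(progeny)
    order = sorted(range(len(configurations)), key=lambda i: errors[i])
    survivors = [configurations[i] for i in order[:keep]]
    return survivors + list(progeny)
```

The survivors re-enter the next cycle unchanged, so the best configuration found so far is never lost to mutation.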
Specification