Method and apparatus for training a neural network using evolutionary programming
First Claim
1. A method of training a neural network to evaluate data, comprising the steps of:
(a) configuring a neural network having a plurality of interconnected nodes including an input layer and an output layer, said neural network being capable of receiving training patterns at said input layer and operative in a plurality of different weighted configurations, each defined by a different set of weight values;
(b) selecting a set of weight values and adjusting the network to operate in a weighted configuration defined by said set of weight values, and inputting each of a plurality of training patterns to said input layer to generate respective evaluations of the training patterns as outputs of the network at said output layer;
(c) comparing each evaluation of a respective training pattern to a desired output of the network to obtain a corresponding error;
(d) determining from all of the errors obtained in said step (c) an overall error value corresponding to the set of weight values;
(e) repeating said steps (b), (c) and (d) a plurality of times, each time with a different weighted configuration defined by a respective different set of weight values, to obtain a plurality of overall error values;
(f) for each of said sets of weight values, determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) generating copies of the sets of weight values selected in said step (g);
(i) for each of the copies, perturbing the weight values thereof generated in said step (h) by adding random numbers to the weight values to create a new set of weight values, the random numbers being obtained randomly from a continuous random distribution of numbers having a mean of zero and a variance which is a function of the overall error value determined for the set of weight values from which the copy was generated;
(j) incrementing a counter each time said steps (b) through (i) are performed, wherein said steps (b) through (e) are performed with at least the weighted configurations defined by the new sets of weight values created in the immediately preceding said step (i), and said steps (f) through (i) are performed with the sets of weight values selected in the immediately preceding said step (g) and with the new sets of weight values created in the immediately preceding said step (i) until the counter reaches a maximum count value;
(k) selecting, once the counter reaches the maximum count value, the set of weight values having a final best score as determined in step (g); and
(l) configuring the neural network to have the plurality of nodes interconnected in accordance with the set of weight values having the final best score.
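Steps (a) through (l) describe one complete evolutionary programming training loop. The following is a minimal sketch of that loop in Python, assuming a one-hidden-layer tanh network, a sum-of-squared-errors overall error, and a free scale factor relating error to mutation variance; all function names and parameter values here are illustrative choices, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(weights, x):
    """Evaluate a one-hidden-layer tanh network; `weights` is (W1, W2)."""
    W1, W2 = weights
    h = np.tanh(x @ W1)           # hidden-layer activations
    return np.tanh(h @ W2)        # outputs at the output layer

def overall_error(weights, X, Y):
    """Steps (c)-(d): sum of squared errors over all training patterns."""
    return float(np.sum((forward(weights, X) - Y) ** 2))

def score(err, all_errs, n_opponents=10):
    """Step (f): draw error comparison values from the population's own
    error distribution and count how many comparisons this set wins."""
    opponents = rng.choice(all_errs, size=n_opponents)
    return int(np.sum(err <= opponents))

def mutate(weights, err, scale=0.05):
    """Step (i): add zero-mean Gaussian noise whose variance is a
    function (here, a fixed multiple) of the parent's overall error."""
    std = np.sqrt(scale * err)
    return tuple(W + rng.normal(0.0, std, W.shape) for W in weights)

def train(X, Y, pop_size=20, max_count=200):
    """Steps (b)-(l): evolve sets of weight values until the counter
    reaches its maximum count, then return the best-scoring set."""
    n_in, n_hid, n_out = X.shape[1], 5, Y.shape[1]
    pop = [(rng.normal(0, 0.5, (n_in, n_hid)),
            rng.normal(0, 0.5, (n_hid, n_out))) for _ in range(pop_size)]
    for _ in range(max_count):                      # step (j): the counter
        errs = np.array([overall_error(w, X, Y) for w in pop])
        scores = [score(e, errs) for e in errs]
        order = np.argsort([-s for s in scores])    # step (g): best scores
        parents = [pop[i] for i in order[:pop_size // 2]]
        parent_errs = [errs[i] for i in order[:pop_size // 2]]
        # steps (h)-(i): copy the surviving sets and perturb the copies
        progeny = [mutate(w, e) for w, e in zip(parents, parent_errs)]
        pop = parents + progeny
    errs = [overall_error(w, X, Y) for w in pop]
    return pop[int(np.argmin(errs))]                # steps (k)-(l)
```

Run on the XOR problem (targets scaled to the tanh range), the returned set of weight values defines the final configuration of step (l).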
Abstract
A method and apparatus for training neural networks using evolutionary programming. A network is adjusted to operate in a weighted configuration defined by a set of weight values and a plurality of training patterns are input to the network to generate evaluations of the training patterns as network outputs. Each evaluation is compared to a desired output to obtain a corresponding error. From all of the errors, an overall error value corresponding to the set of weight values is determined. The above steps are repeated with different weighted configurations to obtain a plurality of overall error values. Then, for each set of weight values, a score is determined by selecting error comparison values from a predetermined variable probability distribution and comparing them to the corresponding overall error value. A predetermined number of the sets of weight values determined to have the best scores are selected and copies are made. The copies are mutated by adding random numbers to their weights and the above steps are repeated with the best sets and the mutated copies defining the weighted configurations. This procedure is repeated until the overall error values diminish to below an acceptable threshold. The random numbers added to the weight values of copies are obtained from a continuous random distribution of numbers having zero mean and variance determined such that it would be expected to converge to zero as the different sets of weight values in successive iterations converge toward sets of weight values yielding the desired neural network performance.
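The abstract's final sentence states the key convergence property: because the mutation variance is a function of the parent's overall error, perturbations shrink toward zero as the population approaches the desired performance. A hedged sketch of just that relationship (the proportionality constant `scale` is an assumed free parameter; the patent requires only that the variance be a function of the overall error):

```python
import math
import random

def mutation_std(overall_error, scale=0.05):
    """Standard deviation of the zero-mean mutation noise; the variance
    (scale * overall_error) converges to zero as the error does."""
    return math.sqrt(scale * overall_error)

def mutate_copy(weights, overall_error, rnd=random.Random(0)):
    """Perturb a copied set of weight values with zero-mean Gaussian
    noise whose spread is tied to the parent's overall error."""
    std = mutation_std(overall_error)
    return [w + rnd.gauss(0.0, std) for w in weights]
```

A well-trained parent (small error) thus produces near-identical copies, while a poor parent is perturbed aggressively.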
37 Claims
1. A method of training a neural network to evaluate data, comprising the steps of:
(a) configuring a neural network having a plurality of interconnected nodes including an input layer and an output layer, said neural network being capable of receiving training patterns at said input layer and operative in a plurality of different weighted configurations, each defined by a different set of weight values;
(b) selecting a set of weight values and adjusting the network to operate in a weighted configuration defined by said set of weight values, and inputting each of a plurality of training patterns to said input layer to generate respective evaluations of the training patterns as outputs of the network at said output layer;
(c) comparing each evaluation of a respective training pattern to a desired output of the network to obtain a corresponding error;
(d) determining from all of the errors obtained in said step (c) an overall error value corresponding to the set of weight values;
(e) repeating said steps (b), (c) and (d) a plurality of times, each time with a different weighted configuration defined by a respective different set of weight values, to obtain a plurality of overall error values;
(f) for each of said sets of weight values, determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) generating copies of the sets of weight values selected in said step (g);
(i) for each of the copies, perturbing the weight values thereof generated in said step (h) by adding random numbers to the weight values to create a new set of weight values, the random numbers being obtained randomly from a continuous random distribution of numbers having a mean of zero and a variance which is a function of the overall error value determined for the set of weight values from which the copy was generated;
(j) incrementing a counter each time said steps (b) through (i) are performed, wherein said steps (b) through (e) are performed with at least the weighted configurations defined by the new sets of weight values created in the immediately preceding said step (i), and said steps (f) through (i) are performed with the sets of weight values selected in the immediately preceding said step (g) and with the new sets of weight values created in the immediately preceding said step (i) until the counter reaches a maximum count value;
(k) selecting, once the counter reaches the maximum count value, the set of weight values having a final best score as determined in step (g); and
(l) configuring the neural network to have the plurality of nodes interconnected in accordance with the set of weight values having the final best score.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
19. A method of training a neural network, comprising the steps of:
(a) configuring a neural network having a plurality of interconnected nodes including an input layer and an output layer, said neural network being capable of receiving data and operative in a plurality of different weighted configurations, each defined by a different set of weight values;
(b) adjusting the network to operate in a weighted configuration defined by a set of weight values, and inputting each of a plurality of training patterns to the network to generate evaluations of the respective training patterns;
(c) comparing each evaluation of a respective training pattern to a desired output to obtain a corresponding error;
(d) determining from all of the errors obtained in said step (c) an overall error value corresponding to the set of weight values;
(e) repeating said steps (b), (c) and (d) a plurality of times, each time with a different weighted configuration defined by a respective different set of weight values, to obtain a plurality of overall error values;
(f) for each of said sets of weight values, determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) generating copies of the sets of weight values selected in said step (g);
(i) for each of the copies, (1) generating random numbers from respective continuous random number distributions having a mean of zero and having finite variances and (2) mutating the weight values of the copies by adding the random numbers to the weight values, thereby creating new sets of weight values forming progeny of the sets of weight values selected in said step (g);
(j) incrementing a counter each time said steps (b) through (i) are performed until the counter reaches a maximum count value, wherein said steps (b) through (e) are performed with at least the weighted configurations defined by the new sets of weight values created in the immediately preceding said step (i), and said steps (f) through (i) are performed with the sets of weight values selected in the immediately preceding said step (g) and with the new sets of weight values created in the immediately preceding said step (i), each said repetition of said step (i1) including the step of selecting, by a stochastic process which is independent of the weight values obtained in any preceding performance of said step (i), the variances of the continuous random number distributions from which the random numbers added to weight values of the sets of weight values to create said progeny in said step (i2) are selected;
(k) selecting, once the counter reaches the maximum count value, the set of weight values having a final best score as determined in step (g); and
(l) configuring the neural network to have the plurality of nodes interconnected in accordance with the set of weight values having the final best score.
- View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35)
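Claim 19 differs from claim 1 in its mutation step (i): each weight's mutation variance is itself drawn by a stochastic process that is independent of any previously obtained weight values, rather than being computed from the parent's overall error. A sketch of that step alone, where the exponential distribution and `base_var` are illustrative assumptions; the claim requires only an independent stochastic process yielding finite variances:

```python
import random

def mutate_progeny(weights, base_var=0.25, rnd=random.Random(1)):
    """Claim 19, step (i): for each weight value of a copy,
    (i1) draw a variance by a stochastic process independent of the
         weight values (here, an exponential draw with mean base_var), then
    (i2) add a zero-mean Gaussian random number with that variance."""
    progeny = []
    for w in weights:
        var = rnd.expovariate(1.0 / base_var)           # (i1): finite, weight-independent variance
        progeny.append(w + rnd.gauss(0.0, var ** 0.5))  # (i2): zero-mean perturbation
    return progeny
```

Because the variance draw never consults earlier weight values, the mutation schedule cannot collapse prematurely when the population stalls at a high-error configuration.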
36. A neural network training apparatus, comprising:
(a) a plurality of neural networks each capable of receiving data and operative in a plurality of different weighted configurations, each configuration defined by a different set of weight values;
(b) means for adjusting each network to operate in different weighted configurations defined by a corresponding different set of weight values;
(c) means, responsive to an application of a plurality of training patterns to each of the networks, for generating respective evaluations of the training patterns from each of the networks as outputs of the networks;
(d) means for comparing the evaluations of the training patterns to corresponding desired outputs of the networks to obtain corresponding errors;
(e) means for determining from all of the errors obtained from said comparing means overall error values corresponding to the sets of weight values;
(f) for each of said sets of weight values, means for determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) means for selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) means for generating copies of the sets of weight values selected by said selecting means;
(i) means for generating, for each weight value of the copies generated by said means for generating copies, a corresponding random number from a continuous random distribution of numbers having a mean of zero and a variance which is a function of the overall error value determined by said overall error generating means for the set of weight values from which the copy which includes said each weight value was generated; and
(j) means for mutating the weight values of the copies by adding the corresponding random numbers thereto to create new sets of weight values, said means for adjusting including means for replacing some of the weighted configurations of said network with new weighted configurations based on said new sets of weight values.
37. An apparatus for training a neural network by adjusting weight values through repetitive application of training patterns, comprising:
(a) a plurality of neural networks each capable of receiving data and operative in a plurality of different weighted configurations, each configuration defined by a different set of weight values;
(b) means for adjusting each network to operate in different weighted configurations defined by a corresponding different set of weight values;
(c) first generating means, responsive to an application of a plurality of training patterns to each of the networks, for generating respective evaluations of the training patterns from each of the networks as outputs of the networks;
(d) means for comparing the evaluations of the training patterns to corresponding desired outputs of the networks to obtain corresponding errors;
(e) means for determining from all of the errors obtained from said comparing means overall error values corresponding to the sets of weight values;
(f) for each of said sets of weight values, means for determining a score by selecting respective error comparison values from a predetermined variable probability distribution and comparing thereto the corresponding overall error value;
(g) means for selecting a predetermined number of the sets of weight values determined to have the best scores;
(h) second generating means for generating copies of the sets of weight values selected by said selecting means;
(i) third generating means for generating, by a stochastic process which is independent of weight values defining weighted configurations of the neural network, a corresponding variance value for each weight value of each copy generated by said means for generating copies;
(j) fourth generating means for generating, for each weight value of each copy generated by said means for generating copies, a corresponding random number from a continuous random distribution of numbers having a mean of zero and a variance generated by said third generating means; and
(k) means for mutating the weight values of the copies by adding the corresponding random numbers thereto to create new sets of weight values, said means for adjusting including means for replacing the weighted configurations of some of said networks with new weighted configurations based on said new sets of weight values.
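In both apparatus claims, the adjusting means ends the cycle by replacing the weighted configurations of some of the networks with configurations based on the new sets of weight values. A sketch of one such replacement policy, where keeping the lower-error half and overwriting the rest with progeny is an assumed concrete choice; the claims say only that "some" configurations are replaced:

```python
def replace_some(configurations, errors, progeny):
    """Retain the configurations with the lowest overall error values
    and replace the remainder with the new (mutated) sets of weight
    values, preserving the population size."""
    keep = len(configurations) - len(progeny)
    order = sorted(range(len(configurations)), key=lambda i: errors[i])
    survivors = [configurations[i] for i in order[:keep]]
    return survivors + list(progeny)
```

The survivors re-enter the next cycle unchanged, so the best configuration found so far is never lost to mutation.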
Specification