Method for improving neural network architectures using evolutionary algorithms
Abstract
The noise associated with conventional techniques for evolutionary improvement of neural network architectures is reduced so that an optimum architecture can be determined more efficiently and more effectively. Parameters that affect the initialization of a neural network architecture are included within the encoding that is used by an evolutionary algorithm to optimize the neural network architecture. The example initialization parameters include an encoding that determines the initial nodal weights used in each architecture at the commencement of the training cycle. By including the initialization parameters within the encoding used by the evolutionary algorithm, initialization parameters that have a positive effect on the performance of the resultant evolved network architecture are propagated and potentially improved from generation to generation. Conversely, initialization parameters that, for example, cause the resultant evolved network to be poorly trained will not be propagated. In accordance with a second aspect of this invention, the encoding also includes parameters that affect the training process, such as the duration of the training cycle, the training inputs applied, and so on. In accordance with a third aspect of this invention, the same set of training or evaluation inputs is applied to all members whose performances are directly compared.
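The core idea of the abstract, that the initial condition travels with the chromosome, can be sketched in a few lines. This is an illustrative sketch, not the patented implementation; the `Chromosome` fields and the seed-based decoding are assumptions chosen for brevity.

```python
import random
from dataclasses import dataclass

@dataclass
class Chromosome:
    hidden_sizes: list   # architectural genes: nodes at each hidden level (assumed encoding)
    init_seed: int       # initialization gene: fixes the starting nodal weights

def initial_weights(chrom, n_inputs):
    """Decode the initialization gene into concrete starting weights."""
    rng = random.Random(chrom.init_seed)   # same gene -> same initial condition
    sizes = [n_inputs] + chrom.hidden_sizes
    return [[[rng.uniform(-1, 1) for _ in range(fan_in)]
             for _ in range(fan_out)]
            for fan_in, fan_out in zip(sizes, sizes[1:])]

# Because the seed is part of the chromosome, a favorable starting condition
# survives selection and is inherited by offspring exactly.
w1 = initial_weights(Chromosome([4, 2], init_seed=7), n_inputs=3)
w2 = initial_weights(Chromosome([4, 2], init_seed=7), n_inputs=3)
assert w1 == w2
```

The design point is that the seed, not the full weight matrix, is what the evolutionary algorithm mutates and propagates, which keeps the chromosome compact while still defining a complete initial condition.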
15 Claims
1. A method for enabling a determination of a preferred neural network architecture, the method comprising:
enabling an encoding of each chromosome of a plurality of chromosomes, each chromosome being associated with a neural network of a plurality of neural networks, each chromosome including:
a first parameter that defines a complete initial condition of the associated neural network at commencement of a training cycle, and
a second parameter that defines an architectural feature of the associated neural network;
enabling an evaluation of each neural network of the plurality of neural networks, based on the initial condition and the architectural feature of each neural network, to provide a measure of effectiveness associated with each chromosome; and
enabling a selection of the preferred neural network architecture based on the measure of effectiveness associated with each chromosome.
2. The method of claim 1, wherein the first parameter includes at least one of:
an initial node weight associated with a node of the associated neural network,
an identification of a training parameter associated with the associated neural network,
an index that is used to determine further parameters, and
a selector that is used to determine a subset of parameters that are used to initialize the associated neural network.
3. The method of claim 1, wherein the second parameter includes at least one of:
a number of node levels of the associated neural network,
a number of nodes at each node level of the associated neural network, and
an index that is used to determine further parameters.
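The architectural genes of claim 3 (a number of node levels and a number of nodes at each level) decode into a concrete layer layout. A minimal sketch, with the function name and input/output handling assumed for illustration:

```python
def decode_architecture(levels, nodes_per_level, n_inputs, n_outputs):
    """Turn the architecture genes into the full list of layer widths."""
    # The gene for the number of levels must agree with the per-level gene list.
    assert levels == len(nodes_per_level)
    return [n_inputs] + list(nodes_per_level) + [n_outputs]

layout = decode_architecture(levels=2, nodes_per_level=[8, 4],
                             n_inputs=3, n_outputs=1)
# layout is [3, 8, 4, 1]: input layer, two hidden levels, output layer
```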
4. The method of claim 1, further including:
enabling a training of each neural network of the plurality of neural networks, wherein the training of each neural network is based on a set of training vectors applied to each neural network, the same set of training vectors being applied to each neural network.
5. The method of claim 1, wherein the evaluation of each neural network is based on a set of evaluation vectors applied to each neural network, the same set of evaluation vectors being applied to each neural network.
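Claims 4 and 5 impose a fairness constraint: every network in a generation is trained and scored on the same vector sets, so fitness differences reflect the chromosomes rather than sampling luck. A sketch of that constraint, with `train` and `score` standing in for any training and evaluation routine:

```python
def evaluate_generation(networks, training_vectors, evaluation_vectors,
                        train, score):
    """Train and score every network on identical vector sets."""
    fitnesses = []
    for net in networks:
        train(net, training_vectors)                       # same training set for all
        fitnesses.append(score(net, evaluation_vectors))   # same evaluation set for all
    return fitnesses
```

This is the noise-reduction mechanism the abstract describes: with shared vectors, a difference in the measure of effectiveness can be attributed to the encoded parameters being compared.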
6. The method of claim 1, further including:
enabling a training of each neural network of the plurality of neural networks, and wherein the chromosome further includes a third parameter that defines a training parameter that affects the training of the associated neural network.
7. The method of claim 6, wherein the training parameter includes at least one of:
a time duration limit,
a quantity of input limit,
a performance threshold, and
an item that affects a selection of training input vectors.
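Two of the training-parameter genes in claim 7, the quantity-of-input limit and the performance threshold, can be sketched as stopping conditions in a training loop. The function and the `step` callback (which applies one training vector and returns the current error) are assumptions for illustration:

```python
def train_with_genes(step, vectors, max_inputs, error_threshold):
    """Train until either evolved limit is reached; return inputs consumed."""
    used = 0
    for v in vectors:
        if used >= max_inputs:       # quantity of input limit (evolved gene)
            break
        error = step(v)
        used += 1
        if error <= error_threshold: # performance threshold (evolved gene)
            break
    return used
```

Because these limits sit on the chromosome, a generation can discover, for example, that shorter training cycles suffice for a given architecture.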
8. The method of claim 1, further including:
enabling a selection of a plurality of preferred neural network architectures based on the measure of effectiveness associated with each chromosome;
enabling a production of a next generation plurality of chromosomes based on the measure of effectiveness associated with each chromosome, each next generation chromosome of the next generation plurality of chromosomes having a determinable corresponding next generation neural network of a plurality of next generation neural networks; and
enabling an evaluation of each next generation neural network of the plurality of next generation neural networks, based on the initial condition and the architectural feature of each next generation neural network, to provide a measure of effectiveness associated with each next generation chromosome,
wherein the selection of the plurality of preferred neural network architectures is further based on the measure of effectiveness associated with each next generation chromosome.
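The generational cycle of claim 8 (score, select, produce a next generation, re-evaluate) reduces to a short loop body. A minimal sketch, with `fitness` and `mutate` standing in for the evaluation and variation operators; survivor truncation to the fitter half is an assumed selection scheme, not the claimed one:

```python
import random

def next_generation(population, fitness, mutate, rng):
    """Select the fitter half and refill the population with their offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[:len(ranked) // 2]              # selection by effectiveness
    offspring = [mutate(rng.choice(survivors), rng)    # next generation chromosomes
                 for _ in survivors]
    return survivors + offspring                       # population size preserved
```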
9. A method for enabling a determination of at least one preferred neural network architecture, the method comprising:
enabling a definition of a plurality of first generation network architectures;
enabling a selection of a first random set of training input vectors;
enabling a training of each network architecture of the plurality of first generation network architectures based on the first random set of training input vectors to form a corresponding plurality of trained first generation network architectures;
enabling an evaluation of each trained first generation network architecture of the plurality of trained first generation network architectures to provide a measure of effectiveness associated with each trained first generation network architecture;
enabling a definition of a plurality of second generation network architectures based on the measure of effectiveness associated with each trained first generation network architecture;
enabling a selection of a second random set of training input vectors;
enabling a training of each network architecture of the plurality of second generation network architectures based on the second random set of training input vectors to form a corresponding plurality of trained second generation network architectures;
enabling an evaluation of each trained second generation network architecture of the plurality of trained second generation network architectures to provide a measure of effectiveness associated with each trained second generation network architecture; and
enabling a selection of the at least one preferred neural network architecture based on the measure of effectiveness associated with each trained second generation network architecture.
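Claim 9's resampling pattern, a fresh random set of training input vectors per generation but a single shared draw within each generation, can be sketched with a generation-indexed random draw. The seeding scheme here is one assumed way to make the draw shared and reproducible:

```python
import random

def sample_training_set(pool, k, generation, base_seed=0):
    """Draw this generation's shared random training vectors from the pool."""
    rng = random.Random(base_seed + generation)  # one draw per generation
    return rng.sample(pool, k)
```

Every architecture in generation `g` trains on `sample_training_set(pool, k, g)`, so within-generation comparisons stay fair while successive generations see different samples of the problem.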
10. A method for enabling a determination of at least one preferred neural network architecture, the method comprising:
enabling a definition of a plurality of first generation network architectures;
enabling a training of each network architecture of the plurality of first generation network architectures to form a corresponding plurality of trained first generation network architectures;
enabling a selection of a first random set of evaluation input vectors;
enabling an evaluation of each trained first generation network architecture of the plurality of trained first generation network architectures based on the first random set of evaluation input vectors to provide a measure of effectiveness associated with each trained first generation network architecture;
enabling a definition of a plurality of second generation network architectures based on the measure of effectiveness associated with each trained first generation network architecture;
enabling a training of each network architecture of the plurality of second generation network architectures to form a corresponding plurality of trained second generation network architectures;
enabling a selection of a second random set of evaluation input vectors;
enabling an evaluation of each trained second generation network architecture of the plurality of trained second generation network architectures based on the second random set of evaluation input vectors to provide a measure of effectiveness associated with each trained second generation network architecture; and
enabling a selection of the at least one preferred neural network architecture based on the measure of effectiveness associated with each trained second generation network architecture.
11. A system comprising:
a neural network device that provides an output vector in response to an input vector that is applied to the neural network, the output vector being dependent upon an initial condition of the neural network; and
an evolutionary algorithm device, operably coupled to the neural network device, that is configured to provide:
a network architecture parameter that affects the neural network, and
a network initialization parameter that affects the initial condition of the neural network at commencement of a training cycle,
based on an evaluation of an effectiveness of another output vector provided by the neural network device.
12. The system of claim 11, wherein the neural network device comprises:
at least one input node that receives the input vector,
at least one output node that provides the output vector, and
at least one intermediate node, operably coupled to the at least one input node and the at least one output node, that communicates an effect from the at least one input node to the at least one output node, the effect being dependent upon a nodal weight factor associated with the at least one intermediate node, and wherein:
the initialization parameter includes an initial value of the nodal weight factor.
13. The system of claim 12, wherein the evolutionary algorithm device comprises:
a performance evaluator that determines the effectiveness of the other output vector, and
an offspring generator, operably coupled to the performance evaluator, that determines the network architecture parameter and the network initialization parameter based on the effectiveness of the other output vector.
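The offspring generator of claim 13 varies both the architecture parameter and the initialization parameter of a parent, so good starting conditions propagate alongside good shapes. A hypothetical sketch; the mutation operators (±1 node jitter, occasional reseeding) are assumptions, not the claimed mechanism:

```python
import random

def make_offspring(parent_arch, parent_init_seed, rng):
    """Return a mutated (architecture, initialization) gene pair."""
    # Jitter each level's node count by at most one, never below one node.
    child_arch = [max(1, n + rng.choice([-1, 0, 1])) for n in parent_arch]
    # Usually inherit the parent's initial condition; occasionally reseed.
    child_seed = (parent_init_seed if rng.random() < 0.5
                  else rng.randrange(2**32))
    return child_arch, child_seed
```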
14. The system of claim 13, wherein the evolutionary algorithm device further comprises:
a selector that selects a better performing network based on the effectiveness of the other output vector, and wherein the offspring generator determines the network architecture parameter and the network initialization parameter based on an architecture parameter and an initialization parameter of the better performing network.
15. The system of claim 11, wherein:
the neural network device includes a training mode, wherein parameters of the neural network are affected by a training set of input vectors, and
the evolutionary algorithm device further provides a training parameter that affects the training mode of the network architecture, based on an evaluation of the effectiveness of the other output vector.
Specification