Method and apparatus for evolving a neural network
Abstract
A method of evolving a neural network that includes a plurality of processing elements interconnected by a plurality of weighted connections includes the step of obtaining a definition for the neural network by evolving a plurality of weights for the plurality of weighted connections, and evolving a plurality of activation function parameters associated with the plurality of processing elements. Another step of the method includes determining whether the definition for the neural network may be simplified based upon at least one activation function parameter of the plurality of activation function parameters. Yet another step of the method includes updating the definition for the neural network in response to determining that the definition for the neural network may be simplified. The method utilizes particle swarm optimization techniques to evolve the plurality of weights and the plurality of activation parameters. Moreover, the method simplifies activation functions of processing elements in response to corresponding activation parameters meeting certain criteria, and removes processing elements from the definition of the neural network in response to corresponding activation parameters satisfying certain criteria. Various apparatus are also disclosed for implementing network evolution and simplification.
20 Claims
1. A method of evolving a neural network comprising a plurality of processing elements interconnected by a plurality of weighted connections, comprising the steps of:
a) obtaining a definition for said neural network by evolving a plurality of weights for said plurality of weighted connections, and evolving a plurality of activation function parameters associated with said plurality of processing elements,
b) determining, based upon a first activation function parameter of said plurality of activation function parameters, whether a first processing element of said plurality of processing elements may be removed from said neural network to simplify the definition for said neural network, and
c) updating said definition for said neural network by removing said first processing element from said definition of said neural network in response to determining that said first processing element may be removed.
3. The method of claim 1, wherein:
step b) further comprises the step of determining that a first processing element of said plurality of processing elements generates a substantially constant output signal regardless of received input signals based upon a first slope factor of said plurality of activation function parameters that is associated with a first activation function implemented by said first processing element; and
step c) further comprises the steps of:
c1) removing said first processing element from said definition of said neural network in response to determining that said first processing element generates said substantially constant output signal, and
c2) updating biasing weighted connections of said plurality of weighted connections associated with a biasing processing element of said plurality of processing elements based upon first weighted connections of said plurality of weighted connections associated with said first processing element in order to substantially reproduce an effect said first processing element had on said plurality of processing elements prior to said first processing element being removed from said definition of said neural network.
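The removal and bias update recited in steps c1) and c2) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than something the claims state: the sigmoid form 1/(1 + exp(-k·x)) with slope factor k, a single hidden layer, a biasing element that always outputs 1, the test "absolute slope below a threshold", and the names `prune_constant_elements`, `forward`, and `slope_threshold`. Under those assumptions, an element with a near-zero slope factor outputs about 0.5 for any input, so its effect can be folded into the bias weights.

```python
import numpy as np

def sigmoid(x, k):
    """Sigmoid with slope factor k (assumed form, not taken from the claims)."""
    return 1.0 / (1.0 + np.exp(-k * x))

def forward(x, w_in, w_out, b_out, slopes):
    """Output of a one-hidden-layer network; hidden biases omitted for brevity."""
    hidden = sigmoid(w_in @ x, slopes)
    return w_out @ hidden + b_out

def prune_constant_elements(w_in, w_out, b_out, slopes, slope_threshold=1e-3):
    """Remove hidden elements whose output is (almost) constant.

    w_in:   (n_hidden, n_inputs) weights into the hidden elements
    w_out:  (n_outputs, n_hidden) weights out of the hidden elements
    b_out:  (n_outputs,) weights from a bias element that always outputs 1
    slopes: (n_hidden,) slope factors of the hidden sigmoids
    """
    constant = np.abs(slopes) < slope_threshold
    # sigmoid(x, 0) == 0.5, so each removed element contributed roughly
    # 0.5 * (its outgoing weight) to every downstream element.
    b_new = b_out + 0.5 * w_out[:, constant].sum(axis=1)
    keep = ~constant
    return w_in[keep], w_out[:, keep], b_new, slopes[keep]
```

Calling `forward` before and after `prune_constant_elements` on the same input should give nearly identical outputs, which is the sense in which the bias update "substantially reproduces" the removed element's effect.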
2. A method of evolving a neural network comprising a plurality of processing elements interconnected by a plurality of weighted connections, comprising the steps of:
a) obtaining a definition for said neural network by evolving a plurality of weights for said plurality of weighted connections, and evolving a plurality of activation function parameters associated with said plurality of processing elements,
b) determining whether said definition for said neural network may be simplified by determining whether a first slope factor of said plurality of activation function parameters has a predetermined relationship to a slope threshold, and
c) updating said definition for said neural network by removing a first processing element implementing a sigmoid activation function from said definition of said neural network in response to determining that said first slope factor has said predetermined relationship to said slope threshold.
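To see why a slope factor can signal that a sigmoid element is removable or replaceable, it helps to write out the limits of one common parameterization. The form below is an assumption; the claims do not give the sigmoid's formula, and they leave the "predetermined relationship" to the slope threshold open.

```latex
f_k(x) = \frac{1}{1 + e^{-k x}}, \qquad
\lim_{k \to 0} f_k(x) = \tfrac{1}{2} \ \text{for all } x, \qquad
\lim_{k \to \infty} f_k(x) =
\begin{cases}
0, & x < 0 \\
\tfrac{1}{2}, & x = 0 \\
1, & x > 0
\end{cases}
```

Under this assumed form, a slope factor near zero makes the element's output effectively constant (a candidate for removal), while a very large slope factor makes it behave like a step function.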
4. A method of evolving a neural network comprising a plurality of processing elements interconnected by a plurality of weighted connections, comprising the steps of:
a) obtaining a definition for said neural network by evolving a plurality of weights for said plurality of weighted connections, and evolving a plurality of activation function parameters associated with said plurality of processing elements, said obtaining the definition for said neural network further comprising
a1) initializing a swarm of particles in which each particle has a position in a hyperspace that represents a separate definition for said neural network, and a velocity vector that represents motion of said particle through said hyperspace,
a2) determining for each particle of said swarm, a fitness value for said respective definition of said neural network,
a3) determining based upon said fitness values whether termination criteria have been satisfied,
a4) updating for said each particle of said swarm, a personal best value and a personal best position based upon said respective fitness value for said each particle,
a5) updating for said each particle of said swarm, a local best value and a local best position based upon fitness values associated with a respective group of said particles,
a6) updating for said each particle of said swarm, said position and said velocity vector for said particle based upon said personal best position for said particle, said local best position for said particle, and said velocity vector for said particle, and
a7) repeating steps a2), a3), a4), a5), and a6) until said termination criteria have been satisfied;
b) determining whether said definition for said neural network may be simplified based upon at least one activation function parameter of said plurality of activation function parameters; and
c) updating said definition for said neural network in response to determining that said definition for said neural network may be simplified.
5. The method of claim 4, wherein step a6) comprises updating said each particle of said swarm such that said velocity vector has more effect on early updates of said each particle than said velocity vector has on later updates of said each particle.
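Steps a1) through a7), together with the decreasing influence of the velocity vector, correspond closely to a local-best particle swarm optimizer with a decaying inertia weight. The following is a sketch under stated assumptions, not the patented implementation: the ring neighborhood of three particles, the linear inertia schedule from `w_start` to `w_end`, the acceleration constants, and the function name `pso_minimize` are all illustrative choices, and lower fitness is assumed to be better.

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=20, max_iters=200, target=1e-6,
                 w_start=0.9, w_end=0.4, c1=1.49445, c2=1.49445, seed=0):
    """Local-best PSO with a linearly decreasing inertia weight (illustrative)."""
    rng = np.random.default_rng(seed)
    # a1) each particle gets a position and a velocity vector
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, dim))
    vel = rng.uniform(-0.1, 0.1, size=(n_particles, dim))
    pbest_pos = pos.copy()
    pbest_val = np.full(n_particles, np.inf)
    idx = np.arange(n_particles)
    # Ring neighborhood: each particle sees itself and its two index neighbors.
    neigh = np.stack([(idx - 1) % n_particles, idx, (idx + 1) % n_particles], axis=1)

    for it in range(max_iters):
        vals = np.array([fitness(p) for p in pos])          # a2) fitness values
        done = vals.min() <= target                         # a3) termination test
        improved = vals < pbest_val                         # a4) personal bests
        pbest_val[improved] = vals[improved]
        pbest_pos[improved] = pos[improved]
        if done:
            break
        # a5) local best value and position within each particle's group
        lbest_idx = neigh[idx, np.argmin(pbest_val[neigh], axis=1)]
        lbest_val = pbest_val[lbest_idx]  # kept only to mirror the claim wording
        lbest_pos = pbest_pos[lbest_idx]
        # Inertia shrinks over time, so the old velocity matters more early on.
        w = w_start + (w_end - w_start) * it / max(max_iters - 1, 1)
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # a6) velocity and position update
        vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (lbest_pos - pos)
        pos = pos + vel
    best = int(np.argmin(pbest_val))                        # a7) loop ends above
    return pbest_pos[best], float(pbest_val[best])
```

For example, `pso_minimize(lambda p: float(np.sum(p ** 2)), dim=3)` searches for the minimum of a simple quadratic bowl. In the patent's setting the fitness function would instead score a candidate network definition.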
6. A computer readable medium for evolving a neural network comprising a plurality of processing elements interconnected by a plurality of weighted connections, said computer readable medium comprising code which when executed by a network evolution system causes said network evolution system to:
obtain a definition for said neural network by evolving a plurality of weights for said plurality of weighted connections, and evolving a plurality of activation function parameters associated with said plurality of processing elements;
determine whether said definition for said neural network may be simplified based upon at least one activation function parameter of said plurality of activation function parameters; and
update said definition for said neural network in response to determining that said definition for said neural network may be simplified.
7. The computer readable medium of claim 6, wherein said code when executed by said network evolution system further causes said network evolution system to:
determine whether said definition for said neural network may be simplified by determining, based upon a first activation function parameter of said plurality of activation parameters, whether a first activation function of a first processing element of said plurality of processing elements may be implemented with a less complex activation function, and update said definition for said neural network by replacing said first activation function with said less complex activation function in response to determining that said first activation function may be implemented with said less complex activation function.
8. The computer readable medium of claim 6, wherein said code when executed by said network evolution system further causes said network evolution system to:
determine whether said definition for said neural network may be simplified by determining whether a first slope factor of said plurality of activation function parameters has a predetermined relationship to a slope threshold, and update said definition for said neural network by replacing a sigmoid activation function of a first processing element with a step activation function in response to determining that said first slope factor has said predetermined relationship to said slope threshold.
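The sigmoid-to-step replacement can be sketched in a few lines of Python. The claim leaves the "predetermined relationship" unspecified; this sketch assumes it means "slope factor greater than the threshold", and the `Element` class, the `simplify_to_step` function, and the threshold value of 50 are hypothetical names and numbers chosen for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Element:
    """A processing element with a slope factor and an activation type."""
    slope: float
    activation: str = "sigmoid"

    def output(self, x: float) -> float:
        if self.activation == "step":
            # Cheaper to evaluate than the sigmoid: one comparison.
            return 1.0 if x >= 0.0 else 0.0
        # Clamp the exponent so math.exp cannot overflow for steep slopes.
        z = max(min(self.slope * x, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(-z))

def simplify_to_step(elements, slope_threshold=50.0):
    """Swap steep sigmoids for step functions; return how many were replaced."""
    replaced = 0
    for element in elements:
        if element.activation == "sigmoid" and element.slope > slope_threshold:
            element.activation = "step"
            replaced += 1
    return replaced
```

The two activations agree closely away from zero input. Exactly at zero the sigmoid gives 0.5 while this step gives 1.0, which is one reason the replacement is only approximately equivalent.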
9. The computer readable medium of claim 6, wherein said code when executed by said network evolution system further causes said network evolution system to:
determine whether said definition for said neural network may be simplified by determining, based upon a first activation function parameter of said plurality of activation parameters, whether a first processing element of said plurality of processing elements may be removed from said neural network, and update said definition for said neural network by removing said first processing element from said definition of said neural network in response to determining that said first processing element may be removed.
10. The computer readable medium of claim 6, wherein said code when executed by said network evolution system further causes said network evolution system to:
determine whether said definition for said neural network may be simplified by determining whether a first slope factor of said plurality of activation function parameters has a predetermined relationship to a slope threshold, and update said definition for said neural network by removing a first processing element implementing a sigmoid activation function from said definition of said neural network in response to determining that said first slope factor has said predetermined relationship to said slope threshold.
11. The computer readable medium of claim 6, wherein said code when executed by said network evolution system further causes said network evolution system to:
determine whether said definition for said neural network may be simplified by determining that a first processing element of said plurality of processing elements generates a substantially constant output signal regardless of received input signals based upon a first slope factor of said plurality of activation function parameters that is associated with a first activation function implemented by said first processing element; and
update said definition for said neural network by (i) removing said first processing element from said definition of said neural network in response to determining that said first processing element generates said substantially constant output signal, and (ii) updating biasing weighted connections of said plurality of weighted connections associated with a biasing processing element of said plurality of processing elements based upon first weighted connections of said plurality of weighted connections associated with said first processing element in order to substantially reproduce an effect said first processing element had on said plurality of processing elements prior to said first processing element being removed from said definition of said neural network.
12. The computer readable medium of claim 6, wherein said code when executed by said network evolution system further causes said network evolution system to obtain a definition for said neural network by:
a) initializing a swarm of particles in which each particle has a position in a hyperspace that represents a separate definition for said neural network, and a velocity vector that represents motion of said particle through said hyperspace;
b) determining for each particle of said swarm, a fitness value for said respective definition of said neural network;
c) determining based upon said fitness values whether termination criteria have been satisfied;
d) updating for said each particle of said swarm, a personal best value and a personal best position based upon said respective fitness value for said each particle;
e) updating for said each particle of said swarm, a local best value and a local best position based upon fitness values associated with a respective group of said particles;
f) updating for said each particle of said swarm, said position and said velocity vector for said particle based upon said personal best position for said particle, said local best position for said particle, and said velocity vector for said particle; and
g) repeating b), c), d), e), and f) until said termination criteria have been satisfied.
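Step a) says each particle's position represents a separate network definition, and step b) assigns each definition a fitness value. One way to picture this is to treat the position as a flat vector holding every weight and every slope factor. The layout below (hidden-layer weights, then output weights, then slope factors), the single hidden layer, the linear output elements, and the mean-squared-error fitness are all assumptions made for this sketch; `decode`, `network_output`, and `fitness` are hypothetical names.

```python
import numpy as np

def decode(position, n_in, n_hidden, n_out):
    """Split a flat particle position into weights and slope factors."""
    # "+ 1" in each block is the weight from a bias element that outputs 1.
    sizes = [n_hidden * (n_in + 1), n_out * (n_hidden + 1), n_hidden]
    if position.size != sum(sizes):
        raise ValueError("position length does not match the network shape")
    w_hid, w_out, slopes = np.split(position, np.cumsum(sizes)[:-1])
    return (w_hid.reshape(n_hidden, n_in + 1),
            w_out.reshape(n_out, n_hidden + 1),
            slopes)

def network_output(position, x, n_in, n_hidden, n_out):
    """Evaluate the network that a particle position encodes."""
    w_hid, w_out, slopes = decode(position, n_in, n_hidden, n_out)
    x1 = np.append(x, 1.0)                                # append bias input
    hidden = 1.0 / (1.0 + np.exp(-slopes * (w_hid @ x1)))  # per-element slope
    return w_out @ np.append(hidden, 1.0)                 # linear output layer

def fitness(position, inputs, targets, n_in, n_hidden, n_out):
    """Mean squared error over a data set; lower is better."""
    preds = np.array([network_output(position, x, n_in, n_hidden, n_out)
                      for x in inputs])
    return float(np.mean((preds - targets) ** 2))
```

Because the slope factors travel in the same vector as the weights, the swarm evolves both together, and the evolved slope factors are then available to a later simplification pass.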
13. The computer readable medium of claim 12, wherein said code when executed by said network evolution system further causes said network evolution system to update said position and said velocity vector for said each particle of said swarm by:
updating said each particle of said swarm such that said velocity vector has more effect on early updates of said each particle than said velocity vector has on later updates of said each particle.
14. A network evolution system for evolving a neural network comprising a plurality of processing elements interconnected by a plurality of weighted connections, said network evolution system comprising:
a network evolver operable to obtain a definition for said neural network by evolving a plurality of weights for said plurality of weighted connections, and evolving a plurality of activation function parameters associated with said plurality of processing elements; and
a network simplifier operable to (i) determine whether said definition for said neural network may be simplified based upon at least one activation function parameter of said plurality of activation function parameters, and (ii) update said definition for said neural network in response to determining that said definition for said neural network may be simplified.
15. The network evolution system of claim 14, wherein said network simplifier is further operable to:
determine whether said definition for said neural network may be simplified by determining, based upon a first activation function parameter of said plurality of activation parameters, whether a first activation function of a first processing element of said plurality of processing elements may be implemented with a less complex activation function, and update said definition of said neural network by replacing said first activation function with said less complex activation function in response to determining that said first activation function may be implemented with said less complex activation function.
16. The network evolution system of claim 14, wherein said network simplifier is further operable to:
determine whether said definition for said neural network may be simplified by determining whether a first slope factor of said plurality of activation function parameters has a predetermined relationship to a slope threshold, and update said definition of said neural network by replacing a sigmoid activation function of a first processing element with a step activation function in response to determining that said first slope factor has said predetermined relationship to said slope threshold.
17. The network evolution system of claim 14, wherein said network simplifier is further operable to:
determine whether said definition for said neural network may be simplified by determining, based upon a first activation function parameter of said plurality of activation parameters, whether a first processing element of said plurality of processing elements may be removed from said neural network, and update said definition of said neural network by removing said first processing element from said definition of said neural network in response to determining that said first processing element may be removed.
18. The network evolution system of claim 14, wherein said network simplifier is further operable to:
determine whether said definition for said neural network may be simplified by determining whether a first slope factor of said plurality of activation function parameters has a predetermined relationship to a slope threshold, and update said definition of said neural network by removing a first processing element implementing a sigmoid activation function from said definition of said neural network in response to determining that said first slope factor has said predetermined relationship to said slope threshold.
19. The network evolution system of claim 14, wherein said network simplifier is further operable to:
determine whether said definition for said neural network may be simplified by determining that a first processing element of said plurality of processing elements generates a substantially constant output signal regardless of received input signals based upon a first slope factor of said plurality of activation function parameters that is associated with a first activation function implemented by said first processing element, and update said definition of said neural network by (i) removing said first processing element from said definition of said neural network in response to determining that said first processing element generates said substantially constant output signal, and (ii) updating biasing weighted connections of said plurality of weighted connections associated with a biasing processing element of said plurality of processing elements based upon first weighted connections of said plurality of weighted connections associated with said first processing element in order to substantially reproduce an effect said first processing element had on said plurality of processing elements prior to said first processing element being removed from said definition of said neural network.
20. The network evolution system of claim 14, wherein said network evolver is further operable to obtain said definition of said neural network by:
a) initializing a swarm of particles in which each particle has a position in a hyperspace that represents a separate definition for said neural network, and a velocity vector that represents motion of said particle through said hyperspace;
b) determining for each particle of said swarm, a fitness value for said respective definition of said neural network;
c) determining based upon said fitness values whether termination criteria have been satisfied;
d) updating for said each particle of said swarm, a personal best value and a personal best position based upon said respective fitness value for said each particle;
e) updating for said each particle of said swarm, a local best value and a local best position based upon fitness values associated with a respective group of said particles;
f) updating for said each particle of said swarm, said position and said velocity vector for said particle based upon said personal best position for said particle, said local best position for said particle, and said velocity vector for said particle; and
g) repeating b), c), d), e), and f) until said termination criteria have been satisfied.
Specification