Neural network for improved classification of patterns which adds a best performing trial branch node to the network
First Claim
1. A network for classification of a plurality of patterns in unknown input data comprising:
a plurality of processing elements, including a plurality of leaf nodes, each for receiving an input signal from a plurality of input nodes and for providing a plurality of output values therefrom to a plurality of output nodes, each processing element having at least one input weight associated with each input signal;
supervision means for comparison of each of said plurality of output values to a known classification for a corresponding training example input signal and for generation of an error signal;
adjustment means for determining changes in each input weight in response to said error signal from said supervision means;
identification means for selecting a leaf node of said plurality which has the greatest potential to decrease said error signal, said identification means including an accumulation means and a comparison means, said accumulation means for receiving and counting for each of said leaf nodes an activation value comprising the number of times a given leaf node is activated in response to a plurality of training example input signals and said comparison means for comparing said activation value to a first preselected statistical value to test for accept/reject criteria; and
a pool of trial branch nodes within said plurality of processing elements from which a best performing trial branch node is selected and used in place of said leaf node which has the greatest potential to decrease said error signal, said best performing trial branch node branching into two said leaf nodes connected to each of said plurality of output nodes;
wherein said supervision means generates a continue training command when said plurality of output values fails to match said known classification and generates a stop training command when said plurality of output values matches said known classification.
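The accumulation, comparison, and supervision elements recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the representation of an activation as "the leaf that handled a training example", and the direction of the threshold test (a count at or above the threshold means accept) are all assumptions made for illustration; the claim only states that the activation count is compared with a preselected statistical value.

```python
from collections import Counter

def count_activations(active_leaf_per_example):
    """Accumulate, for each leaf, how many training examples activated it.

    `active_leaf_per_example` is a hypothetical list holding the identifier
    of the leaf activated by each training example.
    """
    return Counter(active_leaf_per_example)

def accept_leaves(activation_counts, threshold):
    """Compare each activation count with a preselected threshold.

    Assumption: a leaf is accepted when its count reaches the threshold.
    """
    return {leaf: count >= threshold for leaf, count in activation_counts.items()}

def supervision_command(outputs, targets):
    """Return "stop" when every output matches its known classification,
    otherwise "continue"."""
    return "stop" if list(outputs) == list(targets) else "continue"
```

For example, with activations `["a", "b", "a", "a"]` and a threshold of 2, leaf `a` would be accepted and leaf `b` rejected under these assumptions.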
Abstract
Each processing element has a number of weights for each input connection; these weights are the coefficients of a polynomial equation. The use of quadratic nodes permits discrimination, in a grey-scale image, between body pixels and edge pixels, in which an intermediate value is present. In the training method of the present invention, the middle layer initially consists of a single leaf node, which is connected to each output node. The contribution of each leaf node to the total output error is determined, and the weights of the inputs to the leaf nodes are adjusted to minimize the error. The leaf node that has the best chance of improving the total output error is then "converted" into a branch node with two leaves, using a branch node selected from a pool of trial branch nodes to replace the chosen leaf node. The trial branch nodes are trained by gradient training to optimize the branch error function. From the set of trial branch nodes, the best-performing node is selected and substituted for the previously selected leaf node. Two new leaf nodes are then created from the newly substituted best-performing branch node. A leaf node is accepted or rejected based on the number of times it was activated relative to the correctness of the classification. Once a leaf node is rejected, it is eliminated from any further operation, thereby minimizing the size of the network. The network can use integer mathematics, so a separate floating-point coprocessor is not required.
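To illustrate why polynomial weights help with intermediate grey values, the following Python sketch evaluates a quadratic node. The exact polynomial form is an assumption: it uses one quadratic and one linear coefficient per input plus a bias, with no cross terms, and the function name and coefficient values are hypothetical.

```python
def quadratic_node(inputs, quad_weights, lin_weights, bias):
    """Evaluate bias + sum(a_i * x_i**2 + b_i * x_i) over all inputs.

    Assumed form: two weights (a_i, b_i) per input connection, no cross terms.
    """
    return bias + sum(
        a * x * x + b * x
        for x, a, b in zip(inputs, quad_weights, lin_weights)
    )

# Hypothetical coefficients for a single grey value scaled to [0, 1]:
# the response is positive near mid-grey and negative at both extremes.
for grey in (0.0, 0.5, 1.0):
    print(grey, quadratic_node([grey], [-1.0], [1.0], -0.125))
```

With these coefficients the node responds positively only to the intermediate value, a separation that a single linear weight on the same input cannot produce, because a linear response is monotonic in the grey value.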
49 Citations
17 Claims
1. A network for classification of a plurality of patterns in unknown input data comprising:
a plurality of processing elements, including a plurality of leaf nodes, each for receiving an input signal from a plurality of input nodes and for providing a plurality of output values therefrom to a plurality of output nodes, each processing element having at least one input weight associated with each input signal;
supervision means for comparison of each of said plurality of output values to a known classification for a corresponding training example input signal and for generation of an error signal;
adjustment means for determining changes in each input weight in response to said error signal from said supervision means;
identification means for selecting a leaf node of said plurality which has the greatest potential to decrease said error signal, said identification means including an accumulation means and a comparison means, said accumulation means for receiving and counting for each of said leaf nodes an activation value comprising the number of times a given leaf node is activated in response to a plurality of training example input signals and said comparison means for comparing said activation value to a first preselected statistical value to test for accept/reject criteria; and
a pool of trial branch nodes within said plurality of processing elements from which a best performing trial branch node is selected and used in place of said leaf node which has the greatest potential to decrease said error signal, said best performing trial branch node branching into two said leaf nodes connected to each of said plurality of output nodes;
wherein said supervision means generates a continue training command when said plurality of output values fails to match said known classification and generates a stop training command when said plurality of output values matches said known classification.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for training a network for classification of a plurality of patterns in unknown input data comprising:
selecting a plurality of processing elements for receiving an input signal from a plurality of input nodes and for providing a plurality of output values to a plurality of output nodes with each processing element having at least one input weight associated with each input signal;
performing gradient training on said plurality of output nodes to minimize output error;
identifying a best leaf node within said plurality of processing elements which have the best chance of improving output error by accumulating the number of activations of each of said plurality of leaf nodes and comparing said number of activations to a preselected statistical value to test for compliance with an accept/reject criteria;
selecting a trial branch node with the best performance from a pool of trial branch nodes which have been trained to minimize output error;
substituting said trial branch node for said best leaf node; and
creating two new leaf nodes from the outputs of said trial branch node and testing said two new leaf nodes.
Dependent claims: 12, 13, 14, 15, 16, 17
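The selection and substitution steps of the method can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed method itself: the trial branch nodes are modelled as already-trained boolean split functions (the gradient training step is omitted), `branch_error` is a hypothetical error measure for two-class labels, and the `Leaf` and `Branch` classes and all function names are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class Leaf:
    name: str

@dataclass
class Branch:
    split: Callable[[Sequence[float]], bool]  # True routes an example to `right`
    left: object
    right: object

def branch_error(split, examples, labels):
    """Hypothetical branch error: misrouted examples for 0/1 labels,
    taking the better of the two possible side-to-class assignments."""
    wrong = sum(int(split(x)) != y for x, y in zip(examples, labels))
    return min(wrong, len(labels) - wrong)

def select_best_trial(pool, examples, labels):
    """Pick the trial branch node with the lowest branch error."""
    return min(pool, key=lambda split: branch_error(split, examples, labels))

def split_leaf(tree, target, split):
    """Substitute a branch node for `target` and create two new leaves."""
    if tree is target:
        return Branch(split, Leaf(target.name + "/0"), Leaf(target.name + "/1"))
    if isinstance(tree, Branch):
        tree.left = split_leaf(tree.left, target, split)
        tree.right = split_leaf(tree.right, target, split)
    return tree
```

A possible use, with a hypothetical pool of three threshold splits on one feature:

```python
root = Leaf("L0")
pool = [lambda x: x[0] > 0.2, lambda x: x[0] > 0.5, lambda x: x[0] > 0.9]
examples, labels = [[0.1], [0.3], [0.6], [0.8]], [0, 0, 1, 1]
best = select_best_trial(pool, examples, labels)  # the 0.5 threshold separates the classes
root = split_leaf(root, root, best)               # root is now a Branch with two new leaves
```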
Specification