Neural network for improved classification of patterns which adds a best performing trial branch node to the network

US 5,371,809 A
Filed: 03/30/1992
Issued: 12/06/1994
Est. Priority Date: 03/30/1992
Status: Expired due to Fees

Abstract

Each processing element has a number of weights for each input connection. These weights are coefficients of a polynomial equation. The use of quadratic nodes permits discrimination between body pixels and edge pixels, in which an intermediate value is present, using a grey scale image. In the training method of the present invention, the middle layer is initially one leaf node which is connected to each output node. The contribution of each leaf node to the total output error is determined and the weights of the inputs to the leaf nodes are adjusted to minimize the error. The leaf node that has the best chance of improving the total output error is then "converted" into a branch node with two leaves. A branch node selected from a pool of trial branch nodes is used to replace the chosen leaf node. The trial branch nodes are then trained by gradient training to optimize the branch error function. From the set of trial branch nodes, the best performing node is selected and is substituted for the previously-selected leaf node. Two new leaf nodes are then created from the newly-substituted best-performing-branch node. A leaf node is accepted or rejected based upon the number of times it was activated relative to the correctness of the classification. Once a leaf node is rejected, it is eliminated from any further operation, thereby minimizing the size of the network. Integer mathematics can be generated within the network so that a separate floating point coprocessor is not required.
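
The quadratic nodes mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each input contributes a linear and a squared term (two weights per input, as in claim 3) and that the sum is passed through a sigmoid. The function name `quadratic_node`, the exact polynomial form, the absence of a bias term, and the example weights are illustrative assumptions, not details taken from the patent.

```python
import math


def quadratic_node(inputs, linear_w, quad_w):
    """Output of one hypothetical quadratic processing element.

    Assumed form: net = sum(a_i * x_i + b_i * x_i**2), output = sigmoid(net).
    """
    net = sum(a * x + b * x * x for x, a, b in zip(inputs, linear_w, quad_w))
    return 1.0 / (1.0 + math.exp(-net))


# Hypothetical weights for a single grey-scale pixel scaled to [0, 1]:
# 4x - 4x^2 is zero at x = 0 and x = 1 and peaks at x = 0.5.
dark = quadratic_node([0.0], [4.0], [-4.0])
mid = quadratic_node([0.5], [4.0], [-4.0])
bright = quadratic_node([1.0], [4.0], [-4.0])
```

With these assumed weights the node responds most strongly to the intermediate grey value and equally weakly to the two extremes. A node with a single linear weight per input is monotonic in that input and cannot produce such a response, which is one way to read the abstract's remark about separating edge pixels from body pixels.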

49 Citations

17 Claims

1. A network for classification of a plurality of patterns in unknown input data comprising:
- a plurality of processing elements, including a plurality of leaf nodes, each for receiving an input signal from a plurality of input nodes and for providing a plurality of output values therefrom to a plurality of output nodes, each processing element having at least one input weight associated with each input signal;
  
  supervision means for comparison of each of said plurality of output values to a known classification for a corresponding training example input signal and for generation of an error signal;
  
  adjustment means for determining changes in each input weight in response to said error signal from said supervision means;
  
  identification means for selecting a leaf node of said plurality which has the greatest potential to decrease said error signal, said identification means including an accumulation means and a comparison means, said accumulation means for receiving and counting for each of said leaf nodes an activation value comprising the number of times a given leaf node is activated in response to a plurality of training example input signals and said comparison means for comparing said activation value to a first preselected statistical value to test for accept/reject criteria; and
  
  a pool of trial branch nodes within said plurality of processing elements from which a best performing trial branch node is selected and used in place of said leaf node which has the greatest potential to decrease said error signal, said best performing trial branch node branching into two said leaf nodes connected to each of said plurality of output nodes;
  
  wherein said supervision means generates a continue training command when said plurality of output values fails to match said known classification and generates a stop training command when said plurality of output values matches said known classification.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. A network as in claim 1 wherein each said processing element has a plurality of element inputs and an element output and provides an element output value according to a threshold function applied to said plurality of input signals.
  - 3. A network as in claim 2 wherein said threshold function is a quadratic equation and each said processing element has two input weights.
  - 4. A network as in claim 2 wherein each said processing element comprises:
    - means, responsive to said supervision means, for multiplying each said input signal by a corresponding one of said at least one input weights to form weighted input signals;
      
      means, responsive to said weighted input signals for forming a sum of input signals; and
      
      means for thresholding said sum of input signals by a predetermined continuous threshold function to provide said output values.
  - 5. A network as in claim 4 wherein a range of input weights for each said processing element has a quantized value substantially equal to said range for all other processing elements and each said processing element is treated independently from all other processing elements so that conversion of floating point weights to integer weights is facilitated by rescaling said range for each said processing element.
  - 6. A network as in claim 1 wherein said comparison means includes means for comparing an error rate within said activation value against a correct rate within said activation value.
  - 7. A network as in claim 1 wherein said unknown input data comprises a detected signal generated by a predefined number of pixels wherein a value of each of said pixels is applied to each of said processing elements.
  - 8. A network as in claim 4 wherein said predetermined continuous function is a sigmoid function.
  - 9. A network as in claim 1 wherein a combination of said supervision means and said adjustment means uses gradient descent.
  - 10. A network as in claim 1 wherein a combination of said supervision means and said adjustment means is Quick-prop.
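
The accumulation and comparison means of claims 1 and 6 can be sketched in Python as a per-leaf counter plus a simple test. The claims only state that activations are counted and compared with a preselected statistical value, and that an error rate is compared with a correct rate. The class `LeafStats`, the threshold `min_activations`, the specific accept rule, and the "most errors" selection rule below are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LeafStats:
    """Running counts for one leaf node over the training examples."""
    activations: int = 0
    errors: int = 0

    def record(self, correct: bool) -> None:
        # Called once each time this leaf is activated by a training example.
        self.activations += 1
        if not correct:
            self.errors += 1

    @property
    def correct(self) -> int:
        return self.activations - self.errors


def is_accepted(stats: LeafStats, min_activations: int) -> bool:
    # Assumed rule: the leaf must fire often enough, and its correct
    # classifications must outnumber its errors.
    return stats.activations >= min_activations and stats.correct > stats.errors


def pick_split_candidate(leaves: dict, min_activations: int):
    # Assumed rule: among leaves that fire often enough, the one with the
    # most errors has the greatest potential to decrease the error signal.
    eligible = {k: s for k, s in leaves.items() if s.activations >= min_activations}
    if not eligible:
        return None
    return max(eligible, key=lambda k: eligible[k].errors)
```

Keeping the counters separate from the decision rules makes it easy to swap in a different statistical test without touching the bookkeeping.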

11. A method for training a network for classification of a plurality of patterns in unknown input data comprising:
- selecting a plurality of processing elements for receiving an input signal from a plurality of input nodes and for providing a plurality of output values to a plurality of output nodes with each processing element having at least one input weight associated with each input signal;
  
  performing gradient training on said plurality of output nodes to minimize output error;
  
  identifying a best leaf node within said plurality of processing elements which have the best chance of improving output error by accumulating the number of activations of each of said plurality of leaf nodes and comparing said number of activations to a preselected statistical value to test for compliance with an accept/reject criteria;
  
  selecting a trial branch node with the best performance from a pool of trial branch nodes which have been trained to minimize output error;
  
  substituting said trial branch node for said best leaf node; and
  
  creating two new leaf nodes from the outputs of said trial branch node and testing said two new leaf nodes.
- Dependent claims (12, 13, 14, 15, 16, 17)
  - 12. A method as in claim 11 wherein the step of performing gradient training comprises using gradient descent.
  - 13. A method as in claim 11 wherein the step of performing gradient training comprises using Quick-prop.
  - 14. A method as in claim 11 wherein the step of identifying said worst leaf node includes rendering a leaf node of said plurality inactive for further processing if it fails to comply with said accept/reject criteria.
  - 15. A method as in claim 11 wherein the step of selecting a plurality of processing elements includes selecting processing elements with two input weights associated with each signal.
  - 16. A method as in claim 15 wherein the step of selecting processing elements with two input weights includes selecting processing elements wherein a quadratic threshold function is performed on said input signal.
  - 17. A method as in claim 11 further comprising converting floating point weights for said input weights into integer weights for said input weights by rescaling a range of said input weights for each selected processing element independently from all other processing elements such that each said range has a quantized value substantially equal to said range for all other processing elements.
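
The per-element weight conversion described in claims 5 and 17 can be sketched in Python as follows. The sketch assumes each processing element's weights are rescaled independently so that its largest magnitude maps to the same integer limit. The function names, the signed 8-bit style limit of 127, and the choice to return the scale factor alongside the integers are assumptions for illustration; the claims do not specify them.

```python
def quantize_element(weights, max_int=127):
    """Rescale one element's float weights onto [-max_int, max_int].

    Returns (integer_weights, scale). Assumes `weights` is non-empty.
    """
    peak = max(abs(w) for w in weights)
    if peak == 0.0:
        # All-zero weights need no rescaling.
        return [0 for _ in weights], 1.0
    scale = max_int / peak
    return [round(w * scale) for w in weights], scale


def quantize_network(elements, max_int=127):
    # Each element is treated independently of all others, so a node with
    # small weights keeps as much integer resolution as one with large weights.
    return [quantize_element(w, max_int) for w in elements]


# Two hypothetical elements whose weight ranges differ by a factor of 40.
quantized = quantize_network([[0.5, -0.25, 0.125], [20.0, -5.0]])
```

After conversion both elements span the same integer range even though their floating point ranges were very different, which is the property the claims describe as each range having a substantially equal quantized value.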

Current Assignee: Duane D. Desieno
Original Assignee: Duane D. Desieno
Inventors: Desieno, Duane D.
Primary Examiner(s): Moore, David K.
Assistant Examiner(s): Cammarata, Michael
Application Number: US07/859,828
Time in Patent Office: 981 Days
Field of Search: 382/14, 382/15, 395/21, 395/22, 395/23, 395/24, 395/25, 395/75
US Class Current: 382/159
CPC Class Codes:
G06F 18/24323 Tree-organised classifiers
G06N 3/082 modifying the architecture,...
