Fuzzy expert system for interpretable rule extraction from neural networks
Abstract
A method and apparatus for extracting an interpretable, meaningful, and concise rule set from a neural network is presented. The method involves adjustment of the gain parameter λ and the threshold Tj of the sigmoid activation function of the interactive-or operator used in the extraction of a rule set from an artificial neural network. A multi-stage procedure involving coarse and fine adjustment is used to constrain the range of the antecedents of the extracted rules to the range of values of the inputs to the artificial neural network. Furthermore, the consequents of the extracted rules are expressed as degrees of membership, so that they are easily understandable by human beings. The method disclosed may be applied to any pattern recognition task, and is particularly useful in applications such as vehicle occupant sensing and recognition, object recognition, gesture recognition, and facial pattern recognition, among others.
19 Claims
-
1. A method for interpretable rule extraction from neural networks comprising the steps of:
-
a. providing a neural network having a latent variable space and an error rate, said neural network further including a sigmoid activation function having an adjustable gain parameter λ;
b. iteratively adjusting the adjustable gain parameter λ to minimize the error rate of the neural network, producing an estimated minimum gain parameter value λest;
c. using the estimated minimum gain parameter value λest and a set of training data to train the neural network; and
d. projecting the training data onto the latent variable space to generate output clusters having cluster membership levels and cluster centers, with said cluster membership levels being determined as a function of proximity with respect to said cluster centers.
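Step d assigns membership by proximity to cluster centers but gives no formula. A common choice for membership that decreases with distance to each center is the normalized inverse-square-distance form used in fuzzy c-means, shown here purely as an illustrative assumption, not the patent's specific formula:

```python
def membership_levels(point, centers, eps=1e-12):
    """Fuzzy membership of a projected point in each output cluster:
    inversely proportional to squared distance to that cluster's center,
    normalized so the levels sum to 1 (fuzzy c-means style, m = 2)."""
    d2 = [sum((p - c) ** 2 for p, c in zip(point, center)) + eps
          for center in centers]
    inv = [1.0 / d for d in d2]
    total = sum(inv)
    return [v / total for v in inv]
```

A point equidistant from two centers receives equal membership in both clusters; moving it toward one center raises that cluster's membership level.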
2. A method for interpretable rule extraction from neural networks as set forth in claim 1, wherein:
a. the neural network provided in step a of claim 1 further includes a plurality of inputs and an output, and wherein the latent variable space of the neural network further includes at least one latent variable node having an activation point;
b. the iterative adjustment of the adjustable gain parameter λ in step b of claim 1 is further defined by the sub-steps of:
i. providing a validation data set;
ii. setting an initial gain parameter value λinit, a current gain parameter value λcurr, a final gain parameter value λfinal, a gain incrementing value Δλ, and an estimated minimum gain parameter value λest;
iii. setting the current gain parameter value λcurr equal to the initial gain parameter value λinit;
iv. setting the estimated minimum gain parameter value λest equal to the initial gain parameter value λinit;
v. training the neural network using the current gain parameter value λcurr to provide a trained neural network;
vi. inputting the validation data set into the trained neural network to generate an output data set;
vii. comparing the output data set generated by the trained neural network to the validation data set to determine the prediction error rate of the trained neural network;
viii. resetting the current gain parameter value λcurr equal to the current gain parameter value λcurr plus the gain incrementing value Δλ;
ix. after each repetition of steps v through ix, setting the estimated minimum gain parameter value λest equal to whichever of the current value of the estimated minimum gain parameter value λest and the current gain parameter value λcurr generated a lesser prediction error rate; and
x. repeating steps v through ix of the present claim until the current gain parameter value λcurr is equal to the final gain parameter value λfinal; and
c. the estimated minimum gain parameter value λest used to train the neural network is the estimated minimum gain parameter value λest resulting after sub-step ix of step b of the present claim; and
d. the projecting of the training data onto the latent variable space of step d of claim 1 is performed to set the activation points of the latent variable nodes to generate output clusters having cluster membership levels and cluster centers, with said cluster membership levels being determined as a function of proximity with respect to said cluster centers.
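The sweep in sub-steps v through x amounts to a one-dimensional grid search over λ. A minimal sketch, assuming hypothetical `train_model` and `validation_error` callables that stand in for the unspecified training and validation-comparison steps:

```python
def coarse_gain_search(train_model, validation_error,
                      lam_init, lam_final, d_lam):
    """Sweep the gain λ from λinit to λfinal in steps of Δλ,
    keeping whichever value yields the lowest validation error."""
    lam_est, best_err = lam_init, float("inf")
    lam_curr = lam_init
    while True:
        model = train_model(lam_curr)      # sub-step v: train with current λ
        err = validation_error(model)      # sub-steps vi-vii: error rate
        if err < best_err:                 # sub-step ix: keep the better λ
            lam_est, best_err = lam_curr, err
        if lam_curr >= lam_final:          # sub-step x: stop at λfinal
            break
        lam_curr += d_lam                  # sub-step viii: increment λ
    return lam_est
```

Any stand-ins with the same signatures work; the search itself depends only on the returned error rate.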
-
3. A method for interpretable rule extraction from neural networks as set forth in claim 2, further including the step of fine-tuning the adjustable gain parameter λ by performing, after step b of claim 2, at least one repetition of the sub-steps of:
i. setting the initial gain parameter value λinit equal to the estimated minimum gain parameter value λest minus the gain incrementing value Δλ from step b;
ii. setting the final gain parameter value λfinal equal to the estimated minimum gain parameter value λest plus the gain incrementing value Δλ from step b;
iii. generating a new gain incrementing value Δλ, with the new gain incrementing value Δλ being smaller than the previous gain incrementing value Δλ;
iv. setting the current gain parameter value λcurr equal to the initial gain parameter value λinit; and
v. repeating sub-steps iv through ix of step b of claim 2;
vi. using the value of the estimated minimum gain parameter value λest resulting from the step of fine-tuning the adjustable gain parameter λ in step c of claim 1 for training the neural network.
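The fine-tuning loop shrinks the search window to [λest - Δλ, λest + Δλ] around the current best value and re-sweeps it with a smaller increment. A self-contained sketch, again assuming hypothetical `train_model` and `validation_error` callables:

```python
def fine_tune_gain(train_model, validation_error,
                   lam_est, d_lam, repetitions=3, shrink=0.5):
    """Repeatedly narrow the window around the best λest and re-sweep
    with a smaller Δλ; repetitions and shrink factor are assumptions."""
    for _ in range(repetitions):
        lam_init = lam_est - d_lam          # sub-step i
        lam_final = lam_est + d_lam         # sub-step ii
        d_lam *= shrink                     # sub-step iii: smaller Δλ
        best_err = float("inf")
        lam_curr = lam_init                 # sub-step iv
        while lam_curr <= lam_final:        # sub-step v: repeat the sweep
            err = validation_error(train_model(lam_curr))
            if err < best_err:
                lam_est, best_err = lam_curr, err
            lam_curr += d_lam
    return lam_est
```

Each repetition halves the step size here, so the estimate converges geometrically toward a local minimum of the validation error.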
-
4. A method for interpretable rule extraction from neural networks as set forth in claim 1, wherein the neural network provided in step a of claim 1 further includes a plurality i of input nodes Xi for receiving inputs having a plurality N of input features and a plurality j of hidden layer nodes Hj, with each of the plurality j of hidden layer nodes Hj corresponding to one of a plurality j of rules, with each one of the plurality j of rules including a plurality of antecedents A, and the sigmoid activation function f(x) is of the form:
-
where λ represents the adjustable gain parameter; and
Wij represents the weight between the plurality i of input nodes Xi and the plurality j of hidden layer nodes Hj; and
where each of the plurality of antecedents A of each one of the plurality j of rules is of the form:
where N represents the input features of the inputs i;
λest represents the estimated minimum gain parameter value; and
Wij represents the weight between the plurality i of input nodes Xi and the plurality j of hidden layer nodes Hj.
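The equation images for f(x) and the antecedents did not survive extraction and are left as placeholders above. The standard sigmoid with an adjustable gain, as commonly used for the interactive-or operator, is sketched below as an assumption about the general form, not a reproduction of the patent's exact equation:

```python
import math

def sigmoid_activation(x_inputs, weights, lam):
    """Sigmoid activation for a hidden node Hj with adjustable gain λ,
    assumed standard form: f(x) = 1 / (1 + exp(-λ · Σi Wij·Xi)).
    A larger λ sharpens the transition; a smaller λ softens it."""
    net = sum(w * x for w, x in zip(weights, x_inputs))
    return 1.0 / (1.0 + math.exp(-lam * net))
```

The gain λ controls how crisp or fuzzy the extracted antecedents are, which is why the claims search for the λ value minimizing validation error.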
-
5. A method for interpretable rule extraction from neural networks as set forth in claim 1, wherein the clusters and cluster membership levels generated in step d of claim 1 are provided with linguistic labels.
-
6. A method for interpretable rule extraction from neural networks as set forth in claim 3, wherein:
-
a. the sigmoid activation function of the neural network provided in step a of claim 1 further includes an adjustable bias threshold Tj;
b. between steps a and c of claim 1 is included the additional step of iteratively adjusting the adjustable bias threshold Tj to minimize the error rate of the neural network, producing an estimated minimum bias threshold Tj,est; and
c. the estimated minimum bias threshold Tj,est is used along with the estimated minimum gain parameter value λest in step c of claim 1 to train the neural network.
-
7. A method for interpretable rule extraction from neural networks as set forth in claim 6, wherein the clusters and cluster membership levels generated in step d of claim 1 are provided with linguistic labels.
-
8. A method for interpretable rule extraction from neural networks as set forth in claim 6, wherein step b of claim 6 is further defined by the steps of:
-
a. adjusting the adjustable bias threshold Tj by the sub-steps of:
i. setting an initial bias threshold value Tj,init, a current bias parameter value Tj,curr, a final bias parameter value Tj,final, a bias incrementing value ΔTj, and an estimated minimum bias parameter value Tj,est;
ii. setting the current bias parameter value Tj,curr equal to the initial bias threshold value Tj,init;
iii. setting the estimated minimum bias parameter value Tj,est equal to the initial bias threshold value Tj,init;
iv. training the neural network using the current bias parameter value Tj,curr to provide a trained neural network;
v. inputting the validation data set into the trained neural network to generate an output data set;
vi. comparing the output data set generated by the trained neural network to the validation data set to determine the prediction error rate of the trained neural network;
vii. resetting the current bias parameter value Tj,curr equal to the current bias parameter value Tj,curr plus the bias incrementing value ΔTj;
viii. after each repetition of sub-steps v through vii of step a of the present claim, setting the estimated minimum bias parameter value Tj,est equal to whichever of the current value of the estimated minimum bias parameter value Tj,est and the current bias parameter value Tj,curr generated a lesser prediction error rate; and
ix. repeating sub-steps iv through viii of the present claim until the current bias parameter value Tj,curr is equal to the final bias parameter value Tj,final; and
b. the estimated minimum bias threshold Tj,est used along with the estimated minimum gain parameter value λest in step c of claim 1 to train the neural network is that from sub-step viii of the present claim.
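Claim 8 applies the same sweep to the bias threshold Tj, this time with the gain fixed at λest. A self-contained sketch, again assuming hypothetical `train_model` and `validation_error` callables for sub-steps iv through vi:

```python
def coarse_bias_search(train_model, validation_error,
                       lam_est, t_init, t_final, d_t):
    """Sweep the bias threshold Tj from Tj,init to Tj,final in steps of
    ΔTj, with the gain held fixed at λest, keeping the Tj value that
    yields the lowest validation error."""
    t_est, best_err = t_init, float("inf")
    t_curr = t_init
    while t_curr <= t_final:
        model = train_model(lam_est, t_curr)   # sub-step iv
        err = validation_error(model)          # sub-steps v-vi
        if err < best_err:                     # sub-step viii
            t_est, best_err = t_curr, err
        t_curr += d_t                          # sub-step vii
    return t_est
```

The two sweeps together perform a coordinate-wise search over (λ, Tj) rather than a joint two-dimensional grid search.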
-
9. A method for interpretable rule extraction from neural networks as set forth in claim 8, further including the step of fine-tuning the adjustable bias threshold Tj by performing, after step a of claim 8, at least one repetition of the sub-steps of:
-
a. setting the initial bias threshold value Tj,init equal to the estimated minimum bias parameter value Tj,est minus the bias incrementing value ΔTj from step a of claim 8;
b. setting the final bias parameter value Tj,final equal to the estimated minimum bias parameter value Tj,est plus the bias incrementing value ΔTj from step a of claim 8;
c. generating a new bias incrementing value ΔTj, with the new bias incrementing value ΔTj being smaller than the previous bias incrementing value ΔTj;
d. setting the current bias parameter value Tj,curr equal to the initial bias threshold value Tj,init; and
e. repeating sub-steps iv through viii of step a of claim 8;
f. using the value of the estimated minimum bias parameter value Tj,est from step a of claim 8 along with the estimated minimum gain parameter value λest developed in step c of claim 1 to train the neural network provided in step a of claim 1.
-
10. A method for interpretable rule extraction from neural networks as set forth in claim 8, wherein the neural network provided in step a of claim 1 further includes a plurality i of input nodes Xi and a plurality j of hidden layer nodes Hj, with each of the plurality j of hidden layer nodes Hj corresponding to one of a plurality j of rules, with each one of the plurality j of rules including a plurality of antecedents A, and the sigmoid activation function f(x) is of the form:
-
where λ represents the adjustable gain parameter;
Wij represents the weight between the plurality i of input nodes Xi and the plurality j of hidden layer nodes Hj; and
where Tj represents the adjustable bias threshold; and
where each of the plurality of antecedents A of each rule is of the form:
where Tj,est represents the estimated minimum bias threshold;
N represents the input features of the inputs;
λest represents the estimated minimum gain parameter value; and
Wij represents the weight between the plurality i of input nodes Xi and the plurality j of hidden layer nodes Hj.
-
11. A method for interpretable rule extraction from neural networks as set forth in claim 10, wherein the output clusters and cluster membership levels generated in step d of claim 1 are provided with linguistic labels.
-
12. A fuzzy rule set developed by the method of claim 1.
-
13. A fuzzy rule set developed by the method of claim 5.
-
14. A fuzzy rule set developed by the method of claim 6.
-
15. A fuzzy rule set developed by the method of claim 7.
-
16. An apparatus for interpretable rule extraction from neural networks comprising:
-
a. a neural network having a latent variable space and an error rate, said neural network further including a sigmoid activation function having an adjustable gain parameter λ, with the gain parameter λ iteratively adjusted to minimize the error rate of the neural network and to produce an estimated minimum gain parameter value λest;
b. a set of training data used, along with the estimated minimum gain parameter value λest, to train the neural network; and
c. output clusters generated by projection of the training data set onto the latent variable space of the neural network, each of said output clusters having cluster membership levels and cluster centers, with the cluster membership levels determined as a function of proximity with respect to the cluster centers.
-
Specification