Lightweight rule induction

US 6,523,020 B1
Filed: 03/22/2000
Issued: 02/18/2003
Est. Priority Date: 03/22/2000
Status: Expired due to Fees

First Claim

Patent Images

1. A method for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, said method comprising the steps of:

(a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein

View all claims

1 Assignment

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

A lightweight rule induction method is described that generates compact Disjunctive Normal Form (DNF) rules. Each class may have an equal number of unweighted rules. A new example is classified by applying all rules and assigning the example to the class with the most satisfied rules. The induction method attempts to minimize the training error with no pruning. An overall design is specified by setting limits on the size and number of rules. During training, cases are adaptively weighted using a simple cumulative error method. The induction method is nearly linear in time relative to an increase in the number of induced rules or the number of cases. Experimental results on large benchmark datasets demonstrate that predictive performance can rival the best reported results in the literature.

39 Citations

View as Search Results

15 Claims

1. A method for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, said method comprising the steps of:
- (a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein
- View Dependent Claims (2, 3, 4, 5)
- - 2. A method as recited in claim 1, wherein rules may be induced from data with missing values by skipping a case i having a missing value, further comprising the steps of:
    - normalizing error Norm_k, where normalization for variable k is ${Norm}_{k} = \frac{\sum_{n, all} frq (n)}{\sum_{i, w / o missing vals} frq (i)}, {FP}_{k} = {Norm}_{k} \cdot \sum_{i} FP (i) \cdot frq (i), and$ ${FN}_{k} = {Norm}_{k} \cdot \sum_{i} FN (i) \cdot frq (i),$ where frq(n) represents a relative frequency of n cases of all n cases, and frq(i) represents a relative frequency of all i cases without missing values; and
      
      applying a rule to a new case, where the rule is satisfied if any single disjunct is satisfied, but a single disjunct is not satisfied when applied to a missing value.
  - 3. A method as recited in claim 1, wherein the relative frequencyfrq(i) is {1=e(i)³}, and e(i) is a cumulative number of errors for case i.
  - 4. A method as recited in claim 1, further comprising the step of pre-processing each example from a complete sample by sorting each variable to produce a paired ordered list of the form {r_i,j}, where r_iis the i-th smallest real value and j is the j-th case, thereby doubling the size of the original sample data and preparing data in sorted order to find a best threshold for testing a variable.
  - 5. A method as recite in claim 1, wherein two key measures of rule size are considered:
    - (a) conjunctive term length and (b) number of disjuncts.

6. A computer readable medium containing code for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, the code, when executed on a computer, implementing the steps of:
- (a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein
- View Dependent Claims (7, 8, 9, 10)
- - 7. A computer readable medium as recited in claim 6, wherein rules may be induced from data with missing values by skipping a case i having a missing value, the code further implementing the steps of:
    - normalizing error Norm_k, where normalization for variable k is ${Norm}_{k} = \frac{\sum_{n, all} frq (n)}{\sum_{i, w / o missing vals} frq (i)}, {FP}_{k} = {Norm}_{k} \cdot \sum_{i} FP (i) \cdot frq (i), and$ ${FN}_{k} = {Norm}_{k} \cdot \sum_{i} FN (i) \cdot frq (i),$ where frq(n) represents a relative frequency of n cases of all n cases, and frq(i) represents a relative frequency of all i cases without missing values; and
      
      applying a rule to a new case, where the rule is satisfied if any single disjunct is satisfied, but a single disjunct is not satisfied when applied to a missing value.
  - 8. A computer readable medium as recited in claim 6, wherein the relative frequency frq(i) is {1+e(i)³}, and e(i) is a cumulative number of errors for case i.
  - 9. A computer readable medium as recited in claim 6, the code further implementing the step of pre-processing each example from a complete sample by sorting each variable to produce a paired ordered list of the form {r_i,j}, where r_iis the i-th smallest real value and j is thej-th case, thereby doubling the size of the original sample data and preparing data in sorted order to find a best threshold for testing a variable.
  - 10. A computer readable medium as recited in claim 6, wherein two key measures of rule size are considered:
    - (a) conjunctive term length and (b) number of disjuncts.

11. A system for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, comprising:
- a computing device having memory for storing cases and at least one computer code section;
  
  means for inputting a set of cases for developing decision rules;
  
  a first code section in said memory implementing the steps for (a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein
  
  Err1=FP+k·
  
  FN {where k=1,2,4 . . . and TP>
  
  0},FP indicates a number of false positives and FN indicates a number of false negatives for a case, and TP indicates true positives, $FP = \sum_{i} FP (i) \cdot frq (i), FN = \sum_{i} FN (i) \cdot frq (i),$ frq(i) represents a relative frequency of case i, FP(i) is 1 for a false positive in case i and 0 otherwise, and FN(i) is 1 for a false negative in case i and 0 otherwise;
  
  (b) recording T, in said memory, as a next disjunct for rule R, thereby generating a part of induced rule R;
  
  (c) removing cases, from said memory, covered by T, if a quantity of disjuncts generated is less than the specified maximum number of disjuncts and FN is greater than 0 and repeating steps (a)-(c);
  
  (d) evaluating the induced rule R on all training cases i; and
  
  (e) updating cumulative error counts e(i) for case i, wherein steps (a) through (e) are repeated for any desired number of rules; and
  
  means for outputting the induced Rule R recorded in the recording step (b) implemented by the first code section.
- View Dependent Claims (12, 13, 14, 15)
- - 12. A system as recited in claim 11, wherein rules may be induced from data with missing values by skipping a case i having a missing value, further comprising a second code section implementing the steps of:
13. A system as recited in claim 11, wherein the relative frequency frq(i) is {1+e(i)³}, and e(i) is a cumulative number of errors for case i.
14. A system recited in claim 11, further comprising a second code section implementing the step of pre-processing each example from a complete sample by sorting each variable to produce a paired ordered list of the form {r_i,j}, where r_iis the i-th smallest real value and j is the j-th case, thereby doubling the size of the original sample data and preparing data in sorted order to find a best threshold for testing a variable.
15. A system as recited in claim 11, wherein two key measures of rule size are considered:
- (a) conjunctive term length and (b) number of disjuncts.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
International Business Machines Corporation
Original Assignee
International Business Machines Corporation
Inventors
Weiss, Sholom
Primary Examiner(s)
Black, Thomas
Assistant Examiner(s)
Booker, Kelvin

Application Number

US09/533,323
Time in Patent Office

1,063 Days
Field of Search

706/47, 707/2, 707/3, 707/5
US Class Current

706/47
CPC Class Codes

G06N 5/025 Extracting rules from data

Y10S 707/99935 Query augmenting and refini...

Lightweight rule induction

First Claim

1 Assignment

0 Petitions

Accused Products

Abstract

39 Citations

15 Claims

Specification

Solutions

Use Cases

Quick Links

Lightweight rule induction

First Claim

1 Assignment

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

39 Citations

15 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links