Lightweight rule induction
First Claim
1. A method for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, said method comprising the steps of:
- (a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein
1 Assignment
0 Petitions
Accused Products
Abstract
A lightweight rule induction method is described that generates compact Disjunctive Normal Form (DNF) rules. Each class may have an equal number of unweighted rules. A new example is classified by applying all rules and assigning the example to the class with the most satisfied rules. The induction method attempts to minimize the training error with no pruning. An overall design is specified by setting limits on the size and number of rules. During training, cases are adaptively weighted using a simple cumulative error method. The induction method is nearly linear in time relative to an increase in the number of induced rules or the number of cases. Experimental results on large benchmark datasets demonstrate that predictive performance can rival the best reported results in the literature.
39 Citations
15 Claims
-
1. A method for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, said method comprising the steps of:
(a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein - View Dependent Claims (2, 3, 4, 5)
-
6. A computer readable medium containing code for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, the code, when executed on a computer, implementing the steps of:
(a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein - View Dependent Claims (7, 8, 9, 10)
-
11. A system for generating decision rules in disjunctive normal form (DNF) by examining stored examples, also called cases, where a new example is classified by applying all rules and assigning the example to the class with the most satisfied rules, using a specified maximum length of a rule, maximum number of disjuncts and number of rules per class, comprising:
-
a computing device having memory for storing cases and at least one computer code section;
means for inputting a set of cases for developing decision rules;
a first code section in said memory implementing the steps for (a) growing a conjunctive term T until the maximum length of a rule is reached or until false negatives (FN) is equal to zero (0) by greedily adding conditions that minimize Err1, wherein
Err1=FP+k·
FN {where k=1,2,4 . . . and TP>
0},FP indicates a number of false positives and FN indicates a number of false negatives for a case, and TP indicates true positives, frq(i) represents a relative frequency of case i, FP(i) is 1 for a false positive in case i and 0 otherwise, and FN(i) is 1 for a false negative in case i and 0 otherwise;
(b) recording T, in said memory, as a next disjunct for rule R, thereby generating a part of induced rule R;
(c) removing cases, from said memory, covered by T, if a quantity of disjuncts generated is less than the specified maximum number of disjuncts and FN is greater than 0 and repeating steps (a)-(c);
(d) evaluating the induced rule R on all training cases i; and
(e) updating cumulative error counts e(i) for case i, wherein steps (a) through (e) are repeated for any desired number of rules; and
means for outputting the induced Rule R recorded in the recording step(b) implemented by the first code section. - View Dependent Claims (12, 13, 14, 15)
normalizing error Normk, where normalization for variable k is where frq(n) represents a relative frequency of n cases of all n cases, and frq(i) represents a relative frequency of all i cases without missing values; and
applying a rule to a new case, where the rule is satisfied if any single disjunct is satisfied, but a single disjunct is not satisfied when applied to a missing value.
-
-
13. A system as recited in claim 11, wherein the relative frequency frq(i) is {1+e(i)3}, and e(i) is a cumulative number of errors for case i.
-
14. A system recited in claim 11, further comprising a second code section implementing the step of pre-processing each example from a complete sample by sorting each variable to produce a paired ordered list of the form {ri,j}, where ri is the i-th smallest real value and j is the j-th case, thereby doubling the size of the original sample data and preparing data in sorted order to find a best threshold for testing a variable.
-
15. A system as recited in claim 11, wherein two key measures of rule size are considered:
- (a) conjunctive term length and (b) number of disjuncts.
Specification