Method and apparatus for discovering association rules

  • US 6,385,608 B1
  • Filed: 11/05/1998
  • Issued: 05/07/2002
  • Est. Priority Date: 11/11/1997
  • Status: Expired due to Fees
First Claim

1. A method for discovering an association rule existing between itemsets composed of one or more items, from a database storing a plurality of records composed of one or more items, where k is an integer greater than or equal to 2, and n indicates an integer from 1 to k, when an itemset is composed of n items and a frequency, meaning a number of records including the n items, of the itemset has not been checked, the itemset is defined as a candidate-itemset Cn, and when a frequency of the candidate-itemset Cn is greater than or equal to a lower limit value Smin, the candidate-itemset Cn is defined as a large-itemset Ln, the method comprising the steps of:

  • (a) generating a large-itemset, this step including the steps of;

    (a1) generating a large-itemset L1 by counting a frequency, meaning a number of records including each item, and defining an itemset composed of each item having a frequency greater than or equal to the lower limit value Smin as the large-itemset L1;

    (a2) generating a candidate-itemset Ck by using a large-itemset Lk−1 and the large-itemset L1; and

    (a3) generating a large-itemset Lk by selecting the large-itemset Lk from the candidate-itemset Ck; and

  • (b) generating and testing a hypothesis, this step including the steps of;

    (b1) generating a candidate association rule by using the large-itemset Lk−1 and the large-itemset L1, where the large-itemset Lk−1 is defined as a condition itemset called a left hand side (LHS) and the large-itemset L1 is defined as a conclusion itemset called a right hand side (RHS); and

    (b2) testing whether or not the candidate association rule is to be applied as the association rule; and

  • (c) inputting at least a significance level used for a chi-square test, and the step of testing the rule includes processes of calculating χ² statistics (chi-squared statistics) based on a frequency of LHS, a frequency of RHS, a frequency of BS (both right and left hand sides) and a total number of records, and performing the chi-square test in which the χ² statistics and the significance level are used as parameters.
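
The steps of claim 1 can be illustrated with a short Python sketch. It is an illustration only, not the patented implementation: the claim does not give a formula for the χ² statistic, so the code assumes the standard Pearson statistic for a 2×2 contingency table with one degree of freedom, and it assumes a rule is accepted when the resulting p-value falls below the significance level. All function names (`large_1_itemsets`, `next_large_itemsets`, `chi_square`, `discover_rules`) are hypothetical, as is the choice to skip candidate rules whose combined itemset is not itself in Lk.

```python
import math
from collections import Counter


def large_1_itemsets(records, s_min):
    """Step (a1): count each item and keep those with frequency >= s_min."""
    counts = Counter(item for rec in records for item in rec)
    return {frozenset([item]): c for item, c in counts.items() if c >= s_min}


def next_large_itemsets(l_prev, l1, records, s_min):
    """Steps (a2)/(a3): build Ck from L(k-1) and L1, then select Lk."""
    candidates = {prev | single for prev in l_prev for single in l1
                  if not single <= prev}
    large = {}
    for cand in candidates:
        freq = sum(1 for rec in records if cand <= rec)
        if freq >= s_min:
            large[cand] = freq
    return large


def chi_square(freq_lhs, freq_rhs, freq_both, n):
    """Pearson chi-square for the 2x2 table of LHS vs. RHS (assumed form)."""
    denom = freq_lhs * (n - freq_lhs) * freq_rhs * (n - freq_rhs)
    if denom == 0:
        return 0.0  # degenerate table: an itemset occurs in all or no records
    return n * (n * freq_both - freq_lhs * freq_rhs) ** 2 / denom


def discover_rules(records, s_min, alpha, k):
    """Steps (a)-(c): return (LHS, RHS) pairs that pass the chi-square test."""
    records = [frozenset(r) for r in records]
    n = len(records)
    l1 = large_1_itemsets(records, s_min)
    levels = {1: l1}
    for size in range(2, k + 1):
        levels[size] = next_large_itemsets(levels[size - 1], l1, records, s_min)

    rules = []
    for size in range(2, k + 1):
        for lhs, f_lhs in levels[size - 1].items():      # condition itemset
            for rhs, f_rhs in l1.items():                # conclusion itemset
                if rhs <= lhs:
                    continue
                # Assumption: only consider rules whose union is in Lk.
                f_both = levels[size].get(lhs | rhs)
                if f_both is None:
                    continue
                stat = chi_square(f_lhs, f_rhs, f_both, n)
                # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
                p_value = math.erfc(math.sqrt(stat / 2))
                if p_value < alpha:
                    rules.append((lhs, rhs))
    return rules


# Hypothetical example data: 12 records over three items.
example_records = (
    [{"bread", "butter"}] * 5 + [{"bread"}, {"butter"}] + [{"milk"}] * 5
)
example_rules = discover_rules(example_records, s_min=3, alpha=0.05, k=2)
```

With this example data, the frequencies are 6 for bread, 6 for butter, 5 for both, and 12 records in total, which gives a χ² statistic of about 5.33. That exceeds the 5% critical value for one degree of freedom (3.841) but not the 1% value (6.635), so the two rules between bread and butter are accepted at a significance level of 0.05 and rejected at 0.01.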
