×

Method and apparatus for finding maximal frequent itemsets over data streams

  • US 8,150,873 B2
  • Filed: 10/27/2008
  • Issued: 04/03/2012
  • Est. Priority Date: 10/26/2007
  • Status: Active Grant
First Claim
Patent Images

1. A method for finding maximal frequent itemsets over data streams including continuously generated transactions, the method comprising:

  • when itemsets included in previously generated transactions and appearance frequencies of the itemsets are managed by a prefix tree and each of nodes of the prefix tree has information about an appearance frequency of an itemset corresponding to each node, a maximum lifetime which is a maximum point in time that allows the corresponding itemset to remain in a frequent state even if the corresponding itemset does not appear later, and a mark indicating whether the corresponding itemset is a maximal frequent itemset,(a) receiving transaction Tk generated at a current point in time;

    (b) updating the information owned by each node corresponding to the itemset of the transaction Tk among the nodes of the prefix tree;

    (c) adding each node, which is not managed in the prefix tree among nodes corresponding to the itemset of the transaction Tk, to the prefix tree, and setting the information on the added nodes; and

    (d) finding maximal frequent itemsets by visiting each node of the prefix tree which has the mark indicating the maximal frequent itemset, and checking whether the corresponding itemset is frequent;

    wherein the step (b) allows the mark owned by the node corresponding to the maximal frequent itemset, which belongs to subsets of the itemset of the transaction Tk, among the nodes of the prefix tree, to indicate the maximal frequent itemset only when a maximum error generated in a process of estimating a support of the maximal frequent itemset is within a maximum error threshold, andwherein the node corresponding to the maximal frequent itemset is a node whose maximum lifetime is updated from before the current point in time k to after the current point in time k or a node not having children nodes corresponding to the itemset of the transaction Tk, whose support according to the appearance frequency is equal to or more than a minimum support having a predetermined value.

View all claims
  • 2 Assignments
Timeline View
Assignment View
    ×
    ×