System and method for discovering calendric association rules
Abstract
A system and method for determining calendric association rules are provided. The method uses calendars to describe the variation of association rules over time, where a specific calendar is defined as a collection of time intervals describing some phenomenon. In accordance with the invention, there is provided a method for identifying calendric association rules in transactional data with time-stamped data items. In one exemplary embodiment, the method identifies large itemsets in each time unit, where a large itemset is an itemset that occurs in the transactions more than a given threshold. The method then identifies association rules of the form X→Y from the large itemsets by determining if a requisite support for the itemset XY and a given confidence threshold (ratio of (support of XY)/(support of X)) has been satisfied. Calendric association rules are then generated by examining identified association rules to determine which ones exhibit the temporal patterns specified by given calendars. In another embodiment, the method identifies large itemsets in each time unit, where an itemset includes at least one item type. The method then identifies calendars that belong to the large itemsets. Potential calendars for increasingly larger item type itemsets are generated by using previously identified calendars. Support values are calculated to determine which potential calendars actually belong to the itemsets and this is then used to determine what potential calendar association rules exist. The potential calendar association rule information and support values are used to determine which potential calendars actually belong to association rules.
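The first embodiment summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a rule "exhibits the temporal pattern" of a calendar when it holds in every interval of that calendar, that support is measured as a fraction of the transactions in an interval, and that itemsets are enumerated by brute force up to size 3. All function names, the interval labels, and the sample data are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    return sum(itemset <= t for t in transactions) / len(transactions)

def rules_in_interval(transactions, min_support, min_confidence):
    """Return the set of rules (X, Y) that hold within one time interval."""
    items = sorted(set().union(*transactions)) if transactions else []
    rules = set()
    # Assumption: brute-force enumeration of itemsets of size 2 and 3 (toy scale only).
    for size in (2, 3):
        for combo in combinations(items, size):
            xy = frozenset(combo)
            s_xy = support(xy, transactions)
            if s_xy == 0 or s_xy < min_support:
                continue  # XY is not a large itemset in this interval
            for k in range(1, size):
                for x_items in combinations(combo, k):
                    x = frozenset(x_items)
                    # confidence = support(XY) / support(X)
                    if s_xy / support(x, transactions) >= min_confidence:
                        rules.add((x, xy - x))
    return rules

def calendric_rules(by_interval, calendar, min_support, min_confidence):
    """Assumption: a rule is calendric if it holds in every interval of `calendar`."""
    per_interval = [
        rules_in_interval(by_interval.get(i, []), min_support, min_confidence)
        for i in calendar
    ]
    return set.intersection(*per_interval) if per_interval else set()

# Hypothetical time-stamped transactions, grouped by day.
by_interval = {
    "mon": [{"coffee", "donut"}, {"coffee", "donut"}, {"coffee"}],
    "tue": [{"coffee", "donut"}, {"tea"}],
    "wed": [{"tea"}, {"coffee"}],
}
early_week = calendric_rules(by_interval, ["mon", "tue"], 0.4, 0.6)
all_week = calendric_rules(by_interval, ["mon", "tue", "wed"], 0.4, 0.6)
```

With this data, the rules coffee→donut and donut→coffee hold on both "mon" and "tue", so they belong to the two-day calendar, while no rule survives once "wed" is added.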
37 Citations
25 Claims
1. A method for identifying calendric association rules in transactions with time-stamped data items, comprising the steps of:
identifying a plurality of time intervals as constituting a calendar of interest;
identifying large itemsets in each time interval, where a large itemset is an itemset that occurs in the transactions at least as frequently as a preselected support threshold;
identifying association rules from said large itemsets by determining if said preselected support and a confidence threshold has been satisfied; and
generating calendric association rules by examining identified association rules to determine which ones exhibit temporal patterns as specified by a given calendar.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10)
using said large itemsets and said preselected support to ascertain which calendars actually belong to said large itemsets;
using ascertained calendars to determine what potential calendar association rules exist; and
using potential calendar association rule information and computed support values to determine which of said calendars actually belong to association rules.
11. A method for identifying calendric association rules in transactions with time-stamped data items, comprising the steps of:
identifying a plurality of time intervals as constituting a calendar of interest;
identifying large itemsets in each time interval, where a large itemset is an itemset that occurs in the transactions at least as frequently as a preselected support threshold and an itemset includes at least one item type;
identifying calendars that belong to said large itemsets;
generating potential calendars for itemsets including additional item types by using previously identified itemsets and their calendars;
computing support for said itemsets in each time unit to determine which of said potential calendars actually belong to said itemsets;
using said itemsets and associated potential calendars to determine what potential calendric association rules exist; and
using potential calendar association rule information and computed support values to determine which of said potential calendars actually belong to association rules.
(Dependent claims: 12, 13, 14, 15)
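The calendar-propagation steps recited above can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method itself: a calendar is taken to "belong to" an itemset when the itemset is large in every interval of the calendar, and the potential calendars of a (k+1)-item itemset are taken to be the calendars shared by all of its k-item subsets, which are then confirmed by computing support. The function names, calendar names, and data are hypothetical.

```python
from itertools import combinations

def large_intervals(itemset, by_interval, min_support):
    """Intervals in which `itemset` meets the support threshold."""
    out = set()
    for interval, txns in by_interval.items():
        if txns and sum(itemset <= t for t in txns) / len(txns) >= min_support:
            out.add(interval)
    return out

def calendars_of(itemset, candidates, calendars, by_interval, min_support):
    """Candidate calendar names whose intervals are all large for `itemset`."""
    large = large_intervals(itemset, by_interval, min_support)
    return {name for name in candidates if set(calendars[name]) <= large}

def grow(level, calendars, by_interval, min_support):
    """From {k-itemset: calendar names} derive {(k+1)-itemset: calendar names}."""
    if not level:
        return {}
    items = sorted(set().union(*level))
    k = len(next(iter(level)))
    nxt = {}
    for combo in combinations(items, k + 1):
        subsets = [frozenset(s) for s in combinations(combo, k)]
        if not all(s in level for s in subsets):
            continue
        # Assumption: potential calendars are those shared by every k-subset.
        potential = set.intersection(*(level[s] for s in subsets))
        # Confirm each potential calendar by computing support per interval.
        confirmed = calendars_of(frozenset(combo), potential, calendars,
                                 by_interval, min_support)
        if confirmed:
            nxt[frozenset(combo)] = confirmed
    return nxt

# Hypothetical data and calendars.
by_interval = {
    "mon": [{"coffee", "donut"}, {"coffee", "donut"}, {"coffee"}],
    "tue": [{"coffee", "donut"}, {"tea"}],
    "wed": [{"tea"}, {"coffee"}],
}
calendars = {"early_week": ["mon", "tue"], "all_week": ["mon", "tue", "wed"]}

level1 = {}
for item in ("coffee", "donut", "tea"):
    cals = calendars_of(frozenset([item]), set(calendars), calendars, by_interval, 0.4)
    if cals:
        level1[frozenset([item])] = cals
level2 = grow(level1, calendars, by_interval, 0.4)
```

Restricting the candidates to calendars already confirmed for every subset means support only has to be recomputed for calendars that could still hold, which appears to be the point of reusing previously identified calendars.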
16. A system for identifying calendric association rules from transaction data having time-stamped items, comprising:
a computing device including memory for storing said transaction data;
said computing device operable to identify a plurality of time intervals as constituting a calendar of interest;
said computing device further operable to identify large itemsets in each time unit, where a large itemset is an itemset that occurs in the transactions at least as frequently as a preselected threshold;
said computing device further operable to identify association rules from said large itemsets by determining if a preselected support and a confidence threshold has been satisfied; and
said computing device further operable to generate calendric association rules by examining identified association rules to determine which ones exhibit temporal patterns as specified by given calendars.
(Dependent claims: 17, 18, 19, 20, 21)
said association rules are of the form X→Y;
said preselected support is for a XY itemset; and
said confidence threshold is a ratio of support of Itemset XY over support of Itemset X.
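As a worked illustration of the ratio recited above, the support and confidence tests can be written as follows. The symbols supp, conf, s_min, and c_min are notation assumed here for illustration and do not appear in the claims, and the numbers are hypothetical.

```latex
\[
\operatorname{conf}(X \rightarrow Y) = \frac{\operatorname{supp}(XY)}{\operatorname{supp}(X)},
\qquad
X \rightarrow Y \text{ is accepted if } \operatorname{supp}(XY) \ge s_{\min}
\text{ and } \operatorname{conf}(X \rightarrow Y) \ge c_{\min}.
\]
\[
\text{Example: } \operatorname{supp}(XY) = \tfrac{20}{100} = 0.20,\quad
\operatorname{supp}(X) = \tfrac{40}{100} = 0.40
\;\Rightarrow\;
\operatorname{conf}(X \rightarrow Y) = \frac{0.20}{0.40} = 0.50 .
\]
```

With thresholds of, say, s_min = 0.10 and c_min = 0.60, this rule would meet the support test but fail the confidence test.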
18. The system according to claim 16, further including an interface for identifying said given calendars from a plurality of predefined calendars.
19. The system according to claim 18, wherein said interface is further operable to input a value for said confidence threshold and said preselected threshold.
20. The system according to claim 16, further including an interface for using calendar algebra to define interesting calendars as said given calendars.
21. The system according to claim 16, wherein:
said computing device is further operable to use said large itemsets and said preselected support to ascertain which calendars actually belong to said large itemsets;
said computing device is further operable to use ascertained calendars to determine what potential calendar association rules exist; and
said computing device is further operable to use potential calendar association rule information and computed support values to determine which of said calendars actually belong to association rules.
22. A system for identifying calendric association rules in transactional data with time-stamped data items, comprising:
a processor having a memory, said memory storing said transactional data;
said processor operable to identify a plurality of time intervals as constituting a calendar of interest;
said processor further operable to identify large itemsets in each time unit, where a large itemset is an itemset that occurs in the transactions at least as frequently as a preselected support threshold and an itemset includes at least one item type;
said processor further operable to identify calendars that belong to said large itemsets;
said processor further operable to generate potential calendars for increasingly larger item type itemsets by using previously identified calendars;
said processor further operable to compute support for said itemsets in each time unit to determine which of said potential calendars actually belong to said itemsets;
said processor further operable to use said itemsets and associated potential calendars to determine what potential calendar association rules exist; and
said processor further operable to use potential calendric association rule information and computed support values to determine which of said potential calendars actually belong to association rules.
(Dependent claims: 23, 24, 25)
Specification