System and method for temporal data mining
First Claim
1. A computer-readable storage medium tangibly embodying executable instructions for:
receiving as input a temporal data series comprising events with start times and end times, a set of allowed dwelling times, a threshold frequency of occurrence, and an expiry time, wherein the expiry time provides a criterion for occurrence of an episode;
finding all frequent episodes of a particular length in the temporal data series having dwelling times, as determined by the start and end times, within the allowed dwelling times, wherein a frequency of an episode is defined by counting the number of occurrences and dividing by the length of temporal data series;
in successive passes through the temporal data series;
incrementing the particular length to generate an increased length;
combining frequent episodes to create combined episodes of the increased length;
creating a set of candidate episodes from the combined episodes by removing combined episodes which have non-frequent sub-episodes;
identifying one or more occurrences of a candidate episode in the temporal data series, wherein the identifying comprises tracking, with a plurality of automata, whether an occurrence of a candidate episode occurs in the temporal data series;
generating a plurality of automata configured to track a non-interleaved occurrence of a candidate episode and whether an occurrence of the candidate episode occurs in the temporal data series, wherein the non-interleaved occurrences include some overlapped occurrences but not all occurrences;
incrementing a count for each identified occurrence;
determining frequent episodes of the increased length;
setting the particular length to the increased length; and
producing an output for frequent episodes, wherein a frequent episode is an episode whose count of occurrences results in a frequency meeting or exceeding the threshold frequency of occurrence.
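The combining and pruning steps recited above can be illustrated with a minimal Python sketch. It assumes that episodes are serial and represented as tuples of event-type labels, and that two episodes are combined when the suffix of one matches the prefix of the other (an Apriori-style join). Neither the representation nor the join rule is stated in the claim, and the function name generate_candidates is hypothetical.

```python
def generate_candidates(frequent):
    """Build candidate episodes one element longer than the given frequent episodes.

    `frequent` is a collection of equal-length tuples of event-type labels
    (an assumed representation of serial episodes).
    """
    frequent = set(frequent)

    # Combine: join episode a with episode b when a without its first element
    # equals b without its last element (assumed join rule).
    combined = set()
    for a in frequent:
        for b in frequent:
            if a[1:] == b[:-1]:
                combined.add(a + (b[-1],))

    # Prune: keep a combined episode only if every sub-episode obtained by
    # deleting one element is itself frequent.
    candidates = set()
    for episode in combined:
        sub_episodes = {episode[:i] + episode[i + 1:] for i in range(len(episode))}
        if sub_episodes <= frequent:
            candidates.add(episode)
    return candidates
```

For example, with the frequent episodes ("A", "B"), ("B", "C") and ("A", "C"), the sketch yields the single candidate ("A", "B", "C"); if ("A", "C") is not frequent, the combined episode is removed and no candidate remains.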
Abstract
A system for temporal data mining includes a computer readable medium having an application configured to receive at an input module a temporal data series and a threshold frequency. The system is further configured to identify, using a candidate identification and tracking module, one or more occurrences in the temporal data series of a candidate episode and increment a count for each identified occurrence. The system is also configured to produce at an output module an output for those episodes whose count of occurrences results in a frequency exceeding the threshold frequency.
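The frequency test can be sketched in a few lines of Python. This is an illustration only: it assumes the length of the temporal data series is measured as a number of events (the text does not say whether it is an event count or a time span), it applies the "meeting or exceeding" comparison used in the claims, and the name frequent_episodes is hypothetical.

```python
def frequent_episodes(counts, series_length, threshold):
    """Return the episodes whose frequency meets or exceeds the threshold.

    counts: mapping from episode to its number of counted occurrences.
    series_length: length of the temporal data series (assumed: event count).
    threshold: threshold frequency of occurrence.
    """
    if series_length <= 0:
        raise ValueError("series_length must be positive")
    result = {}
    for episode, count in counts.items():
        frequency = count / series_length  # occurrences divided by series length
        if frequency >= threshold:
            result[episode] = frequency
    return result
```

With counts of 3 for ("A", "B") and 1 for ("B", "C") in a series of length 10 and a threshold of 0.2, only ("A", "B") is retained, with frequency 0.3.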
9 Claims
4. A system for temporal data mining, comprising:
a computer readable storage medium that includes an application configured to:
receive at an input module a temporal data series comprising events with start times and end times, a set of allowed dwelling times, and a threshold frequency of occurrence, wherein the input module receives an expiry time for providing a criterion for occurrence of an episode;
identify, using a candidate identification and tracking module, one or more occurrences in the temporal data series of a candidate episode;
find all frequent episodes of a particular length in the temporal data series having dwelling times, as determined by the start and end times, within the allowed dwelling times, wherein a frequency of an episode is defined by counting the number of occurrences and dividing by the length of temporal data series;
in successive passes through the temporal data series;
increment the particular length to generate an increased length;
combine frequent episodes to create combined episodes of the increased length;
create a set of candidate episodes from the combined episodes by removing combined episodes which have non-frequent sub-episodes;
identify one or more occurrences of a candidate episode in the temporal data series, wherein the identifying comprises tracking, with a plurality of automata, whether an occurrence of a candidate episode occurs in the temporal data series;
generate, using an automata generation module, a plurality of automata configured to track a non-interleaved occurrence of a candidate episode and whether an occurrence of a candidate episode occurs in the temporal data series, wherein the non-interleaved occurrences include some overlapped occurrences but not all occurrences;
increment a count for each identified occurrence;
determine frequent episodes of the increased length;
set the particular length to the increased length; and
produce at an output module an output for those episodes whose count of occurrences results in a frequency meeting or exceeding the threshold frequency of occurrence.
7. An apparatus for temporal data mining, comprising:
a processor for executing instructions;
a memory device including instructions comprising:
input instructions for receiving a temporal data series comprising events with start times and end times, a set of allowed dwelling times, a threshold frequency of occurrence, and an expiry time, wherein the expiry time provides a criterion for occurrence of an episode;
candidate identification and tracking instructions for identifying one or more occurrences in the temporal data series of a candidate episode, finding all frequent episodes of a particular length in the temporal data series having dwelling times, as determined by the start and end times, within the allowed dwelling times, wherein a frequency of an episode is defined by counting the number of occurrences and dividing by the length of temporal data series;
in successive passes through the temporal data series;
increment the particular length to generate an increased length;
combine frequent episodes to create combined episodes of the increased length;
create a set of candidate episodes from the combined episodes by removing combined episodes which have non-frequent sub-episodes;
identify one or more occurrences of a candidate episode in the temporal data series, wherein the identifying comprises tracking, with a plurality of automata, whether an occurrence of a candidate episode occurs in the temporal data series;
instructions for generating a plurality of automata configured to track a non-interleaved occurrence of a candidate episode and whether an occurrence of a candidate episode occurs in the temporal data series, wherein the non-interleaved occurrences include some overlapped occurrences but not all occurrences;
increment a count for each identified occurrence;
determine frequent episodes of the increased length;
set the particular length to the increased length; and
output instructions for producing an output for those episodes whose count of occurrences results in a frequency meeting or exceeding the threshold frequency of occurrence.
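The automata-based tracking recited in the claims can be illustrated, in simplified form, by the Python sketch below. It is not the claimed non-interleaved counting: for brevity it counts non-overlapped occurrences of a serial episode, which is a stricter count. It further assumes that events arrive as (event_type, start, end) tuples ordered by start time, that the allowed dwelling times are given as (low, high) intervals, and that the expiry time bounds the span between the start of the first event and the start of the current event of an occurrence. These conventions and the name count_occurrences are hypothetical.

```python
def count_occurrences(events, episode, allowed_dwell, expiry):
    """Count non-overlapped occurrences of a serial episode (simplified sketch)."""
    n = len(episode)
    # One automaton per state: number of matched events -> start time of the
    # first matched event. Only the most recently started automaton is kept.
    waiting = {}
    count = 0
    for etype, start, end in events:
        dwell = end - start
        # Ignore events whose dwelling time is outside every allowed interval.
        if not any(lo <= dwell <= hi for lo, hi in allowed_dwell):
            continue
        # Discard automata that have outlived the expiry time.
        waiting = {s: t0 for s, t0 in waiting.items() if start - t0 <= expiry}
        if n == 1:
            if etype == episode[0]:
                count += 1
            continue
        completed = False
        # Advance the most advanced automata first, so that a single event
        # moves each automaton forward by at most one state.
        for state in sorted(waiting, reverse=True):
            if episode[state] != etype:
                continue
            t0 = waiting.pop(state)
            if state + 1 == n:
                completed = True
                break
            waiting[state + 1] = t0
        if completed:
            count += 1
            waiting.clear()  # non-overlapped: start afresh after an occurrence
        elif etype == episode[0]:
            waiting[1] = start  # spawn a new automaton on the first event type
    return count
```

In this sketch a newer automaton replaces an older one in the same state because, under the assumed expiry rule, the later start time is always the one less likely to expire before the episode completes.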
Specification