Systems and methods for discovering mutual dependence patterns
Abstract
A new form of pattern is provided, referred to as a mutual dependence pattern or m-pattern. The m-pattern captures mutual dependence among a set of items. Intuitively, the m-pattern represents a set of items that often occur together. In our experience, such m-patterns often provide great value for certain tasks, such as event correlation in event management. Further, an efficient algorithm is provided for discovering all m-patterns in data for a given minimum mutual dependence threshold. Specifically, a linear algorithm is provided for testing whether a pattern is an m-pattern. Further, a pruning algorithm is provided that prunes the search space effectively. Still further, a level-wise algorithm for mining m-patterns is provided.
36 Claims
1. A computer-based method of mining one or more patterns in an input data set of items, the method comprising the steps of:
identifying sets of items in the input data set as mutual dependence patterns based on respective comparisons of conditional probability values associated with each of the sets of items to a predetermined mutual dependence threshold value; and
outputting the identified mutual dependence patterns based on results of the comparisons.
2. A computer-based method of mining one or more patterns in an input data set of items, the method comprising the steps of:
identifying one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
outputting the one or more identified patterns based on results of the comparisons;
wherein the identifying step further comprises identifying a set of items in the input data set, which includes at least two subsets of at least one item, as a pattern when the set of items has a conditional probability value computed therefor that is not less than a predetermined threshold value, wherein the conditional probability value is indicative of a probability that both of the at least two subsets of at least one item will occur given that one of the at least two subsets of at least one item has occurred.
Dependent claims: 3, 4, 5, 6
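The conditional probability recited in claim 2 can be illustrated with a minimal sketch. It assumes the input data set is a list of transactions (each a set of items) and that probabilities are estimated from occurrence counts; the function names and the transaction representation are illustrative assumptions, not taken from the claims.

```python
from typing import Hashable, Iterable, List, Set


def support(transactions: List[Set[Hashable]], itemset: Iterable[Hashable]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def conditional_probability(transactions, given, other) -> float:
    """Estimate P(both subsets occur | `given` occurs) from counts."""
    base = support(transactions, given)
    if base == 0:
        return 0.0  # `given` never occurs, so there is no evidence either way
    return support(transactions, set(given) | set(other)) / base


def is_pattern_for_split(transactions, subset_a, subset_b, threshold) -> bool:
    """True when the estimate is not less than the threshold."""
    return conditional_probability(transactions, subset_a, subset_b) >= threshold
```

Note that the estimate is directional: the value given `subset_a` generally differs from the value given `subset_b`.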
7. A computer-based method of mining one or more patterns in an input data set of items, the method comprising the steps of:
identifying one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
outputting the one or more identified patterns based on results of the comparisons;
wherein the identifying step further comprises identifying a set of items in the input data set as a pattern when the set of items has a conditional probability value computed for the set of items minus a particular item of the set, given the particular item of the set, that is not less than a predetermined threshold value.
Dependent claims: 8
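The per-item test of claim 7 can be sketched as follows. Because the set minus an item, taken together with that item, is the whole set, the conditional probability reduces to count(set) / count(item). Applying the test to every item of the set, which takes one comparison per item, is an assumption made here to match the linear test mentioned in the abstract; the claim itself recites only "a particular item". All names are illustrative.

```python
from typing import Hashable, Iterable, List, Set


def support(transactions: List[Set[Hashable]], itemset: Iterable[Hashable]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def passes_item_test(transactions, itemset, item, threshold) -> bool:
    """Check P(itemset minus item | item) >= threshold, estimated from counts."""
    item_count = support(transactions, {item})
    if item_count == 0:
        return False  # the item never occurs, so the set cannot occur either
    return support(transactions, itemset) / item_count >= threshold


def is_m_pattern(transactions, itemset, threshold) -> bool:
    """Assumed usage: require the per-item test to hold for every item."""
    return all(
        passes_item_test(transactions, itemset, item, threshold)
        for item in itemset
    )
```

With this formulation the weakest item, the one that occurs most often on its own, determines whether the whole set qualifies.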
9. A computer-based method of mining one or more patterns in an input data set of items, the method comprising the steps of:
obtaining an input data set of items;
searching the input data set of items to identify sets of items in the input data set as mutual dependence patterns based on respective comparisons of conditional probability values associated with each of the sets of items to a predetermined mutual dependence threshold value; and
outputting the identified mutual dependence patterns based on results of the comparisons.
10. A computer-based method of mining one or more patterns in an input data set of items, the method comprising the steps of:
obtaining an input data set of items;
normalizing the input data set;
searching the input data set of items to identify one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
outputting the one or more identified patterns based on results of the comparisons, wherein the input data set comprises event data and the normalizing step comprises transforming at least a portion of the event data into event classes such that the event data is non-application-dependent.
Dependent claims: 11, 12, 13, 14
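One way the normalizing step of claim 10 might look is sketched below. The claim does not say how event data is mapped to event classes, so the rule table, the class names, and the `app` and `message` fields are all hypothetical; the sketch only shows application-specific messages collapsing into shared, application-independent classes.

```python
import re
from typing import Dict, List

# Hypothetical rules: (pattern over the raw message, event class name).
CLASS_RULES = [
    (re.compile(r"link\s+down|interface \S+ down", re.IGNORECASE), "LINK_DOWN"),
    (re.compile(r"cpu\b.*\b(high|threshold)", re.IGNORECASE), "CPU_HIGH"),
]


def normalize_event(raw: Dict[str, str]) -> str:
    """Map one raw event to an event class, ignoring which application sent it."""
    for pattern, event_class in CLASS_RULES:
        if pattern.search(raw["message"]):
            return event_class
    return "OTHER"  # fallback class for messages no rule recognizes


def normalize(events: List[Dict[str, str]]) -> List[str]:
    """Normalize a sequence of raw events into event classes."""
    return [normalize_event(e) for e in events]
```

After normalization, two differently worded reports of the same condition become the same item, so they can be counted together during mining.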
15. A computer-based method of mining one or more patterns in an input data set of items, the method comprising the steps of:
obtaining an input data set of items;
searching the input data set of items to identify one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
outputting the one or more identified patterns based on results of the comparisons;
wherein the searching step further comprises the step of performing a level-wise scan based on a set length to determine candidate sets of items in the input data set that have conditional probability values respectively computed therefor that are not less than the predetermined threshold value.
Dependent claims: 16, 17
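The level-wise scan of claim 15 can be sketched as a loop over increasing set length. This is an illustration under stated assumptions, not the patented procedure: the qualifying test is the count-based per-item ratio count(set) / count(item), and candidates of length k are built only from sets of length k-1 that qualified. For this particular test that pruning is safe, because any subset of a qualifying set has at least as large a count and no larger per-item counts. All names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Hashable, Iterable, List, Optional, Set


def support(transactions: List[Set[Hashable]], itemset: Iterable[Hashable]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def qualifies(transactions, itemset, threshold) -> bool:
    """Per-item ratio test: count(set) / count(item) >= threshold for all items."""
    full = support(transactions, itemset)
    if full == 0:
        return False
    return all(full / support(transactions, {i}) >= threshold for i in itemset)


def level_wise_scan(transactions, threshold,
                    max_length: Optional[int] = None) -> List[FrozenSet]:
    """Return qualifying sets of length >= 2, found one length at a time."""
    items = {item for t in transactions for item in t}
    current = [frozenset([i]) for i in items]  # level 1: single items
    found: List[FrozenSet] = []
    length = 1
    while current and (max_length is None or length < max_length):
        length += 1
        previous = set(current)
        # Join pairs of surviving sets whose union has exactly `length` items.
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == length}
        # Prune candidates that have a non-qualifying subset one item shorter.
        candidates = {c for c in candidates
                      if all(c - {x} in previous for x in c)}
        current = [c for c in candidates if qualifies(transactions, c, threshold)]
        found.extend(current)
    return found
```

The scan stops as soon as a level yields no qualifying sets, so a stricter threshold tends to end the search at shorter lengths.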
18. Apparatus for mining one or more patterns in an input data set of items, the apparatus comprising:
at least one processor operative to:
(i) identify sets of items in the input data set as mutual dependence patterns based on respective comparisons of conditional probability values associated with each of the sets of items to a predetermined mutual dependence threshold value; and
(ii) output the identified mutual dependence patterns based on results of the comparisons; and
a memory, coupled to the at least one processor, which stores at least one of the input data set and the identified mutual dependence patterns.
19. Apparatus for mining one or more patterns in an input data set of items, the apparatus comprising:
at least one processor operative to:
(i) identify one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
(ii) output the one or more identified patterns based on results of the comparisons; and
a memory, coupled to the at least one processor, which stores at least one of the input data set and the one or more identified patterns;
wherein the identifying operation further comprises identifying a set of items in the input data set, which includes at least two subsets of at least one item, as a pattern when the set of items has a conditional probability value computed therefor that is not less than a predetermined threshold value, wherein the conditional probability value is indicative of a probability that both of the at least two subsets of at least one item will occur given that one of the at least two subsets of at least one item has occurred.
Dependent claims: 20, 21, 22, 23
24. Apparatus for mining one or more patterns in an input data set of items, the apparatus comprising:
at least one processor operative to:
(i) identify one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
(ii) output the one or more identified patterns based on results of the comparisons; and
a memory, coupled to the at least one processor, which stores at least one of the input data set and the one or more identified patterns;
wherein the identifying operation further comprises identifying a set of items in the input data set as a pattern when the set of items has a conditional probability value computed for the set of items minus a particular item of the set, given the particular item of the set, that is not less than a predetermined threshold value.
Dependent claims: 25
26. Apparatus for mining one or more patterns in an input data set of items, the apparatus comprising:
at least one processor operative to:
(i) obtain an input data set of items;
(ii) search the input data set of items to identify sets of items in the input data set as mutual dependence patterns based on respective comparisons of conditional probability values associated with each of the sets of items to a predetermined mutual dependence threshold value; and
(iii) output the identified mutual dependence patterns based on results of the comparisons; and
a memory, coupled to the at least one processor, which stores at least one of the input data set and the identified mutual dependence patterns.
27. Apparatus for mining one or more patterns in an input data set of items, the apparatus comprising:
at least one processor operative to:
(i) obtain an input data set of items;
(ii) search the input data set of items to identify one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
(iii) output the one or more identified patterns based on results of the comparisons; and
a memory, coupled to the at least one processor, which stores at least one of the input data set and the one or more identified patterns;
wherein the at least one processor is further operative to, prior to the searching operation, normalize the input data set, and further wherein the input data set comprises event data and the normalizing operation comprises transforming at least a portion of the event data into event classes such that the event data is non-application-dependent.
Dependent claims: 28, 29, 30, 31
32. Apparatus for mining one or more patterns in an input data set of items, the apparatus comprising:
at least one processor operative to:
(i) obtain an input data set of items;
(ii) search the input data set of items to identify one or more sets of items in the input data set as one or more patterns based on respective comparisons of conditional probability values associated with each of the one or more sets of items to a predetermined threshold value; and
(iii) output the one or more identified patterns based on results of the comparisons; and
a memory, coupled to the at least one processor, which stores at least one of the input data set and the one or more identified patterns;
wherein the searching operation further comprises performing a level-wise scan based on a set length to determine candidate sets of items in the input data set that have conditional probability values respectively computed therefor that are not less than the predetermined threshold value.
Dependent claims: 33, 34
35. An article of manufacture for mining one or more patterns in an input data set of items, the article comprising a machine readable medium containing one or more programs which when executed implement the steps of:
identifying sets of items in the input data set as mutual dependence patterns based on respective comparisons of conditional probability values associated with each of the sets of items to a predetermined mutual dependence threshold value; and
outputting the identified mutual dependence patterns based on results of the comparisons.
36. An article of manufacture for mining one or more patterns in an input data set of items, the article comprising a machine readable medium containing one or more programs which when executed implement the steps of:
obtaining an input data set of items;
searching the input data set of items to identify sets of items in the input data set as mutual dependence patterns based on respective comparisons of conditional probability values associated with each of the sets of items to a predetermined mutual dependence threshold value; and
outputting the identified mutual dependence patterns based on results of the comparisons.
Specification