Determination of Rules by Providing Data Records in Columnar Data Structures
First Claim
1. A computer-implemented method for determining first rules, wherein each first rule comprises a source attribute-value pair and a destination attribute-value pair, the method comprising:
providing a columnar database, the columnar database comprising a plurality of columnar data structures, each columnar data structure being associated with one column attribute and comprising one or more column entries;
providing first data records, the first data records being stored in the columnar database, each first data record having a plurality of first attribute-value pairs, wherein each value of said first attribute-value pairs is stored in one of the columnar data structures associated with the respective column attribute, wherein each column entry is associated with one value of the respective column attribute and comprises counting information, the counting information being indicative of a number of first data records having the respective first attribute-value pair;
providing mask data structures, each mask data structure having the same structure as one of the columnar data structures, the mask data structures comprising one or more second attribute-value pairs;
selecting second data records as a sub-set of the first data records by intersecting the columnar data structures and the mask data structures, the second data records selectively comprising first data records which comprise at least one first attribute-value pair matching one of the one or more second attribute-value pairs;
selecting one of the column attributes and one value being contained in the column data structure associated with said selected column attribute as the destination attribute-value pair;
creating one second rule for each first attribute-value pair of the second data records, wherein said first attribute-value pair is used as source attribute-value of said second rule and wherein the selected destination attribute-value pair is used as the destination attribute-value pair of said second rule;
calculating, for each second rule, a co-occurrence-count between its respective source attribute-value pair and its destination attribute-value pair; and
specifically selecting one or more of said second rules as the first rules in dependence on the calculated co-occurrence-count.
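As an illustration only, the columnar layout and the mask intersection recited above might be sketched in Python as follows. The claim does not prescribe an implementation. The function names `build_columns` and `select_by_mask`, the use of record-ID sets as the counting information (the count being the size of the set), the simplified mask shape (attribute to set of values), and the sample records are all assumptions made for this sketch.

```python
from collections import defaultdict


def build_columns(records):
    """Build one columnar structure per attribute (hypothetical layout).

    Result shape: {attribute: {value: set_of_record_ids}}.
    The size of each set plays the role of the counting information.
    """
    columns = defaultdict(lambda: defaultdict(set))
    for record_id, record in enumerate(records):
        for attribute, value in record.items():
            columns[attribute][value].add(record_id)
    return columns


def select_by_mask(columns, mask):
    """Return IDs of records matching at least one masked attribute-value pair.

    `mask` mirrors the column layout in simplified form: {attribute: set_of_values}.
    """
    selected = set()
    for attribute, values in mask.items():
        for value in values:
            # .get() avoids creating new entries in the defaultdict
            selected |= columns.get(attribute, {}).get(value, set())
    return selected


# Sample "first data records" (invented for illustration)
records = [
    {"color": "red", "size": "L", "bought": "yes"},
    {"color": "red", "size": "M", "bought": "no"},
    {"color": "blue", "size": "L", "bought": "yes"},
    {"color": "blue", "size": "M", "bought": "yes"},
]

columns = build_columns(records)
mask = {"color": {"red"}}  # the "second attribute-value pairs"
selected_ids = select_by_mask(columns, mask)
second_records = [records[i] for i in sorted(selected_ids)]
```

With these sample records, the column entry for ("color", "red") refers to two records, and the mask selects exactly those two records as the second data records.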
Abstract
A method includes providing a columnar database comprising a plurality of columnar data structures associated with one column attribute; providing first data records having a plurality of first attribute-value pairs comprising counting information indicative of a number of first data records having the respective first attribute-value pair; providing mask data structures comprising one or more second attribute-value pairs; selecting second data records by intersecting the columnar data structures and the mask data structures; selecting one of the column attributes and one value contained in the column data structure associated with said selected column attribute as the destination attribute-value pair; creating one second rule for each first attribute-value pair; calculating, for each second rule, a co-occurrence-count between its respective source attribute-value pair and its destination attribute-value pair; and specifically selecting one or more of said second rules as the first rules in dependence on the calculated co-occurrence-count.
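The rule-creation, co-occurrence-count, and selection steps summarized in the abstract could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `mine_rules`, the `min_count` threshold used as the selection criterion, the choice to skip the destination attribute when forming source pairs, and the sample records are all hypothetical.

```python
from collections import Counter


def mine_rules(second_records, destination, min_count=2):
    """Create candidate rules and keep those with a high enough co-occurrence count.

    second_records: list of dicts (the already selected sub-set of records).
    destination:    (attribute, value) tuple chosen as the destination pair.
    min_count:      assumed selection threshold (not specified by the source).
    Returns {(source_pair, destination_pair): co_occurrence_count}.
    """
    dest_attribute, dest_value = destination
    co_occurrence = Counter()
    for record in second_records:
        has_destination = record.get(dest_attribute) == dest_value
        for attribute, value in record.items():
            if attribute == dest_attribute:
                continue  # assumption: do not pair the destination attribute with itself
            source = (attribute, value)
            # Every source pair becomes a candidate rule, even with count 0
            co_occurrence[source] += 1 if has_destination else 0
    return {
        (source, destination): count
        for source, count in co_occurrence.items()
        if count >= min_count
    }


# Sample "second data records" (invented for illustration)
second_records = [
    {"color": "red", "size": "L", "bought": "yes"},
    {"color": "red", "size": "M", "bought": "no"},
    {"color": "red", "size": "L", "bought": "yes"},
]

rules = mine_rules(second_records, destination=("bought", "yes"), min_count=2)
for (source, dest), count in sorted(rules.items()):
    print(f"{source} -> {dest}: co-occurrence {count}")
```

In this sample, the candidate sources are ("color", "red"), ("size", "L"), and ("size", "M"). Only the first two co-occur with ("bought", "yes") in at least two records, so only those two rules are selected.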
20 Claims
1. A computer-implemented method for determining first rules, wherein each first rule comprises a source attribute-value pair and a destination attribute-value pair, the method comprising:
providing a columnar database, the columnar database comprising a plurality of columnar data structures, each columnar data structure being associated with one column attribute and comprising one or more column entries;
providing first data records, the first data records being stored in the columnar database, each first data record having a plurality of first attribute-value pairs, wherein each value of said first attribute-value pairs is stored in one of the columnar data structures associated with the respective column attribute, wherein each column entry is associated with one value of the respective column attribute and comprises counting information, the counting information being indicative of a number of first data records having the respective first attribute-value pair;
providing mask data structures, each mask data structure having the same structure as one of the columnar data structures, the mask data structures comprising one or more second attribute-value pairs;
selecting second data records as a sub-set of the first data records by intersecting the columnar data structures and the mask data structures, the second data records selectively comprising first data records which comprise at least one first attribute-value pair matching one of the one or more second attribute-value pairs;
selecting one of the column attributes and one value being contained in the column data structure associated with said selected column attribute as the destination attribute-value pair;
creating one second rule for each first attribute-value pair of the second data records, wherein said first attribute-value pair is used as source attribute-value of said second rule and wherein the selected destination attribute-value pair is used as the destination attribute-value pair of said second rule;
calculating, for each second rule, a co-occurrence-count between its respective source attribute-value pair and its destination attribute-value pair; and
specifically selecting one or more of said second rules as the first rules in dependence on the calculated co-occurrence-count.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A computer program product for determining first rules, wherein each first rule comprises a source attribute-value pair and a destination attribute-value pair, the computer program product comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code being configured for:
providing a columnar database, the columnar database comprising a plurality of columnar data structures, each columnar data structure being associated with one column attribute and comprising one or more column entries;
providing first data records, the first data records being stored in the columnar database, each first data record having a plurality of first attribute-value pairs, wherein each value of said first attribute-value pairs is stored in one of the columnar data structures associated with the respective column attribute, wherein each column entry is associated with one value of the respective column attribute and comprises counting information, the counting information being indicative of a number of first data records having the respective first attribute-value pair;
providing mask data structures, each mask data structure having the same structure as one of the columnar data structures, the mask data structures comprising one or more second attribute-value pairs;
selecting second data records as a sub-set of the first data records by intersecting the columnar data structures and the mask data structures, the second data records selectively comprising first data records which comprise at least one first attribute-value pair matching one of the one or more second attribute-value pairs;
selecting one of the column attributes and one value being contained in the column data structure associated with said selected column attribute as the destination attribute-value pair;
creating one second rule for each first attribute-value pair of the second data records, wherein said first attribute-value pair is used as source attribute-value of said second rule and wherein the selected destination attribute-value pair is used as the destination attribute-value pair of said second rule;
calculating, for each second rule, a co-occurrence-count between its respective source attribute-value pair and its destination attribute-value pair; and
specifically selecting one or more of said second rules as the first rules in dependence on the calculated co-occurrence-count.
Dependent claims: 15, 16, 17, 18, 19
20. A data processing system being operatively coupled to a columnar database, the data processing system comprising a processor, the system configured to perform a method comprising:
providing a columnar database, the columnar database comprising a plurality of columnar data structures, each columnar data structure being associated with one column attribute and comprising one or more column entries;
providing first data records, the first data records being stored in the columnar database, each first data record having a plurality of first attribute-value pairs, wherein each value of said first attribute-value pairs is stored in one of the columnar data structures associated with the respective column attribute, wherein each column entry is associated with one value of the respective column attribute and comprises counting information, the counting information being indicative of a number of first data records having the respective first attribute-value pair;
providing mask data structures, each mask data structure having the same structure as one of the columnar data structures, the mask data structures comprising one or more second attribute-value pairs;
selecting second data records as a sub-set of the first data records by intersecting the columnar data structures and the mask data structures, the second data records selectively comprising first data records which comprise at least one first attribute-value pair matching one of the one or more second attribute-value pairs;
selecting one of the column attributes and one value being contained in the column data structure associated with said selected column attribute as the destination attribute-value pair;
creating one second rule for each first attribute-value pair of the second data records, wherein said first attribute-value pair is used as source attribute-value of said second rule and wherein the selected destination attribute-value pair is used as the destination attribute-value pair of said second rule;
calculating, for each second rule, a co-occurrence-count between its respective source attribute-value pair and its destination attribute-value pair; and
specifically selecting one or more of said second rules as the first rules in dependence on the calculated co-occurrence-count.
Specification