Discretization of dimension attributes using data mining techniques
Abstract
To allow data in dimension attributes to be used for grouping members of a dimension, the dimension attribute data is analyzed so that it can be treated as data for a categorical attribute with a manageable number of states. The possible values of the dimension attribute are divided into groups. This is done by determining the distribution of the data: an approximate distribution may be determined by sampling some of the data, or an actual distribution may be determined by examining all of the data. The distribution is then used to determine the groups (subranges) into which the range of data values is divided. Each group is then treated as a state of a categorical-type dimension attribute. The state for a member is found by determining which subrange contains the member's value for the dimension attribute. The number of groups can be set by a user or determined dynamically, e.g., to best fit the distribution found. The group data may be stored so that future cases can be converted in the same way.
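The steps described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: it assumes a numeric attribute, assumes equal-frequency (quantile) cut points as the way the distribution is used, and the names `find_boundaries`, `assign_group`, and the sample member data are hypothetical.

```python
import bisect
import random
import statistics


def find_boundaries(values, num_groups, sample_size=None, seed=0):
    """Estimate num_groups - 1 cut points from the distribution of values.

    If sample_size is given, only a random sample is examined (approximate
    distribution); otherwise every value is used (actual distribution).
    """
    data = list(values)
    if sample_size is not None and sample_size < len(data):
        data = random.Random(seed).sample(data, sample_size)
    # statistics.quantiles returns the n - 1 cut points that split the data
    # into n groups of roughly equal frequency (an assumed grouping rule).
    return statistics.quantiles(data, n=num_groups, method="inclusive")


def assign_group(value, boundaries):
    """Return the index of the group (subrange) that contains value."""
    return bisect.bisect_right(boundaries, value)


# Hypothetical dimension members with a numeric "income" attribute.
incomes = {
    "Ann": 10, "Bob": 20, "Cy": 30, "Dee": 40, "Eve": 50,
    "Fay": 60, "Gus": 70, "Hal": 80, "Ida": 90,
}
boundaries = find_boundaries(incomes.values(), num_groups=4)
groups = {name: assign_group(v, boundaries) for name, v in incomes.items()}
```

Each group index can then be used as if it were a state of a categorical attribute. Passing `sample_size` trades accuracy of the cut points for less work on large dimensions.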
30 Citations
32 Claims
1. A method for grouping members of a dimension for OLAP data, each of said members comprising a corresponding value for at least one dimension attribute, comprising:
determining a distribution of said corresponding values;
using said distribution to divide said corresponding values into at least two groups; and
determining a specific group from among said at least two groups to assign for a given one of said members, where said group contains said corresponding value for said given member.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-readable medium having computer-executable instructions for grouping members of a dimension for OLAP data, each of said members comprising a corresponding value for at least one dimension attribute, said instructions for performing steps comprising:
determining a distribution of said corresponding values;
using said distribution to divide said corresponding values into at least two groups; and
determining a specific group from among said at least two groups to assign for a given one of said members, where said group contains said corresponding value for said given member.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A data converter for grouping members of a dimension for OLAP data, each of said members comprising a corresponding value for at least one dimension attribute, comprising:
a distribution determiner for determining a distribution of said corresponding values;
a range divider for using said distribution to divide said corresponding values into at least two groups; and
a group assigner for determining a specific group from among said at least two groups to assign for a given one of said members, where said group contains said corresponding value for said given member.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32
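As an illustration only, the three components named in claim 23 could be laid out as separate classes. The sketch below is an assumption about one possible structure, not the design disclosed in the specification: the class and method names are hypothetical, the divider uses a simple cumulative-count rule, and the JSON round trip stands in for the stored group data mentioned in the abstract.

```python
import json
from bisect import bisect_left
from collections import Counter


class DistributionDeterminer:
    def determine(self, values):
        """Return the distribution as a sorted list of (value, count) pairs."""
        return sorted(Counter(values).items())


class RangeDivider:
    def divide(self, distribution, num_groups):
        """Return inclusive upper bounds for every group except the last.

        Assumed rule: close a group once the cumulative count reaches the
        next multiple of total / num_groups. Heavy ties can make the
        resulting groups uneven.
        """
        total = sum(count for _, count in distribution)
        bounds, running = [], 0
        for value, count in distribution:
            running += count
            target = total * (len(bounds) + 1) / num_groups
            if len(bounds) < num_groups - 1 and running >= target:
                bounds.append(value)
        return bounds


class GroupAssigner:
    def __init__(self, bounds):
        self.bounds = bounds

    def assign(self, value):
        """Return the index of the group whose subrange contains value."""
        return bisect_left(self.bounds, value)

    def to_json(self):
        """Serialize the group data so future cases can be converted."""
        return json.dumps(self.bounds)

    @classmethod
    def from_json(cls, text):
        return cls(json.loads(text))


# Hypothetical attribute values for eight members.
values = [1, 1, 2, 3, 5, 8, 13, 21]
distribution = DistributionDeterminer().determine(values)
assigner = GroupAssigner(RangeDivider().divide(distribution, num_groups=4))
restored = GroupAssigner.from_json(assigner.to_json())
```

Keeping the divider separate from the assigner means the grouping rule can be swapped without touching how members are mapped to groups, and a restored assigner can classify new members without recomputing the distribution.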
Specification