CUBE-BASED PERCENTILE CALCULATION
Abstract
By dividing data entries among multiple data collections, the data collection containing the data entry associated with a requested percentile can be identified by reference to the number of data entries within each collection. Initially, the range of values corresponding to the identified data collection can be presented as the value of the requested percentile. Should further detail be required, the value can be refined by averaging the range, by extrapolating (linearly or otherwise) estimated values for the data entries within the identified data collection, or by sorting the actual entries according to their values. A relational database can maintain the data entries, each comprising values along one or more dimensions, and an OLAP engine component can maintain a count of the data entries within the defined data collections.
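The three refinements named in the abstract can be illustrated with a minimal Python sketch. The function names, the 1-based rank convention, and the assumption that entries are spread evenly inside the collection are illustrative choices, not details taken from the patent.

```python
def refine_midpoint(lower, upper):
    """Refinement 1: average the bounds of the identified collection."""
    return (lower + upper) / 2


def refine_linear(lower, upper, count_below, count_in, rank):
    """Refinement 2: linear estimate inside the collection.

    Assumes (hypothetically) that the count_in entries are spread evenly
    between lower and upper. rank is the 1-based rank of the requested
    entry in the whole data set; count_below is the number of entries in
    all collections that precede the identified one.
    """
    position = rank - count_below  # 1-based position inside the collection
    return lower + (upper - lower) * position / count_in


def refine_exact(entries_in_collection, count_below, rank):
    """Refinement 3: sort only the entries of the identified collection."""
    ordered = sorted(entries_in_collection)
    return ordered[rank - count_below - 1]


# Hypothetical situation: the collection [10, 20) holds 3 entries,
# 4 entries lie below it, and the requested entry has rank 5.
print(refine_midpoint(10, 20))
print(refine_linear(10, 20, count_below=4, count_in=3, rank=5))
print(refine_exact([18, 12, 15], count_below=4, rank=5))
```

Only the third refinement touches the actual entries, and it sorts just the entries of one collection rather than the whole data set, which is where the approach would save work.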
20 Claims
1. One or more computer-readable media comprising computer-executable instructions for determining a percentile value of a set of data entries, the computer-executable instructions directed to steps comprising:
dividing the data entries into buckets such that a first data entry, having a first dimension value, is divided into a first bucket, having a first bucket lower bound and a first bucket upper bound, both along a first dimension, if the first dimension value is between the first bucket lower bound and the first bucket upper bound;
defining multiple data collections continuously arranged along the first dimension such that a first data collection has a first data collection lower bound and a first data collection upper bound, both along the first dimension;
associating the first bucket with the first data collection if the first bucket lower bound and the first bucket upper bound are both between the first data collection lower bound and the first data collection upper bound;
counting the data entries divided into the buckets;
aggregating, for the multiple data collections, the counted number of data entries for buckets associated with the multiple data collections such that, if the first bucket is associated with the first data collection, then the counted number of data entries divided into the first bucket is also counted for the first data collection;
identifying, based in part on the aggregated number of data entries for the multiple data collections, a total number of data entries in the set of data entries, and a requested percentile, a relevant data collection comprising a relevant data entry, the relevant data entry associated with the requested percentile; and
determining the percentile value to be one or more values between a relevant data collection lower bound and a relevant data collection upper bound, both along the first dimension.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
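The steps recited in claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: buckets and collections are described by sorted edge lists with half-open intervals, the rank of the relevant entry uses the nearest-rank convention, and all names (`bucket_index`, `percentile_range`, the sample edges) are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
import math


def bucket_index(value, bucket_edges):
    """Return i such that bucket_edges[i] <= value < bucket_edges[i + 1]."""
    i = bisect_right(bucket_edges, value) - 1
    if not 0 <= i < len(bucket_edges) - 1:
        raise ValueError(f"value {value} falls outside every bucket")
    return i


def percentile_range(values, bucket_edges, collection_edges, percentile):
    """Return (lower, upper) bounds of the collection holding the percentile."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")

    # Divide the entries into buckets and count the entries per bucket.
    bucket_counts = Counter(bucket_index(v, bucket_edges) for v in values)

    # Associate each bucket with the collection that fully contains it,
    # and aggregate the bucket counts per collection.
    n_collections = len(collection_edges) - 1
    collection_counts = [0] * n_collections
    for b, count in bucket_counts.items():
        lo, hi = bucket_edges[b], bucket_edges[b + 1]
        for c in range(n_collections):
            if collection_edges[c] <= lo and hi <= collection_edges[c + 1]:
                collection_counts[c] += count
                break
        else:
            raise ValueError(f"bucket [{lo}, {hi}) fits no collection")

    # Rank of the relevant entry (nearest-rank convention, an assumption).
    total = len(values)
    rank = max(1, math.ceil(percentile * total / 100))

    # Walk the cumulative counts until the relevant collection is reached.
    running = 0
    for c, count in enumerate(collection_counts):
        running += count
        if running >= rank:
            return collection_edges[c], collection_edges[c + 1]
    raise ValueError("no data entries")


values = [1, 3, 4, 7, 12, 15, 18, 22, 27, 35]
bucket_edges = [0, 5, 10, 15, 20, 25, 30, 35, 40]
collection_edges = [0, 10, 20, 40]
print(percentile_range(values, bucket_edges, collection_edges, 50))  # (10, 20)
```

In this example the collections hold 4, 3, and 3 entries; the median has rank 5, so it falls in the second collection and the range (10, 20) is reported without sorting any entries.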
9. One or more computer-readable media comprising computer-executable instructions providing for the determination of a percentile value of a set of data entries comprising values along multiple dimensions, the computer-executable instructions directed to steps comprising:
providing an interface for requesting a percentile directed to values along a first dimension, the interface enabling selection of one or more bounds along dimensions other than the first dimension;
counting data entries associated with one or more multi-dimensional data collections continuously arranged along each of the multiple dimensions;
identifying, based in part on the counted number of data entries for the one or more multi-dimensional data collections, a total number of data entries in the set of data entries, and a requested percentile, a relevant multi-dimensional data collection comprising a relevant data entry associated with the requested percentile; and
determining the percentile value to be one or more values between a lower bound, along the first dimension, of the relevant multi-dimensional data collection and an upper bound, along the first dimension, of the relevant multi-dimensional data collection.
Dependent claims: 10, 11, 12
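The multi-dimensional variant of claim 9 can be illustrated with a two-dimensional Python sketch. It assumes, hypothetically, that each entry is a pair of values, that the cube is a dictionary of counts keyed by cell indices, and that a bound along the second dimension selects only cells lying entirely inside it. The names `build_cube` and `percentile_range_filtered` and the sample data are invented for illustration.

```python
from bisect import bisect_right
from collections import Counter
import math


def cell_index(value, edges):
    """Return i such that edges[i] <= value < edges[i + 1]."""
    i = bisect_right(edges, value) - 1
    if not 0 <= i < len(edges) - 1:
        raise ValueError(f"value {value} falls outside every cell")
    return i


def build_cube(entries, edges_a, edges_b):
    """Count entries per two-dimensional cell (first dimension, second dimension)."""
    return Counter(
        (cell_index(a, edges_a), cell_index(b, edges_b)) for a, b in entries
    )


def percentile_range_filtered(cube, edges_a, edges_b, percentile, b_low, b_high):
    """Percentile range along the first dimension, bounded along the second."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")

    # Keep only second-dimension cells that lie inside the requested bounds.
    selected = [
        j for j in range(len(edges_b) - 1)
        if b_low <= edges_b[j] and edges_b[j + 1] <= b_high
    ]
    # Collapse the selected cells onto the first dimension.
    counts = [
        sum(cube.get((i, j), 0) for j in selected)
        for i in range(len(edges_a) - 1)
    ]
    total = sum(counts)
    if total == 0:
        raise ValueError("no data entries inside the selected bounds")

    # Nearest-rank convention (an assumption), then a cumulative walk.
    rank = max(1, math.ceil(percentile * total / 100))
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= rank:
            return edges_a[i], edges_a[i + 1]


# Hypothetical entries: (response time in ms, hour of day).
entries = [(5, 1), (15, 1), (25, 2), (35, 9), (45, 10), (55, 11), (12, 3), (8, 14)]
edges_a = [0, 10, 20, 30, 40, 50, 60]
edges_b = [0, 6, 12, 18, 24]
cube = build_cube(entries, edges_a, edges_b)
print(percentile_range_filtered(cube, edges_a, edges_b, 50, 0, 12))  # (20, 30)
```

Because the cube stores only counts, changing the bounds along the second dimension requires re-summing cells rather than revisiting the underlying entries.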
13. A method for determining a percentile value of a set of data entries comprising the steps of:
dividing the data entries into buckets such that a first data entry, having a first dimension value, is divided into a first bucket, having a first bucket lower bound and a first bucket upper bound, both along a first dimension, if the first dimension value is between the first bucket lower bound and the first bucket upper bound;
defining multiple data collections continuously arranged along the first dimension such that a first data collection has a first data collection lower bound and a first data collection upper bound, both along the first dimension;
associating the first bucket with the first data collection if the first bucket lower bound and the first bucket upper bound are both between the first data collection lower bound and the first data collection upper bound;
counting the data entries divided into the buckets;
aggregating, for the multiple data collections, the counted number of data entries for buckets associated with the multiple data collections such that, if the first bucket is associated with the first data collection, then the counted number of data entries divided into the first bucket is also counted for the first data collection;
identifying, based in part on the aggregated number of data entries for the multiple data collections, a total number of data entries in the set of data entries, and a requested percentile, a relevant data collection comprising a relevant data entry, the relevant data entry associated with the requested percentile; and
determining the percentile value to be one or more values between a relevant data collection lower bound and a relevant data collection upper bound, both along the first dimension.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification