Dynamic data stream histograms for large ranges
First Claim
1. A system for creating a histogram from a data set comprising a plurality of data elements, comprising:
a processor;
a plurality of internal buckets, wherein each internal bucket of the plurality of internal buckets represents values between an internal minimum value and an internal maximum value, wherein a plurality of differences of the internal minimum value and the internal maximum value of each internal bucket are heterogeneous; and
a histogram engine configured to be executed by the processor to:
  populate the plurality of internal buckets with the plurality of data elements based on the internal minimum value and the internal maximum value of each internal bucket to obtain a plurality of populated internal buckets; and
  create the histogram comprising a plurality of external buckets, wherein each of the plurality of external buckets comprises a total sum of at least two of the plurality of populated internal buckets, wherein creating the histogram comprises:
    identifying a minimum value and a maximum value for the plurality of data elements,
    calculating a range according to the minimum value and the maximum value,
    dividing the range into the plurality of external buckets,
    for each of the plurality of external buckets:
      calculating a starting internal bucket of the plurality of internal buckets and an ending internal bucket of the plurality of internal buckets for each of a plurality of external buckets using the range,
      summing each value of a subset of the plurality of internal buckets between the starting internal bucket and ending internal bucket to create a total sum, and
      outputting the total sum,
    wherein calculating the starting internal bucket and the ending internal bucket is based on a number of the plurality of external buckets and the range,
    wherein the number of the plurality of external buckets is determined from a parameter in a command to create the histogram.
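The aggregation step recited in claim 1 can be illustrated with a short Python sketch. This is only one possible reading of the claim language, not the patented implementation. The function name `external_histogram`, its parameters, and the rule that assigns each internal bucket to exactly one external bucket by its lower edge are all assumptions made for illustration. The sketch also does not enforce the claim's condition that each external bucket sums at least two internal buckets.

```python
from bisect import bisect_left, bisect_right


def external_histogram(edges, counts, lo, hi, n_external):
    """Collapse populated internal buckets into n_external equal-width buckets.

    edges      -- sorted internal bucket boundaries (hypothetical layout);
                  counts[i] covers the half-open interval [edges[i], edges[i+1])
    lo, hi     -- minimum and maximum value observed in the data
    n_external -- number of external buckets (e.g. taken from a command parameter)
    """
    if n_external < 1 or hi <= lo:
        raise ValueError("need n_external >= 1 and hi > lo")
    if lo < edges[0] or hi > edges[-1]:
        raise ValueError("lo and hi must lie within the internal bucket edges")

    width = (hi - lo) / n_external          # range divided into external buckets
    first = bisect_right(edges, lo) - 1     # internal bucket containing lo
    last = bisect_right(edges, hi) - 1      # internal bucket containing hi

    totals = []
    for k in range(n_external):
        # Starting internal bucket: the first one whose lower edge is at or
        # above this external bucket's lower bound (assumed assignment rule).
        start = first if k == 0 else bisect_left(edges, lo + k * width)
        # Ending internal bucket: the last one whose lower edge is below the
        # external bucket's upper bound.
        if k == n_external - 1:
            end = last
        else:
            end = bisect_left(edges, lo + (k + 1) * width) - 1
        totals.append(sum(counts[start:end + 1]))
    return totals


# Internal buckets with heterogeneous widths 1, 1, 2, 4, 8, 16, 32.
edges = [0, 1, 2, 4, 8, 16, 32, 64]
counts = [3, 1, 4, 1, 5, 9, 2]
print(external_histogram(edges, counts, lo=0, hi=63, n_external=4))
```

Because every internal bucket lands in exactly one external bucket under this rule, the external totals always add up to the number of populated elements between the minimum and the maximum. An internal bucket that straddles an external boundary is counted wholly on one side, which is an approximation inherent to this sketch.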
Abstract
A method for creating a histogram from a plurality of data elements that includes specifying a plurality of internal buckets, wherein each internal bucket of the plurality of internal buckets represents values between an internal minimum value and an internal maximum value, wherein a plurality of differences of the internal minimum value and the internal maximum value of each internal bucket are heterogeneous, populating the plurality of internal buckets with the plurality of data elements based on the internal minimum value and the internal maximum value of each internal bucket to obtain a plurality of populated internal buckets, and outputting the histogram from the plurality of populated internal buckets.
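As an illustration of the populating step described in the abstract, the following Python sketch places data elements into internal buckets whose widths differ. The bucket layout (widths of 1, then 10, then 100), the function names `make_edges` and `populate`, and the choice to reject out-of-range values are assumptions for this example and are not taken from the patent.

```python
from bisect import bisect_right


def make_edges():
    """Build hypothetical bucket boundaries with heterogeneous widths."""
    edges = list(range(0, 10))             # width-1 buckets covering [0, 10)
    edges += list(range(10, 100, 10))      # width-10 buckets covering [10, 100)
    edges += list(range(100, 1001, 100))   # width-100 buckets covering [100, 1000)
    return edges


def populate(values, edges):
    """Count each value in the bucket [edges[i], edges[i+1]) that contains it."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if not edges[0] <= v < edges[-1]:
            # Assumption: values outside the bucket range are an error.
            raise ValueError(f"value {v} is outside the bucket range")
        # bisect_right returns the index just past the last edge <= v,
        # so subtracting one yields the bucket whose minimum is <= v.
        counts[bisect_right(edges, v) - 1] += 1
    return counts


edges = make_edges()
counts = populate([0, 5, 9, 10, 15, 99, 100, 250, 999], edges)
print(len(counts), counts)
```

Narrow buckets near zero and wide buckets at the high end keep the number of buckets small while still covering a large range, which is the motivation suggested by the title.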
10 Claims
6. A computer usable storage medium comprising computer readable program code embodied therein for causing a computer system to perform a method, the method comprising:
specifying a plurality of internal buckets, wherein each internal bucket of the plurality of internal buckets represents values between an internal minimum value and an internal maximum value, wherein a plurality of differences of the internal minimum value and the internal maximum value of each internal bucket are heterogeneous;
populating the plurality of internal buckets with a plurality of data elements based on the internal minimum value and the internal maximum value of each internal bucket to obtain a plurality of populated internal buckets; and
creating a histogram comprising a plurality of external buckets, wherein each of the plurality of external buckets comprises a total sum of at least two of the plurality of populated internal buckets, wherein creating the histogram comprises:
  identifying a minimum value and a maximum value for the plurality of data elements,
  calculating a range according to the minimum value and the maximum value,
  dividing the range into the plurality of external buckets,
  for each of the plurality of external buckets:
    calculating a starting internal bucket of the plurality of internal buckets and an ending internal bucket of the plurality of internal buckets,
    summing each value of a subset of the plurality of internal buckets between the starting internal bucket and ending internal bucket to create a total sum, and
    outputting the total sum,
  wherein calculating the starting internal bucket and the ending internal bucket is based on a number of the plurality of external buckets and the range,
  wherein the number of the plurality of external buckets is determined from a parameter in a command to create the histogram.
Specification