APPROXIMATE ORDER STATISTICS OF REAL NUMBERS IN GENERIC DATA
First Claim
1. A method for calculating approximate order statistics from a collection of real numbers comprising:
- inserting at least one of the collection of real numbers into a digest stored in a non-transitory memory, wherein the digest includes one or more buckets grouped into one or more levels, wherein a bucket is created for a real number upon insertion into the digest, and wherein the created bucket is added to a level associated with an ordinality of the inserted real number.compressing the digest by collapsing a bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent into the bucket'"'"'s parent when a sum of the counts of the bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent is less than a number of real numbers added to the digest divided by a constant; and
calculating an approximate order statistic for a query value by summing all of the counts of all buckets in the digest having a value less than the query value.
1 Assignment
0 Petitions
Accused Products
Abstract
A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers—all numbers that, after being truncated to that bucket'"'"'s P digits of precision, are equal to that bucket'"'"'s N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket'"'"'s range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.
14 Citations
22 Claims
-
1. A method for calculating approximate order statistics from a collection of real numbers comprising:
-
inserting at least one of the collection of real numbers into a digest stored in a non-transitory memory, wherein the digest includes one or more buckets grouped into one or more levels, wherein a bucket is created for a real number upon insertion into the digest, and wherein the created bucket is added to a level associated with an ordinality of the inserted real number. compressing the digest by collapsing a bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent into the bucket'"'"'s parent when a sum of the counts of the bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent is less than a number of real numbers added to the digest divided by a constant; and calculating an approximate order statistic for a query value by summing all of the counts of all buckets in the digest having a value less than the query value. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. An apparatus for calculating approximate order statistics, comprising:
-
a processor; and a memory storing instructions that when executed by the processor cause actions to be performed, including; inserting at least one of the collection of real numbers into a digest, wherein the digest includes one or more buckets grouped into one or more levels, wherein a bucket is created for a real number upon insertion into the digest, and wherein the created bucket is added to a level associated with an ordinality of the inserted real number. compressing the digest by collapsing a bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent into the bucket'"'"'s parent when a sum of the counts of the bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent is less than a number of real numbers added to the digest divided by a constant; and calculating an approximate order statistic for a query value by summing all of the counts of all buckets in the digest having a value less than the query value. - View Dependent Claims (9, 10, 11, 12, 13)
-
-
14. A non-transitory processor readable storage medium storing instructions that cause a processor to perform actions, comprising:
-
inserting at least one of the collection of real numbers into a digest, wherein the digest includes one or more buckets grouped into one or more levels, wherein a bucket is created for a real number upon insertion into the digest, and wherein the created bucket is added to a level associated with an ordinality of the inserted real number. compressing the digest by collapsing a bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent into the bucket'"'"'s parent when a sum of the counts of the bucket, at least one of the bucket'"'"'s siblings, and the bucket'"'"'s parent is less than a number of real numbers added to the digest divided by a constant; and calculating an approximate order statistic for a query value by summing all of the counts of all buckets in the digest having a value less than the query value. - View Dependent Claims (15, 16, 17, 18, 19, 20)
-
-
21. A system comprising:
-
a first computing device storing a first digest derived from a first collection of real numbers; a second computing device storing a second digest derived from a second collection of real numbers, wherein the first digest and the second digest each include one or more buckets grouped into one or more levels, wherein a bucket is created for a real number the first time that real number is added to the digest, and wherein the created bucket is added to a level associated with an ordinality of the inserted real number; and a third computing device configured to perform actions comprising; receiving the first digest and the second digest; merging the first digest and the second digest into a merged digest; and calculating an approximate order statistic for a query value by summing all of the counts of all buckets in the merged digest having a value less than the query value. - View Dependent Claims (22)
-
Specification