ESTIMATION OF UNIQUE DATABASE VALUES
First Claim
1. A method for estimating unique values in a database, comprising:
- providing a data field in the database, the data field comprising a plurality of information values;
partitioning the data field into a plurality of intervals, each interval comprising a range of information values from the plurality of information values;
computing an interval specific Bloom filter for each of the plurality of intervals using a processor;
calculating a binary Bloom filter value for an information value within an interval specific Bloom filter, wherein the binary Bloom filter value represents whether information values are unique; and
determining a number of unique values in the database based on calculated binary Bloom filter values.
2 Assignments
0 Petitions
Accused Products
Abstract
Estimation of unique values in a database can be performed where a data field having multiple information values is provided in the database. The data field can be partitioned into multiple intervals such that each interval includes a range of information values. An interval specific Bloom filter can be calculated for each of the multiple intervals. A binary Bloom filter value can be calculated for an information value within an interval specific Bloom filter. The binary Bloom filter value can represent whether the information value is unique. A number of unique values in the database can be determined based on calculated binary Bloom filter values.
-
Citations
15 Claims
-
1. A method for estimating unique values in a database, comprising:
-
providing a data field in the database, the data field comprising a plurality of information values; partitioning the data field into a plurality of intervals, each interval comprising a range of information values from the plurality of information values; computing an interval specific Bloom filter for each of the plurality of intervals using a processor; calculating a binary Bloom filter value for an information value within an interval specific Bloom filter, wherein the binary Bloom filter value represents whether information values are unique; and determining a number of unique values in the database based on calculated binary Bloom filter values. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A system for estimating unique values in a database, comprising:
-
a database in communication with the information server and configured to store the information values received over a computing network; a data loader configure load the information values from the computing network to the database to be stored; a data partitioning module configured to partition the information values into intervals; a Bloom filter module configured to compute an interval specific Bloom filter for the intervals; a unique value calculation module configured to calculate a binary Bloom filter value for an information value within an interval specific Bloom filter, wherein the binary Bloom filter value represents whether the information value is unique; and an estimation module configured to estimate a number of unique values in the database based on calculated binary Bloom filter values. - View Dependent Claims (9, 10, 11, 12)
-
-
13. A method for estimating unique values in a database, comprising:
-
storing a data field in the database, the data field comprising a plurality of information values; partitioning the data field into a plurality of intervals, each interval comprising a range of information values from the plurality of information values; applying a hash function on an information value in an interval; evaluating whether a result of the hash function returns a value of zero; determining whether the information value when subjected to a Bloom filter is a unique value based on the result of the hash function returning a value of one; calculating a false positive probability of the information value being a unique value. - View Dependent Claims (14, 15)
-
Specification