CARDINALITY ESTIMATION IN DATABASE SYSTEMS USING SAMPLE VIEWS
Abstract
A system and method that facilitate and effectuate estimating the result of performing a data analysis operation on a set of data. Employing an approximation of the data analysis operation on a statistically valid random sample view of the data allows for a statistically accurate estimate of the result to be obtained. Sequential sampling in the view enables the approximated operation to evaluate accuracy conditions at intervals during the scan of the sample view and obtain the estimated result without having to scan the entire sample view. Feedback regarding the accuracy of the estimated result can be captured when the data analysis operation is performed against the set of data. Process control techniques can be employed with the feedback to maintain the statistical validity of the sample view.
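The sequential-sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `estimate_cardinality`, its parameters, and the stopping rule (a normal-approximation confidence interval on the selectivity) are all assumptions made for illustration, since this section does not state the actual accuracy condition. The sketch assumes the sample view is stored in random order, so that any scanned prefix is itself a random sample.

```python
import math
import random


def estimate_cardinality(sample_view, predicate, table_rows,
                         check_every=100, rel_error=0.1, z=1.96):
    """Estimate how many of `table_rows` rows satisfy `predicate`.

    Hypothetical sketch: scans a randomly ordered sample view and, every
    `check_every` rows, tests an assumed accuracy condition (confidence
    interval half-width no larger than `rel_error` times the estimate).
    Returns (estimated_row_count, rows_scanned).
    """
    hits = 0
    scanned = 0
    for row in sample_view:
        scanned += 1
        if predicate(row):
            hits += 1
        # Evaluate the accuracy condition only at intervals during the scan.
        if scanned % check_every == 0 and hits > 0:
            p = hits / scanned
            half_width = z * math.sqrt(p * (1 - p) / scanned)
            if half_width <= rel_error * p:
                break  # accurate enough: skip the rest of the sample view
    selectivity = hits / scanned if scanned else 0.0
    return selectivity * table_rows, scanned


# Usage: a 10,000-row random sample standing in for a view over 100,000 rows.
sample = random.Random(0).sample(range(100_000), 10_000)
estimate, rows_scanned = estimate_cardinality(
    sample, lambda x: x % 4 == 0, table_rows=100_000)
print(f"estimated rows: {estimate:.0f}, sample rows scanned: {rows_scanned}")
```

Checking the condition repeatedly during the scan slightly weakens the nominal confidence level, so a production design would likely use a stopping rule built for sequential tests; the sketch ignores this for brevity.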
20 Claims
1. A system for estimating the results of a data analysis operation, comprising:
- a sample view component that creates one or more sample views representing data on which the data analysis operation is intended to be performed, the one or more sample views contain a random sample of the data; and
- an estimation component that performs an approximation of the data analysis operation on the one or more sample views to produce an estimated result of performing the data analysis operation on the data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method for estimating the results of a data analysis operation, comprising:
- creating one or more sample views representing data on which the data analysis operation is intended to be performed, the one or more sample views contain a random sample of the data; and
- performing an approximation of the data analysis operation on the one or more sample views to produce an estimated result of performing the data analysis operation on the data.
Dependent claims: 13, 14, 15, 16, 17, 18
19. A system for estimating the results of a data analysis operation, comprising:
- means for creating one or more sample views representing data on which the data analysis operation is intended to be performed, the one or more sample views contain a random sample of the data; and
- means for performing an approximation of the data analysis operation on the one or more sample views to produce an estimated result of performing the data analysis operation on the data.
Dependent claims: 20
Specification