Automatic Consistent Sampling For Data Analysis
First Claim
1. A computer-implemented method of analyzing data within one or more databases, comprising:
selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values;
applying a function to each data value in each database object within the one or more databases, wherein the function produces function values limited to a predetermined range;
identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set; and
analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
Abstract
A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
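The sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the function is a SHA-256 hash reduced modulo a bucket count, and the names `bucket`, `consistent_sample`, `num_buckets`, and `target` are hypothetical, chosen only for this example. Values are converted to strings before hashing, which is also an assumption.

```python
import hashlib


def bucket(value, num_buckets=100):
    """Map a value deterministically to an integer in [0, num_buckets)."""
    # Assumption: values are compared by their string form.
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def consistent_sample(values, num_buckets=100, target=0):
    """Keep only the distinct values whose bucket equals the chosen target."""
    return {v for v in values if bucket(v, num_buckets) == target}


# Two columns, possibly from different databases, sampled with the same rule.
column_a = range(0, 1000)
column_b = range(500, 1500)
sample_a = consistent_sample(column_a, num_buckets=10)
sample_b = consistent_sample(column_b, num_buckets=10)
```

Because the decision to keep a value depends only on the value itself, a value shared by both columns is either in both samples or in neither. This is what allows the samples to be compared across database objects.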
7 Claims
1. A computer-implemented method of analyzing data within one or more databases, comprising:
selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values;
applying a function to each data value in each database object within the one or more databases, wherein the function produces function values limited to a predetermined range;
identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set; and
analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
2. The method of claim 1, wherein said analysis further comprises determining one or more primary key-foreign key relationships between the database objects within and across the one or more databases.
6. The method of claim wherein the function is a hash function.
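As a rough illustration of how a hash-based sample might be used to look for primary key-foreign key relationships, the sketch below checks containment between sampled columns. It is one possible reading of the claims, not the method the patent discloses. The helper names `in_sample` and `candidate_foreign_keys`, the SHA-256 hash, and the column names in the usage example are all assumptions made for this example.

```python
import hashlib
from itertools import permutations


def in_sample(value, num_buckets=16, target=0):
    """True if the value's hash falls in the target bucket (assumed rule)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets == target


def candidate_foreign_keys(columns, num_buckets=16, target=0):
    """Return (child, parent) column pairs whose samples suggest containment.

    columns maps a column name to an iterable of its values.
    """
    samples = {
        name: {v for v in values if in_sample(v, num_buckets, target)}
        for name, values in columns.items()
    }
    pairs = []
    for child, parent in permutations(samples, 2):
        # A non-empty child sample fully contained in the parent sample
        # makes child -> parent a candidate foreign key reference.
        if samples[child] and samples[child] <= samples[parent]:
            pairs.append((child, parent))
    return pairs


# Hypothetical columns for illustration only.
columns = {
    "customers.id": list(range(1000)),
    "orders.customer_id": [i % 300 for i in range(5000)],
    "products.sku": [f"SKU-{i}" for i in range(1000)],
}
pairs = candidate_foreign_keys(columns)
```

If every value of a child column appears in a parent column, the same holds for their samples, so a true relationship is never missed by the containment test as long as the child sample is non-empty. The reverse does not hold: containment of the samples only suggests a relationship, and a candidate pair would still need to be confirmed against the full data, including a uniqueness check on the parent column.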
Specification