Random sampling as a built-in function for database administration and replication
Abstract
A database management system and method for administration and replication having a built-in random sampling facility for approximate partition analysis on very large databases. The method utilizes a random sampling algorithm that provides results accurate to within a few percentage points for large homogeneous databases. The accuracy is not affected by the size of the database and is determined primarily by the size of the sample. The system and method for approximate partition analysis reduces the time required for an analysis to a fraction of the time required for an exact analysis. The database management system is configured with the random sampling facility built in, thereby enabling even greater efficiency by reducing communication overhead between an analysis program and the database management system to a fraction of the overhead required when sampling is performed by a separate analysis program. The reduction in time thereby permits frequent and timely analyses for replication and administration of database partitions.
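The abstract's statement that accuracy depends on the sample size rather than the database size can be illustrated with the textbook margin-of-error formula for an estimated proportion. The sketch below is not taken from the patent: the function name `margin_of_error`, the 95% z-value of 1.96, and the worst-case proportion of 0.5 are all assumptions chosen for illustration.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96, population=None):
    """Approximate margin of error for a proportion estimated from a simple random sample.

    Hypothetical helper: standard statistics, not the patent's own algorithm.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population is not None and population > 1:
        # Finite population correction; it approaches 1 as the database grows.
        standard_error *= math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error


if __name__ == "__main__":
    # Same sample size, very different database sizes: the error barely moves.
    for population in (10**6, 10**9):
        print(population, margin_of_error(1000, population=population))
    # A larger sample is what tightens the estimate.
    print(margin_of_error(10000))
```

Under these assumptions a sample of 1,000 records gives a margin of roughly three percentage points whether the database holds a million or a billion records, which is consistent with the "few percentage points" figure in the abstract.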
24 Claims
1. A method for administration and replication of a database, comprising the steps of:
providing a database management system with a built-in random sampling facility integrated into said database management system; and
executing said random sampling facility from within the database management system to perform a replication operation on said database.
Dependent claims: 2, 3, 4, 5
6. A method for database administration and replication, comprising the steps of:
providing a database management system with an integrated random sampling facility;
selecting a default sample size value S;
selectively receiving a desired sample size value D and setting said default sample size value S to said desired sample size value D when said desired sample size value D is received;
randomly sampling S records of the database using said random sampling facility;
storing statistics for each of said S records, wherein said statistics include a record key for each record; and
producing at least one of:
an extrapolated replication partition analysis based on said statistics; and
a partial replication partition analysis based on said statistics.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 24
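The steps recited in claim 6 can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the names `DEFAULT_SAMPLE_SIZE`, `sample_keys`, and `extrapolate_partition_sizes`, the in-memory list of records with a `"key"` field, and the use of sorted key boundaries to define partitions are all assumptions.

```python
import bisect
import random
from collections import Counter

DEFAULT_SAMPLE_SIZE = 1000  # hypothetical default value for S


def sample_keys(records, desired_size=None, default_size=DEFAULT_SAMPLE_SIZE, rng=None):
    """Randomly sample S records and keep each sampled record's key.

    S is the default size unless a desired size D is supplied.
    """
    rng = rng or random.Random()
    size = default_size if desired_size is None else desired_size
    size = min(size, len(records))  # cannot sample more records than exist
    return [record["key"] for record in rng.sample(records, size)]


def extrapolate_partition_sizes(keys, total_records, boundaries):
    """Scale per-partition sample counts up to the full database.

    `boundaries` is a sorted list of split keys; partition i holds keys k
    with boundaries[i-1] <= k < boundaries[i].
    """
    if not keys:
        return [0] * (len(boundaries) + 1)
    counts = Counter(bisect.bisect_right(boundaries, key) for key in keys)
    scale = total_records / len(keys)
    return [round(counts.get(i, 0) * scale) for i in range(len(boundaries) + 1)]


if __name__ == "__main__":
    database = [{"key": i} for i in range(10_000)]
    keys = sample_keys(database, desired_size=500, rng=random.Random(0))
    print(extrapolate_partition_sizes(keys, len(database), [2500, 5000, 7500]))
```

Only the sampled keys are retained here; a real facility would presumably store further per-record statistics, which the claim leaves open.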
15. A database management system (DBMS) for managing an associated database, the DBMS comprising:
a random sampling facility integrated with the database management system;
first database analysis tools using said integrated random sampling facility for generating extrapolated reports on database content;
second database analysis tools using said integrated random sampling facility for generating extrapolated reports on database size; and
database replication tools adapted to execute at least one of a complete replication having output partition sizes determined by extrapolating a random sample of said database, and a partial replication in which the data stored in the partial replication comprises a random sample of said database.
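The partial replication described in claim 15, in which the replica itself is a random sample of the database, might be sketched as below. The claim does not name a sampling algorithm; this example substitutes reservoir sampling (Algorithm R), a standard one-pass technique, and the function name `partial_replica` is hypothetical.

```python
import random


def partial_replica(records, sample_size, rng=None):
    """Return a uniform random sample of `records` in a single pass.

    Uses reservoir sampling (Algorithm R), an assumed stand-in for whatever
    the DBMS's built-in facility actually does.
    """
    rng = rng or random.Random()
    reservoir = []
    for index, record in enumerate(records):
        if index < sample_size:
            # Fill the reservoir with the first `sample_size` records.
            reservoir.append(record)
        else:
            # Replace an existing entry with probability sample_size / (index + 1).
            slot = rng.randint(0, index)  # randint is inclusive on both ends
            if slot < sample_size:
                reservoir[slot] = record
    return reservoir


if __name__ == "__main__":
    source = ({"key": i} for i in range(100_000))  # streamed, never fully in memory
    replica = partial_replica(source, 1_000, rng=random.Random(42))
    print(len(replica))
```

A single-pass method is used here because it works on a scan of unknown length, which suits a sampler running inside the database engine; that design choice is this sketch's, not the claim's.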
Specification