System and method for selective incremental deferred constraint processing after bulk loading data
First Claim
1. In a database management system for managing a database containing data comprising:
- storage for storing data in said database;
an instruction processor for processing instructions for managing data stored in said database;
a constraint manager for managing constraints on said data stored in said database;
a method for deferred checking of data after bulk loading into said database for violation of at least one constraint comprising;
determining an appropriate procedure for constraint checking comprising;
determining whether constraint checking is required;
and if it is required, determining whether to implement full, or incremental checking for compliance with said at least one constraint; and
checking said data in accordance with said determined constraint checking procedure.
1 Assignment
0 Petitions
Abstract
The invention provides method and apparatus for use in a database management system for managing a database containing data, where the database has storage for storing data in the database, and has an instruction processor for processing instructions for managing data stored in the database. The database has a constraint manager for managing constraints on said data stored in the database. The invention provides efficient method and means for deferred checking of data after bulk loading into said database for violation of constraints by determining an appropriate procedure for constraint checking by determining whether constraint checking is required; and if it is required, determining whether to implement full, or incremental checking for compliance with said constraints; and then checking the data in the database in accordance with the determined constraint checking procedure.
193 Citations
63 Claims
-
1. In a database management system for managing a database containing data comprising:
-
storage for storing data in said database;
an instruction processor for processing instructions for managing data stored in said database;
a constraint manager for managing constraints on said data stored in said database;
a method for deferred checking of data after bulk loading into said database for violation of at least one constraint comprising;
determining an appropriate procedure for constraint checking comprising;
determining whether constraint checking is required;
and if it is required, determining whether to implement full, or incremental checking for compliance with said at least one constraint; and
checking said data in accordance with said determined constraint checking procedure. - View Dependent Claims (2, 3, 39)
placing said database in a check pending state before checking;
determining an appropriate procedure for constraint checking as claimed in claim 1, checking said data in accordance with said determined constraint checking procedure, bringing said database out of said check pending state after checking said data.
-
-
3. An article of manufacture for use in a computer system comprising a computer readable medium for storing statements or instructions for use in execution in a computer in accordance with the method of claim 1.
-
39. The article of claim 38 wherein said program code routines embodied in said computer usable medium for causing the computer to effect deferred checking of data after bulk loading into said database are adapted to check for violation of at least one constraint, comprising routines for:
-
placing said database in a check pending state before checking;
determining an appropriate procedure for constraint checking as claimed in claim 1, checking said data in accordance with said determined constraint checking procedure, bringing said database out of said check pending state after checking said data.
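The check pending lifecycle recited above (place in check pending, check, bring out of check pending) might be sketched as follows. This is only an illustrative sketch: the `Database` class, the `deferred_check` function, and the `violates` callback are hypothetical names that do not come from the claims, and the sketch always performs a full scan rather than choosing a procedure.

```python
from typing import Callable, List


class Database:
    """Toy stand-in for a database; it only tracks rows and a check pending flag."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows
        self.check_pending = False


def deferred_check(db: Database, violates: Callable[[int], bool]) -> List[int]:
    """Run deferred checking and return the rows that violate the constraint."""
    db.check_pending = True  # placed in check pending state before checking
    # Checking step; a full scan is assumed here for simplicity.
    violations = [row for row in db.rows if violates(row)]
    db.check_pending = False  # brought out of check pending state after checking
    return violations
```

A caller could run `deferred_check(Database([1, -2, 3]), lambda r: r < 0)` to collect the violating rows while the flag is toggled around the check.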
-
-
4. In a relational database management system for managing a database containing data comprising:
-
memory storage for storing at least one table having a plurality of data records;
each of said data records being uniquely identifiable;
an instruction processor for processing instructions for managing said data records in said database;
a constraint manager for managing constraints, including table constraints and referential integrity constraints, on said data records in said database;
a method for deferred checking of data in a table after bulk loading into said database for violation of said constraints comprising;
placing said table into a check pending state before checking;
determining an appropriate procedure for constraint checking comprising;
determining whether constraint checking is required; and
if it is required, determining whether to implement full, or incremental checking for compliance with said constraints;
wherein said full checking comprises checking all data in a table including any preexisting data and bulk loaded data;
and wherein incremental checking comprises checking only that data in a table which was appended to said table in bulk loading;
checking said table in accordance with said determined constraint checking procedure; and
bringing said table out of check pending state when constraint checking determines that no violation is present. - View Dependent Claims (5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37)
(a) recording load operation information (Load.Opr) including, whether bulk loaded data was appended to said table, or whether all data in said table, including data previously present in said table and new data, was replaced by bulk data; and
(b) setting selected constraint flags of said database management system;
said selected constraint flags being set when constraints require checking and are not set when constraints have either previously been checked or otherwise do not require checking;
wherein said determining of said appropriate procedure for constraint checking for a table includes using the following information;
said recorded load operation information and whether said selected constraint flags of said database management system are set.
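One way to read the determination described above is as a small decision function over the recorded load operation and the constraint flags. The sketch below is an assumption-laden illustration, not the claimed implementation: `LoadOpr`, `ConstraintFlags`, and `choose_procedure` are hypothetical names, and the three flag fields are borrowed from the flags named in claim 6.

```python
from dataclasses import dataclass
from enum import Enum


class LoadOpr(Enum):
    APPEND = "append"    # bulk data was appended to the table
    REPLACE = "replace"  # all data in the table was replaced


@dataclass
class ConstraintFlags:
    foreign_key: bool = False  # referential integrity constraints need checking
    table_check: bool = False  # table check constraints need checking
    forced_full: bool = False  # full checking has been forced

    def any_set(self) -> bool:
        return self.foreign_key or self.table_check or self.forced_full


def choose_procedure(load_opr: LoadOpr, flags: ConstraintFlags) -> str:
    """Return 'none', 'full', or 'incremental' for one table."""
    if not flags.any_set():
        return "none"  # nothing is flagged as requiring checking
    if flags.forced_full or load_opr is LoadOpr.REPLACE:
        return "full"  # all data in the table must be checked
    return "incremental"  # only the appended data must be checked
```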
-
-
6. The method of claim 5 wherein said selected constraint flags comprise:
at least one of a referential integrity constraint (foreign key) flag, a table check constraint flag, and a forced full constraint checking flag is set for said table.
-
7. The method of claim 6 wherein, in the process of a sequential series of bulk data load operations on a table, checking is deferred until all loading of data has occurred, wherein said recorded load operation information on a table is set to the most expensive load operation of said series of bulk data load operations performed on said table.
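Recording the "most expensive" operation of a series of loads can be illustrated by ordering the load operations and keeping the maximum. The ordering used below (no load, then append, then replace) is an assumption based on replace requiring full checking; `LoadOpr` and `record_series` are hypothetical names.

```python
from enum import IntEnum
from typing import Iterable


class LoadOpr(IntEnum):
    # Assumed cost ordering: a replace forces full checking, so it ranks highest.
    NONE = 0
    APPEND = 1
    REPLACE = 2


def record_series(ops: Iterable[LoadOpr]) -> LoadOpr:
    """Return the load operation to record after a series of deferred loads."""
    recorded = LoadOpr.NONE
    for op in ops:
        recorded = max(recorded, op)  # keep the most expensive operation so far
    return recorded
```

With this ordering, a series of appends followed by a single replace is recorded as a replace, so the later deferred check treats the whole table as unchecked.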
-
8. The method of claim 6 for deferred constraint checking of a table implemented in table partitions across a database, comprising:
-
detecting if any flag indicates that a type of constraint has not already been checked is set, and if set checking said table for violation of said constraint;
detecting if a forced full flag is set, and if set checking all data records of said table for constraint violations;
detecting if data records in any table partition have been replaced, and if so checking all data records of said table for constraint violations; and
detecting if any data has been inserted into a table partition without the replacement of data in any table partitions, and if so inserted without replacement checking for constraint violation by incremental checking.
-
-
9. The method of claim 8 comprising:
checking whether any data was loaded in bulk loading, and if no data was loaded then preventing constraint checking.
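The detection steps of claims 8 and 9 for a partitioned table could be sketched as a sequence of tests. The precedence order below, and the reading that an unchecked constraint type leads to checking the whole table, are assumptions made for illustration; `PartitionLoad` and `choose_for_partitioned_table` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PartitionLoad:
    rows_loaded: int = 0   # number of records bulk loaded into this partition
    replaced: bool = False  # True if the load replaced the partition's data


def choose_for_partitioned_table(
    partitions: Sequence[PartitionLoad],
    unchecked_constraint: bool = False,
    forced_full: bool = False,
) -> str:
    """Return 'none', 'full', or 'incremental' for a partitioned table."""
    if unchecked_constraint:
        return "full"  # a constraint type has never been checked (assumed full scan)
    if forced_full:
        return "full"  # forced full flag is set
    if any(p.replaced for p in partitions):
        return "full"  # data was replaced in at least one partition
    if any(p.rows_loaded > 0 for p in partitions):
        return "incremental"  # data was only inserted, nothing replaced
    return "none"  # no data was loaded, so checking is skipped
```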
-
10. The method of claim 6 wherein said instruction processor comprises a query processor for processing queries.
-
11. The method of claim 10 wherein said instruction processor comprises a query compiler for compiling queries and a query processor for processing queries in said database.
-
12. The method of claim 5 for deferred constraint checking of a table implemented in table partitions across a database, comprising:
-
detecting if any flag indicates that a type of constraint has not already been checked is set, and if set checking said table for violation of said constraint;
detecting if a forced full flag is set, and if set checking all data records of said table for constraint violations;
detecting if data records in any table partition have been replaced, and if so checking all data records of said table for constraint violations; and
detecting if any data has been inserted into a table partition without the replacement of data in any table partitions, and if so inserted without replacement checking for constraint violation by incremental checking.
-
-
13. The method of claim 12 comprising:
checking whether any data was loaded in bulk loading, and if no data was loaded then preventing constraint checking.
-
14. The method of claim 12 wherein a plurality of tables require checking, namely a parent table and at least one descendent table;
-
wherein if full constraint checking procedure is chosen for checking a parent table using a set constraint checking command listing parent and descendent tables said method comprises;
selecting full constraint checking procedure for said parent and descendent tables listed in said set constraint command, and for any descendent tables not listed in said set constraint command, marking a force-full flag to indicate that full processing will be required subsequently for said descendent tables in order to complete constraint checking before bringing said descendent tables out of check pending state.
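The handling of descendent tables in claim 14 can be illustrated with a small planning function: listed tables are fully checked now, and descendants left off the list are marked force-full for later. The table names, the `children` mapping, and both function names below are hypothetical and serve only as a sketch.

```python
from typing import Dict, List, Set, Tuple


def descendants(table: str, children: Dict[str, List[str]]) -> Set[str]:
    """Collect every table reachable from `table` through child relationships."""
    found: Set[str] = set()
    stack = list(children.get(table, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))
    return found


def plan_full_check(
    parent: str, listed: Set[str], children: Dict[str, List[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return (tables fully checked now, descendants marked force-full)."""
    desc = descendants(parent, children)
    full_now = {parent} | (desc & listed)  # parent and listed descendants
    force_full = desc - listed             # unlisted descendants stay check pending
    return full_now, force_full
```

For example, with `children = {"orders": ["order_items", "shipments"]}` and only `orders` and `order_items` listed in the command, `shipments` would be marked force-full and remain in check pending state until it is checked in full.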
-
-
15. An article of manufacture for use in a computer system comprising a computer readable medium for storing statements or instructions for use in execution in a computer in accordance with the method of claim 12.
-
16. An article of manufacture for use in a computer system comprising a computer readable medium for storing statements or instructions for use in execution in a computer in accordance with the method of claim 5.
-
17. The method of claim 5 wherein newly loaded data is stored contiguously in a table during bulk loading and wherein each data record of said data is uniquely identified with a record identifier (RID), a starting RID being assigned to the first record of data bulk loaded.
-
18. The method of claim 17 wherein for newly loaded data, in the case of append mode loading being used, comprising:
-
storing only the starting RID of said appended portion of said data on said table, said newly loaded data being appended to the end of previously stored data in said table.
-
-
19. The method of claim 5 wherein said database comprises a relational database having a plurality of nodes;
- wherein a table in said database is partitioned into partitions in said database, newly loaded data is stored contiguously in a table partition wherein during bulk loading, in the case of appended inserted mode loading being used, storing only the starting RID of said appended portion of said data on said table partition, said newly loaded data being appended to the end of previously stored data in said table partition.
-
20. The method of claim 5 wherein said instruction processor comprises a query processor for processing queries.
-
21. The method of claim 20 wherein said instruction processor comprises a query compiler for compiling queries and a query processor for processing queries in said database.
-
22. An article of manufacture for use in a computer system comprising a computer readable medium for storing statements or instructions for use in execution in a computer in accordance with the method of claim 5.
-
23. The method of claim 4 wherein newly loaded data is stored contiguously in a table during bulk loading and wherein each data record of said data is uniquely identified with a record identifier (RID), a starting RID being assigned to the first record of data bulk loaded.
-
24. The method of claim 23 wherein for newly loaded data, in the case of append mode loading being used, comprising:
-
storing only the starting RID of said appended portion of said data on said table;
said newly loaded data being appended to the end of previously stored data in said table.
-
-
25. The method of claim 4 wherein said database comprises a relational database having a plurality of nodes;
- wherein a table in said database is partitioned into partitions in said database, newly loaded data is stored contiguously in a table partition wherein during bulk loading, in the case of appended inserted mode loading being used, storing only the starting RID of said appended portion of said data on said table partition, said newly loaded data being appended to the end of previously stored data in said table partition.
-
26. The method of claim 25 if incremental checking procedure is selected for constraint checking applying a RIDGE filter to select only newly appended data for checking.
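The RIDGE filter can be pictured as a predicate that keeps only records whose RID is at or beyond the stored starting RID of the appended portion in each partition. Reading RIDGE as "RID greater than or equal" is an assumption, since the claims do not expand the term; the record layout and the `ridge_filter` name below are likewise hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record layout: (partition number, RID, row payload).
Record = Tuple[int, int, dict]


def ridge_filter(records: Iterable[Record], start_rid: Dict[int, int]) -> Iterator[Record]:
    """Yield only records appended by the bulk load.

    `start_rid` maps a partition number to the starting RID stored for the
    appended portion; partitions without an entry received no appended data.
    """
    for partition, rid, row in records:
        start = start_rid.get(partition)
        if start is not None and rid >= start:
            yield (partition, rid, row)
```

Because appended data is stored contiguously after the existing data, a single starting RID per partition is enough to separate new records from previously checked ones.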
-
27. The method of claim 26 wherein said instruction processor of said database management system includes an instruction optimizer comprising:
setting said RIDGE filter as optional in said database management system, using instruction optimizer to select an optimum processing procedure for accessing desired data records of said table to be checked for constraint violation.
-
28. The article of manufacture of claim 26 wherein said instruction processor comprises a query processor for processing queries.
-
29. The article of manufacture of claim 28 wherein said instruction processor comprises a query compiler for compiling queries and a query processor for processing queries in said database.
-
30. The method of claim 27 comprising:
-
recording the amount of data such as a page count of data loaded;
using said recorded amount by said query optimizer in decision making to determine an optimum processing procedure for checking constraint violation.
-
-
31. The article of manufacture of claim 30 wherein said instruction processor comprises a query processor for processing queries.
-
32. The article of manufacture of claim 31 wherein said instruction processor comprises a query compiler for compiling queries and a query processor for processing queries in said database.
-
33. The article of manufacture of claim 27 wherein said instruction processor comprises a query processor for processing queries.
-
34. The article of manufacture of claim 33 wherein said instruction processor comprises a query compiler for compiling queries and a query processor for processing queries in said database.
-
35. The method of claim 4 wherein said instruction processor comprises a query processor for processing queries.
-
36. The method of claim 35 wherein said instruction processor comprises a query compiler for compiling queries and a query processor for processing queries in said database.
-
37. An article of manufacture for use in a computer system comprising a computer readable medium for storing statements or instructions for use in execution in a computer in accordance with the method of claim 4.
-
38. An article of manufacture for use in a computer system in a database management system for managing a database containing data having:
-
storage for storing data in said database;
an instruction processor for processing instructions for managing data stored in said database;
a constraint manager for managing constraints on said data stored in said database;
said article of manufacture comprising a computer usable medium having computer readable program code routines embodied in said medium including;
program code routines embodied in said computer usable medium for causing the computer to effect deferred checking of data after bulk loading into said database for violation of at least one constraint comprising;
determining an appropriate procedure for constraint checking comprising;
determining whether constraint checking is required;
and if it is required, determining whether to implement full, or incremental checking for compliance with said at least one constraint; and
checking said data in accordance with said determined constraint checking procedure.
-
-
40. For a relational database management system in a computer system for managing a database containing data having:
-
memory storage for storing at least one table having a plurality of data records;
each of said data records being uniquely identifiable;
an instruction processor for processing instructions for managing said data records in said database;
a constraint manager for managing constraints, including table constraints and referential integrity constraints, on said data records in said database;
an article of manufacture comprising a computer usable medium having computer readable program code means embodied in said medium including program code routines embodied in said computer usable medium for causing the computer to effect deferred checking of data in a table after bulk loading into said database for violation of said constraints, comprising routines for;
placing said table into a check pending state before checking;
determining an appropriate procedure for constraint checking comprising;
determining whether constraint checking is required;
and if it is required, determining whether to implement full, or incremental checking for compliance with said constraints;
wherein said full checking comprises checking all data in a table including any preexisting data and bulk loaded data;
and wherein incremental checking comprises checking only that data in a table which was appended to said table in bulk loading;
checking said table in accordance with said determined constraint checking procedure; and
after checking said table, bringing said table out of said check pending state into a normal state if no constraint violation is found. - View Dependent Claims (41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63)
(a) recording load operation information (Load.Opr) including, whether bulk loaded data was appended to said table, or whether all data in said table, including data previously present in said table and new data, was replaced by bulk data; and
(b) setting selected constraint flags of said database management system;
said selected constraint flags being set when constraints require checking and are not set when constraints have either previously been checked or otherwise do not require checking;
wherein said determining of said appropriate procedure for constraint checking for a table includes using the following information;
said recorded load operation information and whether said selected constraint flags of said database management system are set.
-
-
42. The article of manufacture of claim 41 wherein said selected constraint flags comprise:
at least one of a referential integrity constraint (foreign key) flag, a table check constraint flag, and a forced full constraint checking flag is set for said table.
-
43. The article of manufacture of claim 42 wherein, in the process of a sequential series of bulk data load operations on a table, said computer program code routines are adapted so that checking is deferred until all loading of data has occurred, wherein said recorded load operation information on a table is set to the most expensive load operation of said series of bulk data load operations performed on said table.
-
44. The article of manufacture of claim 42 for deferred constraint checking of a table implemented in table partitions across a database, comprising computer program code routines for:
-
detecting if any flag indicates that a type of constraint has not already been checked is set, and if set checking said table for violation of said constraint;
detecting if a forced full flag is set, and if set checking all data records of said table for constraint violations;
detecting if data records in any table partition have been replaced, and if so checking all data records of said table for constraint violations; and
detecting if any data has been inserted into a table partition without the replacement of data in any table partitions, and if so inserted without replacement checking for constraint violation by incremental checking.
-
-
45. The article of manufacture of claim 44 comprising computer program code routines for:
checking whether any data was loaded in bulk loading, and if no data was loaded then preventing constraint checking.
-
46. The article of manufacture of claim 42 wherein a plurality of tables require checking, namely a parent table and at least one descendent table comprising computer program code routines for:
-
if full constraint checking procedure is chosen for checking a parent table using a set constraint checking command listing parent and descendent tables said method comprises;
selecting full constraint checking procedure for said parent and descendent tables listed in said set constraint command, and for any descendent tables not listed in said set constraint command, marking a force-full flag to indicate that full processing will be required subsequently for said descendent tables in order to complete constraint checking before bringing said descendent tables out of check pending state.
-
-
47. The article of manufacture of claim 41 for deferred constraint checking of a table implemented in table partitions across a database, comprising computer program code routines for:
-
detecting if any flag indicates that a type of constraint has not already been checked is set, and if set checking said table for violation of said constraint;
detecting if a forced full flag is set, and if set checking all data records of said table for constraint violations;
detecting if data records in any table partition have been replaced, and if so checking all data records of said table for constraint violations; and
detecting if any data has been inserted into a table partition without the replacement of data in any table partitions, and if so inserted without replacement checking for constraint violation by incremental checking.
-
-
48. The article of manufacture of claim 47 comprising computer program code routines for:
checking whether any data was loaded in bulk loading, and if no data was loaded then preventing constraint checking.
-
49. The article of manufacture of claim 41 wherein a plurality of tables require checking, namely a parent table and at least one descendent table comprising computer program code routines for:
-
if full constraint checking procedure is chosen for checking a parent table using a set constraint checking command listing parent and descendent tables said method comprises;
selecting full constraint checking procedure for said parent and descendent tables listed in said set constraint command, and for any descendent tables not listed in said set constraint command, marking a force-full flag to indicate that full processing will be required subsequently for said descendent tables in order to complete constraint checking before bringing said descendent tables out of check pending state.
-
-
50. The article of manufacture of claim 41 wherein said computer program code routines are adapted to store newly loaded data contiguously in a table during bulk loading and wherein each data record of said data is uniquely identified with a record identifier (RID), a starting RID being assigned to the first record of data bulk loaded.
-
51. The article of manufacture of claim 50 wherein for newly loaded data, in the case of append mode loading being used, comprising computer program code routines for:
-
storing only the starting RID of said appended portion of said data on said table;
said newly loaded data being appended to the end of previously stored data in said table.
-
-
52. The article of manufacture of claim 41 wherein said database comprises a relational database having a plurality of nodes;
- wherein a table in said database is partitioned into partitions in said database, newly loaded data is stored contiguously in a table partition wherein said computer program code routines are adapted so that during bulk loading, in the case of appended inserted mode loading being used, storing only the starting RID of said appended portion of said data on said table partition, said newly loaded data being appended to the end of previously stored data in said table partition.
-
53. The article of manufacture of claim 52 if incremental checking procedure is selected for constraint checking said computer program code routines are adapted for applying a RIDGE filter to select only newly appended data for checking.
-
54. The article of manufacture of claim 53 wherein said instruction processor of said database management system includes an instruction optimizer comprising:
setting said RIDGE filter as optional in said database management system, and said computer program code routines are adapted to use said instruction optimizer to select an optimum processing procedure for accessing desired data records of said table to be checked for constraint violation.
-
55. The article of manufacture of claim 54 comprising computer program code routines for:
-
recording the amount of data such as a page count of data loaded;
using said recorded amount by said query optimizer in decision making to determine an optimum processing procedure for checking constraint violation.
-
-
56. The article of manufacture of claim 52 wherein a plurality of tables require checking, namely a parent table and at least one descendent table comprising computer program code routines for:
-
if full constraint checking procedure is chosen for checking a parent table using a set constraint checking command listing parent and descendent tables said method comprises;
selecting full constraint checking procedure for said parent and descendent tables listed in said set constraint command, and for any descendent tables not listed in said set constraint command, marking a force-full flag to indicate that full processing will be required subsequently for said descendent tables in order to complete constraint checking before bringing said descendent tables out of check pending state.
-
-
57. The article of manufacture of claim 40 wherein said computer program code routines are adapted to store newly loaded data contiguously in a table during bulk loading and wherein each data record of said data is uniquely identified with a record identifier (RID), a starting RID being assigned to the first record of data bulk loaded.
-
58. The article of manufacture of claim 57 wherein for newly loaded data, in the case of append mode loading being used, comprising computer program code routines for:
-
storing only the starting RID of said appended portion of said data on said table;
said newly loaded data being appended to the end of previously stored data in said table.
-
-
59. The article of manufacture of claim 40 wherein said database comprises a relational database having a plurality of nodes;
- wherein a table in said database is partitioned into partitions in said database, newly loaded data is stored contiguously in a table partition wherein said computer program code routines are adapted so that during bulk loading, in the case of appended inserted mode loading being used, storing only the starting RID of said appended portion of said data on said table partition, said newly loaded data being appended to the end of previously stored data in said table partition.
-
60. The article of manufacture of claim 59 if incremental checking procedure is selected for constraint checking said computer program code routines are adapted for applying a RIDGE filter to select only newly appended data for checking.
-
61. The article of manufacture of claim 60 wherein said instruction processor of said database management system includes an instruction optimizer comprising:
setting said RIDGE filter as optional in said database management system, and said computer program code routines are adapted to use said instruction optimizer to select an optimum processing procedure for accessing desired data records of said table to be checked for constraint violation.
-
62. The article of manufacture of claim 61 comprising computer program code routines for:
-
recording the amount of data such as a page count of data loaded;
using said recorded amount by said query optimizer in decision making to determine an optimum processing procedure for checking constraint violation.
-
-
63. The article of manufacture of claim 59 wherein a plurality of tables require checking, namely a parent table and at least one descendent table comprising computer program code routines for:
-
if full constraint checking procedure is chosen for checking a parent table using a set constraint checking command listing parent and descendent tables said method comprises;
selecting full constraint checking procedure for said parent and descendent tables listed in said set constraint command, and for any descendent tables not listed in said set constraint command, marking a force-full flag to indicate that full processing will be required subsequently for said descendent tables in order to complete constraint checking before bringing said descendent tables out of check pending state.
-
Specification