Method, system, and program for collecting statistics of data stored in a database
First Claim
1. A data processing system implemented method of collecting statistics associated with data stored in a database, the database operatively coupled to a data processing system, the data processing system implemented method comprising:
assembling a list of tables of said data, said tables being scheduled for periodic automatic statistics collection;
for each of the tables, determining a likelihood that currently computed statistics associated with the data in each of the tables have changed;
removing from the list tables for which said determined likelihood is low;
collecting updated statistics for the data in tables which remain in said list after said removing; and
updating the scheduled periodic automatic statistics collection for each of the tables, said periodic automatic statistics collection comprising periodically performing subsequent collections of statistics associated with the data in each of the tables, wherein said updating comprises scheduling, for each of the tables, said subsequent collections of statistics more often or less often based on another likelihood that the updated statistics have changed for each of the tables, wherein said another likelihood that the updated statistics have changed is determined by comparing said updated statistics with said computed statistics.
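The steps of claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `TableEntry`, `relative_change`, and `run_cycle`, the 0.2 threshold, and the rule of halving or doubling the collection interval are all hypothetical choices made here. The claim itself only requires that low-likelihood tables be dropped from the list and that later collections be scheduled more or less often based on a comparison of the updated statistics with the previously computed ones.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Stats = Dict[str, float]


@dataclass
class TableEntry:
    name: str
    stats: Stats                   # currently computed statistics
    interval_hours: float = 24.0   # current collection period (assumed unit)


def relative_change(old: Stats, new: Stats) -> float:
    """Largest relative difference over statistics present in both snapshots."""
    keys = old.keys() & new.keys()
    if not keys:
        return 1.0  # nothing comparable: treat the statistics as changed
    return max(abs(new[k] - old[k]) / max(abs(old[k]), 1e-9) for k in keys)


def run_cycle(
    tables: Iterable[TableEntry],
    change_likelihood: Callable[[TableEntry], float],
    collect_stats: Callable[[TableEntry], Stats],
    low: float = 0.2,          # assumed cut-off for a "low" likelihood
    min_hours: float = 1.0,
    max_hours: float = 168.0,
) -> List[str]:
    """Run one collection cycle and return the names of the tables collected."""
    # Assemble the list of tables scheduled for automatic collection.
    candidates = list(tables)
    # Determine the likelihood of change and remove low-likelihood tables.
    remaining = [t for t in candidates if change_likelihood(t) >= low]
    collected = []
    for table in remaining:
        # Collect updated statistics for the tables that remain.
        new_stats = collect_stats(table)
        # Compare updated with previously computed statistics, then reschedule:
        # a large change means collect more often, a small one less often.
        if relative_change(table.stats, new_stats) >= low:
            table.interval_hours = max(min_hours, table.interval_hours / 2)
        else:
            table.interval_hours = min(max_hours, table.interval_hours * 2)
        table.stats = new_stats
        collected.append(table.name)
    return collected
```

In this sketch `change_likelihood` and `collect_stats` are passed in as callables because the claim does not say how either is computed; any estimator that returns a number comparable with `low` would fit.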
Abstract
The present invention relates to collecting statistics automatically for data in a database. There is provided a method for automated statistics collection comprising determining a likelihood that statistics for data have changed; and collecting statistics for data in response to the likelihood. Indicators of the likelihood that statistics have changed may be useful to trigger automated statistics collection. Statistics of tables that change significantly may be collected more often than statistics of tables that are stable. A preferred model is provided to facilitate the collection of statistics that are more relevant: a table is scheduled for collection in accordance with observed patterns of table activity; a table is considered for collection if it meets a threshold level of activity; and a table is sampled to predict whether the statistics to be collected have changed. When collecting statistics, throttling and limiting lock contention can minimize the impact on a database user's response experience.
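The preferred model in the abstract (an activity threshold followed by a sample-based check) might look like the sketch below. Everything specific in it is assumed for illustration: the `udi_count` activity counter (updates, deletes, and inserts since the last collection), the 10% activity threshold, the sample size, the 5% tolerance, and the use of a column mean as the statistic being checked. The abstract names none of these.

```python
import random
from statistics import mean
from typing import Optional, Sequence


def meets_activity_threshold(
    udi_count: int, row_count: int, threshold: float = 0.10
) -> bool:
    """True if changes since the last collection reach `threshold` of the rows."""
    if row_count == 0:
        return udi_count > 0
    return udi_count / row_count >= threshold


def sample_suggests_change(
    values: Sequence[float],
    stored_mean: float,
    sample_size: int = 100,
    tolerance: float = 0.05,
    seed: Optional[int] = None,
) -> bool:
    """Estimate the mean from a random sample and compare it with the stored one."""
    if not values:
        return False
    rng = random.Random(seed)
    sample = rng.sample(list(values), min(sample_size, len(values)))
    estimate = mean(sample)
    scale = max(abs(stored_mean), 1e-9)  # avoid division by zero
    return abs(estimate - stored_mean) / scale > tolerance


def should_collect(
    udi_count: int, values: Sequence[float], stored_mean: float
) -> bool:
    """Collect only if the table is active enough AND the sample predicts a change."""
    return meets_activity_threshold(udi_count, len(values)) and sample_suggests_change(
        values, stored_mean
    )
```

The cheap activity check runs first, so the table is sampled only when enough rows have changed to make a full collection plausible.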
27 Claims
1. A data processing system implemented method of collecting statistics associated with data stored in a database, the database operatively coupled to a data processing system, the data processing system implemented method comprising:
assembling a list of tables of said data, said tables being scheduled for periodic automatic statistics collection;
for each of the tables, determining a likelihood that currently computed statistics associated with the data in each of the tables have changed;
removing from the list tables for which said determined likelihood is low;
collecting updated statistics for the data in tables which remain in said list after said removing; and
updating the scheduled periodic automatic statistics collection for each of the tables, said periodic automatic statistics collection comprising periodically performing subsequent collections of statistics associated with the data in each of the tables, wherein said updating comprises scheduling, for each of the tables, said subsequent collections of statistics more often or less often based on another likelihood that the updated statistics have changed for each of the tables, wherein said another likelihood that the updated statistics have changed is determined by comparing said updated statistics with said computed statistics.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A data processing system for collecting statistics associated with data stored in a database, the database operatively coupled to the data processing system operating on a computer, the data processing system comprising:
a processor;
an assembling module, executed on the processor, for assembling a list of tables of said data, said tables being scheduled for periodic automatic statistics collection;
a determining module for determining, for each of the tables, a likelihood that currently computed statistics associated with the data have changed;
a removing module for removing from the list tables for which said determined likelihood is low;
a collecting module for collecting updated statistics for the data in tables which remain in said list after said removing module removes tables for which said determined likelihood is low; and
an updating module for updating the scheduled periodic automatic statistics collection for each of the tables, said periodic automatic statistics collection comprising periodically performing subsequent collections of statistics associated with the data in each of the tables, wherein said updating comprises scheduling, for each of the tables, said subsequent collections of statistics more often or less often based on another likelihood that the updated statistics have changed for each of the tables, wherein said another likelihood that the updated statistics have changed is determined by comparing said updated statistics with said computed statistics.
Dependent claims: 14, 15, 16, 17
18. An article of manufacture for directing a data processing system to collect statistics associated with data stored in a database, the database operatively coupled to the data processing system, the article of manufacture comprising:
a program usable storage medium embodying one or more instructions executable by a processor of the data processing system, the one or more instructions comprising:
data processing system executable instructions for assembling a list of tables of said data, said tables being scheduled for periodic automatic statistics collection;
data processing system executable instructions for determining, for each of the tables, a likelihood that currently computed statistics associated with the data in each of the tables have changed;
data processing system executable instructions for removing from the list tables for which said determined likelihood is low;
data processing system executable instructions for collecting updated statistics for the data in tables which remain in said list after said removing; and
data processing system executable instructions for updating the scheduled periodic automatic statistics collection for each of the tables, said periodic automatic statistics collection comprising periodically performing subsequent collections of statistics associated with the data in each of the tables, wherein said updating comprises scheduling, for each of the tables, said subsequent collections of statistics more often or less often based on another likelihood that the updated statistics have changed for each of the tables, wherein said another likelihood that the updated statistics have changed is determined by comparing said updated statistics with said computed statistics.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27
Specification