Data base optimizer using most frequency values statistics

US 4,956,774 A
Filed: 09/02/1988
Issued: 09/11/1990
Est. Priority Date: 09/02/1988
Status: Expired due to Fees



Abstract

A method for more accurately estimating the time required to process a data base query using a selected index. A selected number of the most frequently occurring index key values (38) are collected during an index sequential scan. These most frequently occurring values are stored as percentage frequencies of occurrence in the data base system's catalog (42). Estimated access and processing times (NPAR, NPAS, NCPU) for a given query are calculated based on the stored frequencies where possible. Where the query's search criteria specify values other than the stored most frequently occurring values, those values are assumed to be uniformly distributed.
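
The estimate described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `estimate_selectivity`, the dictionary layout, and the way the remaining share is divided evenly among the non-stored values are all assumptions chosen to show one plausible reading of "assumed to be uniformly distributed".

```python
def estimate_selectivity(value, mfv_percent, distinct_values):
    """Estimate the fraction of index entries whose key equals `value`.

    mfv_percent     -- hypothetical catalog entry: {key value: percent of entries}
    distinct_values -- total number of distinct key values in the index
    """
    # Stored most frequently occurring value: use its recorded percentage.
    if value in mfv_percent:
        return mfv_percent[value] / 100.0

    # Any other value: spread the remaining share evenly (assumed reading
    # of the uniform-distribution rule, not a formula from the patent).
    remaining_values = distinct_values - len(mfv_percent)
    if remaining_values <= 0:
        return 0.0
    remaining_share = max(0.0, 100.0 - sum(mfv_percent.values())) / 100.0
    return remaining_share / remaining_values


# Illustrative data only: three stored values out of 23 distinct keys.
catalog_mfv = {"CA": 40.0, "NY": 25.0, "TX": 15.0}
frequent = estimate_selectivity("CA", catalog_mfv, 23)
rare = estimate_selectivity("WY", catalog_mfv, 23)
```

With these made-up numbers, a stored value uses its own percentage, while each of the 20 non-stored values is assigned an equal part of the remaining 20 percent.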


9 Claims

1. In a method for accessing data of a relational data base management system having at least one index, the improvement characterized by the steps performed by a computer of:
- (a) selecting a number of most frequently occurring values of at least part of a key of the index, the number being greater than zero and less than a total number of such values;
  
  (b) collecting frequency of occurrence statistics for the selected most frequently occurring values of the index;
  
  (c) estimating a time required for using the index as the access path, based at least in part on the collected frequency of occurrence statistics;
  
  (d) selecting an access path based at least in part on the estimated time; and
  
  (e) accessing the data using the selected access path.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The method of claim 1, wherein frequencies of occurrence of the values other than the collected most frequently occurring values are assumed to be uniformly distributed.
  - 3. The method of claim 1, wherein the statistics comprise the most frequently occurring values, and their respective percentage frequencies of occurrence.
  - 4. The method of claim 1, wherein the number of most frequently occurring values is selected so that a sum of frequencies of occurrence of the selected most frequently occurring values is greater than a selected threshold.
  - 5. The method of claim 1, wherein the number of most frequently occurring values is selected to include all values occurring with more than a selected threshold frequency as most frequently occurring values.
  - 6. The method of claim 1, wherein the collected statistics are stored within the data base management system.
  - 7. The method of claim 6, wherein the collected statistics are stored in a system catalog.
  - 8. The method of claim 1, wherein times required for random and sequential page accesses and for processing time are estimated separately, based at least in part on the index's statistics.
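
Claims 4 and 5 describe two ways of deciding how many values to keep. The Python sketch below shows one possible reading of each rule; the function names, the tuple layout, and the sample keys are hypothetical, and the cap applied by `_cap` is an assumption added to respect claim 1's requirement that fewer values are selected than exist.

```python
from collections import Counter


def frequency_percentages(keys):
    """Return (value, percent) pairs, most frequent first."""
    counts = Counter(keys)
    total = len(keys)
    return [(value, 100.0 * count / total) for value, count in counts.most_common()]


def _cap(selected, percentages):
    # Claim 1: the number selected is less than the total number of values.
    return selected[: max(0, len(percentages) - 1)]


def select_by_cumulative_share(percentages, threshold_pct):
    """Claim 4 style: add values until their summed share exceeds the threshold."""
    selected, covered = [], 0.0
    for value, pct in percentages:
        if covered > threshold_pct:
            break
        selected.append((value, pct))
        covered += pct
    return _cap(selected, percentages)


def select_by_min_frequency(percentages, threshold_pct):
    """Claim 5 style: keep every value whose own share exceeds the threshold."""
    selected = [(value, pct) for value, pct in percentages if pct > threshold_pct]
    return _cap(selected, percentages)


# Illustrative index keys: 'a' 50%, 'b' 30%, 'c' 10%, 'd' 10%.
sample_keys = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]
percentages = frequency_percentages(sample_keys)
by_share = select_by_cumulative_share(percentages, 70.0)
by_frequency = select_by_min_frequency(percentages, 20.0)
```

On this sample both rules happen to keep the same two values. They can diverge on flatter distributions, where no single value clears the per-value threshold but several together exceed the cumulative one.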

9. A method for accessing data of a relational data base management system having at least one index, comprising the steps performed by a computer of:
- (a) selecting a number of most frequently occurring values of at least part of a key of the index, the number being greater than zero and less than a total number of values of said at least part of the key;
  
  (b) collecting frequency of occurrence statistics for the selected most frequently occurring values of the index;
  
  (c) storing the statistics in the system;
  
  (d) estimating a time required for using the index as the access path, based at least in part on the stored frequency of occurrence statistics;
  
  (e) selecting an access path based at least in part on the estimated time; and
  
  (f) accessing the data using the selected access path.
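
Steps (c) through (e) of claim 9 can be sketched end to end in Python. Everything here is an illustrative assumption rather than the patented cost model: the unit costs, the `TableStats` fields, the rule of one random page access per matching row, and the table scan used as the alternative access path are all invented for the example. Random-page, sequential-page, and processing costs are kept as separate terms, in the spirit of claim 8.

```python
from dataclasses import dataclass

# Hypothetical unit costs in milliseconds; not taken from the patent.
RANDOM_PAGE_MS = 10.0
SEQUENTIAL_PAGE_MS = 1.0
CPU_ROW_MS = 0.01


@dataclass
class TableStats:
    rows: int
    pages: int
    distinct_keys: int
    mfv_percent: dict  # step (c): stored {key value: percent of entries}


def selectivity(stats, value):
    """Stored percentage if available, otherwise an even share of the rest."""
    if value in stats.mfv_percent:
        return stats.mfv_percent[value] / 100.0
    rest = stats.distinct_keys - len(stats.mfv_percent)
    if rest <= 0:
        return 0.0
    return max(0.0, 100.0 - sum(stats.mfv_percent.values())) / 100.0 / rest


def index_scan_ms(stats, value):
    # Step (d): assumes an unclustered index, one random page per matching row.
    matching_rows = selectivity(stats, value) * stats.rows
    return matching_rows * RANDOM_PAGE_MS + matching_rows * CPU_ROW_MS


def table_scan_ms(stats):
    # Alternative path: read every page sequentially and examine every row.
    return stats.pages * SEQUENTIAL_PAGE_MS + stats.rows * CPU_ROW_MS


def choose_access_path(stats, value):
    # Step (e): pick the path with the smaller estimated time.
    costs = {"index": index_scan_ms(stats, value), "table_scan": table_scan_ms(stats)}
    return min(costs, key=costs.get)


# Illustrative statistics: one value covers 90% of 100,000 rows.
stats = TableStats(rows=100_000, pages=2_000, distinct_keys=1_001,
                   mfv_percent={"ACTIVE": 90.0})
path_for_frequent = choose_access_path(stats, "ACTIVE")
path_for_rare = choose_access_path(stats, "CLOSED")
```

Under these assumed numbers the dominant value is estimated to be cheaper through a full scan, while a rare value is estimated to be cheaper through the index. That contrast is the kind of decision a single uniform-distribution estimate would tend to miss. Step (f), actually reading the data, is left out of the sketch.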


Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Shibamiya, Akira; Zimowski, Melvin R.
Primary Examiner(s): Not defined
Application Number: US07/239,712
Time in Patent Office: 739 Days
Field of Search: 364/200, 364/300, 364/900
US Class Current: 1/1
CPC Class Codes:
G06F 16/24549   Run-time optimisation
G06F 16/9017   using directory or table lo...
Y10S 707/99932   Access augmentation or opti...
Y10S 707/99933   Query processing, i.e. sear...
