Database query optimization using clustering data mining

  • US 8,229,917 B1
  • Filed: 02/24/2011
  • Issued: 07/24/2012
  • Est. Priority Date: 02/24/2011
  • Status: Expired due to Fees
First Claim

1. A method of optimizing a database query, said method comprising:

  • a computer system receiving a database table populated with data;

    said computer system scanning said database table;

    said computer system determining statistics and single column histograms that describe data included in single columns of said database table;

    said computer system estimating cardinality based on said statistics and said single column histograms;

    said computer system determining all possible correlations among multiple columns by performing clustering data mining, wherein one or more columns of said multiple columns are included in said database table, and wherein said performing clustering data mining includes segregating said data that populates said database table into a plurality of clusters, each cluster having a corresponding rule whose conditions are matched by one or more rows of said database table, and further having a corresponding support count that indicates a number of rows of said database table that satisfy said rule;

    said computer system ranking said multiple columns based on said determined correlations;

    said computer system determining top ranked columns of said multiple columns based on said ranking;

    said computer system determining said estimated cardinality differs from said corresponding support count by more than a threshold amount;

    in response to said determining said estimated cardinality differs from said corresponding support count by more than said threshold amount, said computer system determining multiple column histograms based on said top ranked columns; and

    said computer system generating an optimal query plan based on said multiple column histograms.
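
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. All function names, the sample table, and the threshold value are hypothetical. The "clustering" here is a simple stand-in that groups rows by identical value combinations, so that each group yields a rule and a support count. Ranking column pairs by the largest gap between estimated cardinality and support count is also an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations


def single_column_histograms(rows, columns):
    """Per-column frequency histogram: value -> number of rows."""
    return {c: Counter(r[c] for r in rows) for c in columns}


def estimate_cardinality(rule, hists, n_rows):
    """Estimate rows matching `rule`, assuming the columns are independent."""
    selectivity = 1.0
    for col, val in rule.items():
        selectivity *= hists[col][val] / n_rows
    return selectivity * n_rows


def cluster_rows(rows, columns):
    """Stand-in for clustering data mining (assumption, not the patent's
    algorithm): one cluster per distinct value combination, returned as
    (rule, support_count) pairs."""
    counts = Counter(tuple(r[c] for c in columns) for r in rows)
    return [(dict(zip(columns, key)), n) for key, n in counts.items()]


def correlated_column_pairs(rows, columns, threshold):
    """Rank column pairs by how far the independence-based estimate is from
    the observed support count; keep pairs whose gap exceeds `threshold`."""
    n = len(rows)
    hists = single_column_histograms(rows, columns)
    gaps = {}
    for pair in combinations(columns, 2):
        gaps[pair] = max(
            abs(estimate_cardinality(rule, hists, n) - support)
            for rule, support in cluster_rows(rows, pair)
        )
    ranked = sorted(gaps, key=gaps.get, reverse=True)
    return [pair for pair in ranked if gaps[pair] > threshold]


def multi_column_histogram(rows, pair):
    """Joint histogram over a column pair: (v1, v2) -> number of rows."""
    return Counter(tuple(r[c] for c in pair) for r in rows)


# Hypothetical table: make and model are perfectly correlated, color is not.
rows = [
    {"make": m, "model": d, "color": c}
    for m, d in (("Honda", "Civic"), ("Ford", "Focus"))
    for c in ("red", "blue", "red", "blue")
]
pairs = correlated_column_pairs(rows, ["make", "model", "color"], threshold=1)
joint = {pair: multi_column_histogram(rows, pair) for pair in pairs}
```

In this sample, the single-column histograms give an estimate of 2 rows for `make = Honda AND model = Civic`, while the cluster's support count is 4, so only the `(make, model)` pair exceeds the threshold and receives a multi-column histogram. A query optimizer could then consult that joint histogram instead of multiplying single-column selectivities.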
