Database analysis using clusters

  • US 8,161,048 B2
  • Filed: 05/08/2009
  • Issued: 04/17/2012
  • Est. Priority Date: 04/24/2009
  • Status: Active Grant
First Claim

1. A method for mapping relationships in a database, the database including a plurality of tables having a table join structure, wherein the table join structure is indicated by table join edges in a schema graph of the database and wherein each of the plurality of tables includes a corresponding set of records, the method comprising:

    for each of the plurality of tables, grouping, by a computer system, a sample of the corresponding set of records into clusters, wherein records grouped in a cluster instantiate a common set of table join edges;

    identifying cluster pairs, wherein a cluster pair corresponds to two clusters from different tables, wherein the two clusters instantiate a common table join edge;

    weighting the cluster pairs according to a number of records that instantiate the common table join edge;

    filtering any cluster pairs weighted below a threshold weighting, wherein the filtering includes a process selected from excluding the cluster pairs weighted below the threshold and combining each cluster associated with each cluster pair weighted below the threshold weighting with another cluster;

    selecting a source cluster from a first table and a target cluster from a second table, wherein the first table and second tables are different tables;

    selecting a third table in the database, wherein the third table shares a table join edge with the first table; and

    determining a relative frequency, with respect to the first table, with which the second table is reached from the third table.
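
The first four steps of the claim (grouping, pairing, weighting, filtering) can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy tables, the `links` representation of joined records, and the function names `cluster_records` and `weight_cluster_pairs` are all assumptions made for illustration. The sketch clusters every record rather than a sample, counts joined record pairs as the weight, and shows only the "excluding" variant of the filtering step.

```python
from collections import defaultdict

# Hypothetical toy data: record ids per table, plus the joins that actually
# occur between records. Each link is (join_edge, left_id, right_id), where
# join_edge = (left_table, right_table) is an edge of the schema graph.
CO = ("customer", "order")
CT = ("customer", "ticket")
records = {"customer": [1, 2, 3], "order": [10, 11, 12], "ticket": [20]}
links = [(CO, 1, 10), (CO, 1, 11), (CO, 2, 12), (CT, 2, 20)]


def cluster_records(records, links):
    """Group the records of each table by the set of join edges they instantiate."""
    edges_of = defaultdict(set)  # (table, record_id) -> join edges it takes part in
    for edge, left_id, right_id in links:
        edges_of[(edge[0], left_id)].add(edge)
        edges_of[(edge[1], right_id)].add(edge)
    clusters = defaultdict(set)  # (table, frozenset of edges) -> record ids
    for table, ids in records.items():
        for rid in ids:
            key = (table, frozenset(edges_of.get((table, rid), ())))
            clusters[key].add(rid)
    return dict(clusters)


def weight_cluster_pairs(clusters, links):
    """Count the joined record pairs behind each (cluster, cluster, edge) triple."""
    cluster_of = {
        (key[0], rid): key for key, ids in clusters.items() for rid in ids
    }
    weights = defaultdict(int)
    for edge, left_id, right_id in links:
        source = cluster_of[(edge[0], left_id)]
        target = cluster_of[(edge[1], right_id)]
        weights[(source, target, edge)] += 1
    return dict(weights)


clusters = cluster_records(records, links)
weights = weight_cluster_pairs(clusters, links)

# Filtering by exclusion: drop cluster pairs weighted below the threshold.
THRESHOLD = 2  # assumed value, chosen only for this example
kept = {pair: w for pair, w in weights.items() if w >= THRESHOLD}
```

In this toy data the customers split into three clusters (orders only, orders and tickets, no joins), and only the pair linking order-only customers to orders meets the threshold. The claim's alternative filtering option, combining a low-weight cluster with another cluster, is not shown.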
