×

Vertical implementation of expectation-maximization algorithm in SQL for performing clustering in very large databases

  • US 6,519,591 B1
  • Filed: 12/22/2000
  • Issued: 02/11/2003
  • Est. Priority Date: 12/22/2000
  • Status: Active Grant
First Claim
Patent Images

1. A method for performing clustering within a relational database management system to group a set of n data points into a set of k clusters, each data point having a dimensionality p, the method comprising the steps of:

  • establishing a first table, C, having 1 column and p*k rows, for the storage of means values;

    establishing a second table, R, having 1 column and p rows, for the storage of covariance values;

    establishing a third table, W, having w columns and k rows, for the storage of w weight, values;

    establishing a fourth table, Y, having 1 column and p*n rows, for the storage of values; and

    executing a series of SQL commands implementing an Expectation-Maximization clustering algorithm to iteratively update the means values, covariance values and weight values stored in said first, second and third tables;

    said step of executing a series of SQL commands implementing an Expectation-Maximization clustering algorithm includes the step of calculating a Mahalanobis distance for each of said n data points by using SQL aggregate functions to join tables Y, C and R.

View all claims
  • 2 Assignments
Timeline View
Assignment View
    ×
    ×