Scalable system for clustering of large databases having mixed data attributes

  • US 6,581,058 B1
  • Filed: 01/31/2001
  • Issued: 06/17/2003
  • Est. Priority Date: 05/22/1998
  • Status: Expired due to Term
First Claim

1. In a computer data processing system, a method for clustering data in a database comprising the steps of:

    a) reading data records having both discrete and ordered attributes from a database storage medium and bringing a portion of the data records into a rapid access memory;

    b) initializing a cluster model that characterizes the data within the database wherein the cluster model includes a table of probabilities for the enumerated or discrete data attributes of the data records for each cluster of a multiple number of clusters that make up the cluster model and wherein the cluster model for data attributes that are ordered comprises a mean and covariance for each cluster;

    c) updating the cluster model from the database records brought into the rapid access memory;

    d) summarizing at least some of the database records in the rapid access memory and storing a summarization within the rapid access memory;

    e) evaluating a criteria to determine if further data should be accessed from the database to further cluster data records in the database; and

    f) based on the evaluating step reading an additional number of records from the database storage medium and bringing said additional number of records into the rapid access memory for further updating of the cluster model.
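
Steps (a) through (f) can be illustrated with a short sketch. This is not the patented implementation, only one possible reading of the claim under stated assumptions: the names `Cluster` and `cluster_stream` are hypothetical, covariance is simplified to a per-attribute variance (a diagonal covariance), mixing weights are taken as equal, every record in a batch is summarized into running sufficient statistics (the claim requires only "at least some"), and the stopping criterion of step (e) is assumed to be a small shift in the cluster means.

```python
import math
from collections import Counter
from itertools import islice


class Cluster:
    """One cluster of the mixed-attribute model from step (b)."""

    def __init__(self, mean, var, probs):
        self.mean = list(mean)   # mean of each ordered attribute
        self.var = list(var)     # ASSUMPTION: diagonal covariance only
        self.probs = probs       # one {value: probability} table per discrete attribute
        # Sufficient statistics that summarize records already seen (step d).
        self.n = 0.0
        self.s = [0.0] * len(self.mean)
        self.ss = [0.0] * len(self.mean)
        self.counts = [Counter() for _ in probs]

    def likelihood(self, ordered, discrete):
        """Unnormalized likelihood of one record under this cluster."""
        p = 1.0
        for x, m, v in zip(ordered, self.mean, self.var):
            p *= math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for value, table in zip(discrete, self.probs):
            p *= table.get(value, 1e-6)  # small floor for unseen values
        return p

    def absorb(self, w, ordered, discrete):
        """Fold a record with soft weight w into the summary statistics."""
        self.n += w
        for i, x in enumerate(ordered):
            self.s[i] += w * x
            self.ss[i] += w * x * x
        for counter, value in zip(self.counts, discrete):
            counter[value] += w

    def refit(self, min_var=1e-3):
        """Recompute mean, variance, and probability tables from the summary."""
        if self.n <= 0:
            return
        self.mean = [s / self.n for s in self.s]
        self.var = [max(ss / self.n - m * m, min_var)
                    for ss, m in zip(self.ss, self.mean)]
        self.probs = [{v: c / self.n for v, c in counter.items()}
                      for counter in self.counts]


def cluster_stream(records, clusters, batch_size=100, tol=1e-3):
    """records yields (ordered_values, discrete_values) pairs."""
    source = iter(records)
    while True:
        # Steps (a)/(f): bring the next portion of records into memory.
        batch = list(islice(source, batch_size))
        if not batch:
            break
        old_means = [list(c.mean) for c in clusters]
        # Steps (c)/(d): update the model; raw records are not retained.
        for ordered, discrete in batch:
            like = [c.likelihood(ordered, discrete) for c in clusters]
            total = sum(like) or 1.0
            for c, l in zip(clusters, like):
                c.absorb(l / total, ordered, discrete)
        for c in clusters:
            c.refit()
        # Step (e): ASSUMED criterion, stop once the means have settled.
        shift = max(abs(a - b)
                    for c, old in zip(clusters, old_means)
                    for a, b in zip(c.mean, old))
        if shift < tol:
            break
    return clusters
```

A caller would build the initial clusters by hand or from a sample, for example `Cluster([1.0], [4.0], [{"a": 0.5, "b": 0.5}])`, and pass any iterable of records, such as a database cursor wrapped in a generator. Because only the running sums and counts are kept, memory use depends on the number of clusters and attributes rather than on the number of records read.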
