
Method and system for clustering data in parallel in a distributed-memory multiprocessor system

  • US 6,269,376 B1
  • Filed: 10/26/1998
  • Issued: 07/31/2001
  • Est. Priority Date: 10/26/1998
  • Status: Expired due to Term
First Claim

1. A method of clustering a set of data points into k clusters, comprising:

  • (a) dividing the set of data points into P data blocks of substantially equal size, each data block assigned to one of P processors;

    (b) selecting k initial global centroids with a first processor and broadcasting the k initial global centroids from the first processor to the remaining P−1 processors;

    (c) computing the distance from each data point in each data block to the global centroid values by using the processor associated with the data block;

    (d) assigning each data point in each data block to a global centroid value closest to the data point by using the processor associated with the data block;

    (e) computing k block accumulation values in each block from the data points assigned thereto; and

    (f) recomputing the k global centroid values from the k block accumulation values computed for each data block.
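
Steps (a) through (f) can be illustrated with a minimal single-process sketch in Python. It is not the patented implementation: the P processors are simulated by a plain loop over blocks, the broadcast in step (b) is replaced by sharing one list, and all function names are hypothetical. Two further assumptions are made because the claim does not fix them: the "block accumulation values" are taken to be per-centroid coordinate sums and point counts, and the k initial centroids are simply the first k data points.

```python
def split_blocks(points, P):
    """Step (a): divide the points into P blocks of substantially equal size."""
    return [points[i::P] for i in range(P)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def block_accumulate(block, centroids):
    """Steps (c)-(e) for one block, as run by the processor owning that block.

    Returns per-centroid coordinate sums and counts (an assumed form of the
    claim's "block accumulation values").
    """
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in block:
        # (c) distance to every global centroid, (d) pick the closest one
        j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # (e) accumulate the point into its cluster's running totals
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def recompute(centroids, partials):
    """Step (f): combine all block accumulations into new global centroids."""
    k, dim = len(centroids), len(centroids[0])
    new = []
    for j in range(k):
        n = sum(counts[j] for _, counts in partials)
        if n == 0:
            # Assumption: an empty cluster keeps its previous centroid.
            new.append(list(centroids[j]))
        else:
            new.append([sum(sums[j][d] for sums, _ in partials) / n
                        for d in range(dim)])
    return new


def parallel_kmeans(points, k, P, iters=10):
    blocks = split_blocks(points, P)
    # Step (b): assumed selection rule (first k points); sharing this list
    # stands in for the broadcast to the other P-1 processors.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        partials = [block_accumulate(b, centroids) for b in blocks]
        centroids = recompute(centroids, partials)
    return centroids
```

Because each block contributes only k sums and k counts, the data exchanged per iteration in this sketch depends on k and the dimension, not on the number of data points. In a real distributed-memory system the loop over blocks would run concurrently, with the combination in `recompute` performed by a collective reduction.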
