×

Population clustering through density-based merging

  • US 20110029519A1
  • Filed: 04/26/2004
  • Published: 02/03/2011
  • Est. Priority Date: 04/25/2003
  • Status: Active Grant
First Claim
Patent Images

1. A method of identifying clusters of data items, wherein a data item is associated with one or more values, said method using an information processing system with a logic processor comprising:

  • defining using a logic processor a plurality of lattice points in a data space able to contain some of said data items of interest, said lattice points being data reduction points, said lattice points spaced according to a lattice rule, wherein said lattice points provide a rules-based ordering of values and storing said plurality of lattice points in a memory of said information processing system;

    calculating weights for said lattice points using one or more of said data items according to a weighting rule by retrieving said plurality of said lattice points from said memory and performing said calculating using said processor;

    determining a density at said lattice points relating a weight at a lattice point to weights at nearby lattice points using a density function on said processor;

    creating, for each of a plurality of said lattice points, a directional association with at least one other of said lattice points using an association rule using said processor and storing said direction association in said memory;

    sequentially following directional associations between lattice points starting at a starting lattice point of said lattice points, thereby defining a lattice point path until a terminal lattice point for said lattice point path is reached wherein a terminal lattice point is identified when there is no surrounding point from which a lattice point can be connected using said processor;

    comparing a density of said terminal lattice point of said lattice point path to a density threshold thereby determining if said terminal lattice point indicates a new cluster, an existing cluster, or a default background cluster using said processor; and

    labeling lattice points in said lattice point path as background points, points belonging to said existing cluster;

    or points belonging to said new cluster based on said determining using said processor and storing said labeling in said memory;

    repeating said sequentially following, said comparing, and said labeling until a majority of lattice points are labeled using said processor;

    distance merging lattice point paths by comparing pairs of terminal lattice points and if a distance between two terminal lattice points is below a distance threshold, merging points in each lattice point path connected to said terminal lattice points into a single cluster, wherein said distance threshold is expressed as a distance between lattice points;

    density merging lattice point paths by comparing pairs of terminal lattice points and if there is a path of consecutive lattice points along which values of said lattice point'"'"'s densities do not fall more than a density threshold below the minimum values of the two roots;

    repeating said density merging on initial lattice point paths and merged lattice point paths until no pairs of lattice points meet a density merging criteria;

    assigning lattice points to a cluster or to background noise;

    assigning data points to clusters by relating data points to lattice points and then using the cluster assignment of the lattice point; and

    displaying one or more identified clusters to a user using a graphical user interface display at said system.

View all claims
  • 4 Assignments
Timeline View
Assignment View
    ×
    ×