System and method for sequence-based subspace pattern clustering

  • US 7,565,346 B2
  • Filed: 05/31/2004
  • Issued: 07/21/2009
  • Est. Priority Date: 05/31/2004
  • Status: Expired due to Fees
First Claim

1. An apparatus for facilitating subspace clustering, said apparatus comprising:

  • a processor;

    an arrangement for accepting input data;

    an arrangement for discerning pattern similarity in the input data, wherein said discerning arrangement is configured to:

    define a pattern space;

    divide the pattern space into grids;

    establish a grid of cells corresponding to the input data; and

    construct a tree structure which summarizes frequent patterns discerned among the input data, wherein said tree structure is configured to determine at least one of:

    a number of occurrences of a given pattern; and

    a density of any cell in the grid of cells; and

    an arrangement for clustering the input data on the basis of discerned pattern similarity, wherein said clustering arrangement is configured to merge cells of at least a threshold density into clusters;

    said arrangement for discerning pattern similarity comprising an arrangement for discerning pattern similarity among both tabular data and sequential data contained in the input data, wherein said tabular data is transformed and represented as sequential data;

    wherein said arrangement for discerning pattern similarity is configured to employ a distance function for determining a sequence-based distance between data objects, the distance function comprising:

    given two data objects x and y, a subspace S, and a dimension k ∈ S, the sequence-based distance between x and y is as follows:

    \[ \mathrm{dist}_{k,S}(x, y) = \max_{i \in S} \left| (x_i - y_i) - (x_k - y_k) \right| \] ;

    and
    wherein the clustered input data is stored in a computer memory.
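
The distance function recited in the claim can be illustrated with a minimal sketch. It assumes the distance is the maximum, over the dimensions i in the subspace S, of the absolute value of (x_i - y_i) - (x_k - y_k), and that data objects are plain numeric sequences indexed by dimension. The function name `seq_dist` and its parameter names are hypothetical and do not come from the patent.

```python
from typing import Iterable, Sequence


def seq_dist(x: Sequence[float], y: Sequence[float],
             subspace: Iterable[int], k: int) -> float:
    """Sequence-based distance between x and y in a subspace, relative to dimension k.

    Assumed form: max over i in subspace of |(x[i] - y[i]) - (x[k] - y[k])|.
    """
    dims = list(subspace)
    if k not in dims:
        raise ValueError("k must be a dimension of the subspace")
    # Offset between the two objects on the reference dimension k.
    base = x[k] - y[k]
    # Largest deviation from that offset across the subspace.
    return max(abs((x[i] - y[i]) - base) for i in dims)


# Illustrative data (not from the patent): y is x shifted by 10 on
# dimensions 0..2, but shifted by 21 on dimension 3.
x = [1, 3, 5, 9]
y = [11, 13, 15, 30]

print(seq_dist(x, y, [0, 1, 2], k=0))     # same shape in this subspace
print(seq_dist(x, y, [0, 1, 2, 3], k=0))  # dimension 3 breaks the pattern
```

Under this reading, two objects that differ only by a constant shift across the subspace have distance 0, so the measure captures similarity of pattern rather than closeness of absolute values.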
