HIERARCHICAL SEQUENTIAL CLUSTERING
Abstract
Embodiments of the invention provide systems and methods for analyzing sequential data. Analyzing the sequential data can include grouping or clustering data that are similar in some way, e.g., similar ranges of quantities, similar categories, etc. More specifically, a method for hierarchical clustering of sequential data can comprise creating a dotplot of the sequential data. The dotplot can represent a plurality of sequences within the sequential data. A number of clusters represented by the plurality of sequences can be initialized, e.g., one cluster per sequence. A pair of sequences of the plurality of sequences having a longest sequential match can be identified, e.g., based on a line fitting technique, and merged into a single cluster. Identifying a pair of sequences of the plurality of sequences having a longest sequential match and merging the identified pair of sequences into a single cluster can be repeated until a single cluster remains.
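The dotplot and longest-match steps described in the abstract can be illustrated with a minimal sketch. It assumes each sequence is a list of comparable items and treats a "sequential match" as a run of matching cells along a diagonal of the dotplot. The abstract mentions a line fitting technique; this sketch substitutes a plain diagonal scan, and the function names `dotplot` and `longest_diagonal_run` are hypothetical.

```python
def dotplot(a, b):
    """Return a matrix with True at (i, j) wherever a[i] == b[j]."""
    return [[x == y for y in b] for x in a]


def longest_diagonal_run(plot):
    """Return (length, i, j) for the longest run of matches along a diagonal.

    (i, j) is the cell where the run starts; length is 0 when nothing matches.
    This is a simple scan, not the line fitting technique the abstract names.
    """
    rows = len(plot)
    cols = len(plot[0]) if rows else 0
    # run[i + 1][j + 1] holds the length of the diagonal run ending at (i, j).
    run = [[0] * (cols + 1) for _ in range(rows + 1)]
    best = (0, 0, 0)
    for i in range(rows):
        for j in range(cols):
            if plot[i][j]:
                length = run[i][j] + 1
                run[i + 1][j + 1] = length
                if length > best[0]:
                    best = (length, i - length + 1, j - length + 1)
    return best


# Illustrative data only: two sequences of user events.
a = ["login", "search", "view", "cart", "pay"]
b = ["search", "view", "cart", "logout"]
print(longest_diagonal_run(dotplot(a, b)))
```

Here the shared run `search, view, cart` appears as a three-cell diagonal starting at row 1, column 0 of the dotplot.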
20 Claims
1. A method for hierarchically clustering sequential data that preserves the sequential information in the data, the method comprising:
identifying pair-wise sequential matches between the plurality of sequences within the sequential data;
initializing a number of clusters represented by the plurality of sequences;
identifying a pair of sequences of the plurality of sequences that are closest to each other according to a distance measure; and
merging the identified pair of sequences into a single cluster.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
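As a rough illustration of the steps recited in claim 1, the sketch below performs a single merge. The claim does not define the distance measure, so the one used here (one minus the longest contiguous match divided by the longer sequence's length) is an assumption, as are the names `match_length`, `distance`, `closest_pair`, and `merge_closest`. Matching relies on the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from itertools import combinations


def match_length(a, b):
    """Length of the longest contiguous run shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def distance(a, b):
    """Assumed distance: 1 - (longest match / length of the longer sequence)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - match_length(a, b) / longest


def closest_pair(sequences):
    """Return indices (i, j), with i < j, of the two closest sequences."""
    return min(
        combinations(range(len(sequences)), 2),
        key=lambda p: distance(sequences[p[0]], sequences[p[1]]),
    )


def merge_closest(sequences):
    """Start with one cluster per sequence, then merge the closest pair once."""
    clusters = [[i] for i in range(len(sequences))]
    i, j = closest_pair(sequences)
    merged = clusters[i] + clusters[j]
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]


# Illustrative data only.
sequences = ["ABCDE", "BCDEF", "XYZ"]
print(merge_closest(sequences))
```

Clusters are kept as lists of sequence indices so that the original sequences, and their internal order, are left untouched by the merge.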
14. A system comprising:
a processor; and
a memory communicatively coupled with and readable by the processor and having stored therein a series of instructions which, when executed by the processor, cause the processor to hierarchically cluster sequential data by identifying pair-wise sequential matches between the plurality of sequences within the sequential data, initializing a number of clusters represented by the plurality of sequences, identifying a pair of sequences of the plurality of sequences that are closest to each other according to a distance measure, merging the identified pair of sequences into a single cluster, assigning an aggregate sequence to the single cluster, the aggregate sequence representing the sequences merged into the single cluster, and repeating identifying a pair of closest sequences of the plurality of sequences, merging the identified pair of sequences into a single cluster, and assigning an aggregate sequence to the cluster until a single cluster remains.
Dependent claims: 15
16. A machine-readable medium having stored therein a series of instructions which, when executed by a processor, cause the processor to hierarchically cluster sequential data by:
identifying pair-wise sequential matches between the plurality of sequences within the sequential data;
initializing a number of clusters represented by the plurality of sequences;
identifying a pair of sequences of the plurality of sequences that are closest to each other according to a distance measure; and
merging the identified pair of sequences into a single cluster;
assigning an aggregate sequence to the single cluster, the aggregate sequence representing the sequences merged into the single cluster; and
repeating identifying a pair of closest sequences of the plurality of sequences, merging the identified pair of sequences into a single cluster, and assigning an aggregate sequence to the cluster until a single cluster remains.
Dependent claims: 17, 18, 19, 20
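The repeat-until-one-cluster loop with aggregate sequences, as recited in claims 14 and 16, might look like the following sketch. The claims do not say how the aggregate sequence is built. Here it is assumed to be the longest contiguous run shared by the two aggregates being merged, and the distance measure is likewise an assumption. The names `longest_match`, `distance`, and `cluster` are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def longest_match(a, b):
    """Return the longest contiguous run shared by a and b (may be empty)."""
    m = SequenceMatcher(None, a, b, autojunk=False).find_longest_match(
        0, len(a), 0, len(b)
    )
    return a[m.a:m.a + m.size]


def distance(a, b):
    """Assumed distance: 1 - (longest match / length of the longer sequence)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - len(longest_match(a, b)) / longest


def cluster(sequences):
    """Merge until one cluster remains; return the merges in order.

    Each history entry is (members_left, members_right, aggregate), where
    members are tuples of indices into the input list.
    """
    # Each active cluster is (member indices, aggregate sequence).
    active = [((i,), seq) for i, seq in enumerate(sequences)]
    history = []
    while len(active) > 1:
        # Compare clusters through their aggregate sequences.
        i, j = min(
            combinations(range(len(active)), 2),
            key=lambda p: distance(active[p[0]][1], active[p[1]][1]),
        )
        (members_i, agg_i), (members_j, agg_j) = active[i], active[j]
        aggregate = longest_match(agg_i, agg_j)  # assumed aggregate rule
        history.append((members_i, members_j, aggregate))
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append((members_i + members_j, aggregate))
    return history


# Illustrative data only.
for left, right, aggregate in cluster(["ABCDE", "BCDEF", "XBCDY", "QRS"]):
    print(left, right, repr(aggregate))
```

The returned history records the merge order, which is the information a dendrogram would display. Under the assumed aggregate rule, a merge of unrelated sequences yields an empty aggregate, so a different rule may suit data with few shared runs.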
Specification