System and method for evolutionary clustering of sequential data sets
Abstract
An improved system and method for evolutionary clustering of sequential data sets is provided. A snapshot cost may be determined that measures how well a clustering represents the data set under the particular clustering method used; it is the cost of clustering the data set independently of the series of clusterings of the data sets in the sequence. A history cost may also be determined by measuring the distance between corresponding clusters of the data set and of the previous data set in the sequence; it is the cost of clustering the data set as part of the series of clusterings of the data sets in the sequence. An overall cost of clustering the data set may then be determined by minimizing the combination of the snapshot cost and the history cost. Any clustering method may be used, including flat clustering and hierarchical clustering.
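The cost structure described in the abstract can be illustrated with a minimal sketch. It assumes a flat, centroid-based clustering, which is only one of the methods the abstract allows. The specific cost definitions (squared distance to the nearest centroid for the snapshot cost, distance to the nearest previous centroid for the history cost), the weighted-sum combination, and the `change_penalty` parameter are hypothetical choices made for illustration and are not taken from the patent text.

```python
from math import dist


def snapshot_cost(points, centroids):
    """Cost of the clustering on its own: sum of squared distances
    from each point to its nearest centroid (assumed definition)."""
    return sum(min(dist(p, c) ** 2 for c in centroids) for p in points)


def history_cost(centroids, prev_centroids):
    """Cost of departing from the previous clustering: sum of distances
    from each centroid to its nearest previous centroid (assumed definition)."""
    return sum(min(dist(c, q) for q in prev_centroids) for c in centroids)


def overall_cost(points, centroids, prev_centroids, change_penalty=1.0):
    """Combined cost; change_penalty is a hypothetical weight that trades
    snapshot quality against consistency with the previous clustering."""
    return (snapshot_cost(points, centroids)
            + change_penalty * history_cost(centroids, prev_centroids))


# Illustrative data: two well-separated groups of points.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
prev_centroids = [(0, 0), (10, 0)]   # centroids from the previous data set
centroids = [(0, 1), (10, 1)]        # candidate centroids for the current data set

total = overall_cost(points, centroids, prev_centroids, change_penalty=0.5)
```

A clustering procedure would then prefer, among candidate clusterings of the current data set, the one with the lowest `overall_cost`. Setting `change_penalty` to zero reduces this to ordinary per-snapshot clustering.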
20 Claims
1. A computer system for clustering a data set, comprising:
a clustering engine for clustering at least one data set in a sequence of data sets as part of a series of clusterings of the data sets in the sequence;
a snapshot cost evaluator for determining a snapshot cost of clustering the at least one data set independently of the series of clusterings of the data sets in the sequence;
a history cost evaluator for determining a history cost of clustering the at least one data set as part of the series of clusterings of the data sets in the sequence; and
an overall cost evaluator for determining a cost of clustering the at least one data set by minimizing both the snapshot cost of clustering at least one data set independently of the series of clusterings of the data sets in the sequence and the history cost of clustering the at least one data set as part of the series of clusterings of the data sets in the sequence.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer-implemented method for clustering a data set, comprising:
determining an overall cost of clustering at least one data set in a sequence of data sets by minimizing the combination of a snapshot cost of clustering the at least one data set independently of the series of clusterings of the data sets in the sequence and a history cost of clustering the at least one data set as part of the series of clusterings of the data sets in the sequence; and
clustering the at least one data set in the sequence of data sets according to the overall cost determined by minimizing the combination of both the snapshot cost of clustering the at least one data set independently of the series of clusterings of the data sets in the sequence and the history cost of clustering the at least one data set as part of the series of clusterings of the data sets in the sequence.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
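One way to picture the two steps of the method of claim 11 is a k-means-style procedure applied to each data set in turn. The sketch below is an assumption-laden illustration, not the claimed method itself: it assumes centroid-based flat clustering, squared Euclidean distance for both costs, a weighted sum as the "combination", and that centroid j at one step corresponds to centroid j at the previous step. The names `cluster_step`, `cluster_sequence`, and `change_penalty` are hypothetical.

```python
from math import dist


def cluster_step(points, prev_centroids, change_penalty=1.0, iters=20):
    """Cluster one data set by alternating minimization of
    snapshot cost + change_penalty * history cost (assumed forms)."""
    centroids = list(prev_centroids)
    for _ in range(iters):
        # Assignment step: each point joins its nearest current centroid.
        groups = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            groups[j].append(p)
        # Update step: for a fixed assignment, the minimizer of
        #   sum |p - c|^2 + change_penalty * |c - q|^2
        # is c = (sum of p + change_penalty * q) / (n + change_penalty),
        # where q is the corresponding previous centroid.
        updated = []
        for g, q in zip(groups, prev_centroids):
            weight = len(g) + change_penalty
            if weight == 0:
                updated.append(q)  # empty cluster and no penalty: keep q
                continue
            updated.append(tuple(
                (sum(p[d] for p in g) + change_penalty * q[d]) / weight
                for d in range(len(q))
            ))
        centroids = updated
    return centroids


def cluster_sequence(data_sets, initial_centroids, change_penalty=1.0):
    """Cluster each data set in order, carrying centroids forward."""
    history = []
    prev = list(initial_centroids)
    for points in data_sets:
        prev = cluster_step(points, prev, change_penalty)
        history.append(prev)
    return history


# Illustrative one-dimensional sequence whose points drift upward over time.
data_sets = [[(0.0,), (2.0,)], [(2.0,), (4.0,)]]
result = cluster_sequence(data_sets, initial_centroids=[(1.0,)])
```

With a larger `change_penalty`, each step's centroids stay closer to those of the previous step; with a penalty of zero, each data set is clustered independently, as in ordinary k-means started from the previous centroids.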
20. A computer system for clustering a data set, comprising:
means for determining a snapshot cost of clustering at least one data set in a sequence of data sets independently of a series of clusterings of the data sets in the sequence;
means for determining a history cost of clustering the at least one data set in the sequence of data sets as part of the series of clusterings of the data sets in the sequence;
means for determining an overall cost of clustering the at least one data set in the sequence of data sets by minimizing both the snapshot cost of clustering the at least one data set independently of the series of clusterings of the data sets in the sequence and the history cost of clustering the at least one data set as part of the series of clusterings of the data sets in the sequence; and
means for clustering the at least one data set in the sequence of data sets according to the overall cost determined by minimizing both the snapshot cost of clustering the at least one data set independently of the series of clusterings of the data sets in the sequence and the history cost of clustering the at least one data set as part of the series of clusterings of the data sets in the sequence.
Specification