Methods and Apparatus for Clustering Evolving Data Streams Through Online and Offline Components
Abstract
A technique of clustering data of a data stream is provided. Online statistics are first created from the data stream. Offline processing of the online statistics is then performed when offline processing is either required or desired. Online statistics may be created through the reception of data points from the data stream and the formation and updating of data groups. Offline processing may be performed by re-clustering groups of data points around sampled data points and reporting the newly formed clusters.
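The online step described above (receiving data points, then forming and updating data groups) can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Group` summary fields, the `radius` threshold, the `max_groups` budget (assumed to be at least 2), and the merge-the-closest-pair policy are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from itertools import combinations


def distance(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class Group:
    """Additive summary of one group of similar points (a lower-level cluster)."""
    n: int
    linear_sum: list
    square_sum: list

    @classmethod
    def from_point(cls, p):
        return cls(1, list(p), [x * x for x in p])

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, p):
        # Only the running sums change, so the raw point need not be stored.
        self.n += 1
        for i, x in enumerate(p):
            self.linear_sum[i] += x
            self.square_sum[i] += x * x


def merge_closest(groups):
    """Hypothetical budget policy: fold the two closest groups into one."""
    i, j = min(
        combinations(range(len(groups)), 2),
        key=lambda ij: distance(groups[ij[0]].centroid(), groups[ij[1]].centroid()),
    )
    a, b = groups[i], groups[j]
    a.n += b.n
    a.linear_sum = [x + y for x, y in zip(a.linear_sum, b.linear_sum)]
    a.square_sum = [x + y for x, y in zip(a.square_sum, b.square_sum)]
    del groups[j]  # i < j, so index i is unaffected


def update_online(groups, point, radius, max_groups):
    """Add `point` to the nearest group within `radius`, else open a new group."""
    nearest = min(groups, key=lambda g: distance(g.centroid(), point), default=None)
    if nearest is not None and distance(nearest.centroid(), point) <= radius:
        nearest.absorb(point)
        return
    if len(groups) >= max_groups:
        merge_closest(groups)  # assumes max_groups >= 2
    groups.append(Group.from_point(point))


groups = []
for p in [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]:
    update_online(groups, p, radius=1.0, max_groups=10)
```

Because each summary is additive, two groups can be merged by adding their fields, which is what lets the online pass keep a bounded number of groups without retaining the stream itself.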
19 Citations
21 Claims
1. A method of clustering data of a data stream comprising the steps of:
creating online statistics from the data stream in accordance with one or more groups of similar data points from the data stream, wherein the online statistics comprise lower-level clusters; and
performing offline processing of the online statistics through at least one re-clustering of the one or more groups of similar data points around at least one sampled pseudo-point to create higher-level clusters, when offline processing is one of required and desired.
Dependent claims: 2-10.
11. Apparatus for clustering data of a data stream, the apparatus comprising:
a memory; and
at least one processor coupled to the memory and operative to:
(i) create online statistics from the data stream in accordance with one or more groups of similar data points from the data stream, wherein the online statistics comprise lower-level clusters; and
(ii) perform offline processing of the online statistics through at least one re-clustering of the one or more groups of similar data points around at least one sampled pseudo-point to create higher-level clusters, when offline processing is one of required and desired.
Dependent claims: 12-20.
21. An article of manufacture for clustering data of a data stream, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
creating online statistics from the data stream in accordance with one or more groups of similar data points from the data stream, wherein the online statistics comprise lower-level clusters; and
performing offline processing of the online statistics through at least one re-clustering of the one or more groups of similar data points around at least one sampled pseudo-point to create higher-level clusters, when offline processing is one of required and desired.
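The offline step recited in the independent claims (re-clustering lower-level clusters around sampled pseudo-points to form higher-level clusters) might look roughly like the Python sketch below. It is one possible reading, not the claimed method itself: it assumes each lower-level cluster is reduced to a pseudo-point `(count, centroid)`, that seeds are sampled with probability proportional to `count`, and that refinement is a weighted k-means-style loop. The names `sample_seeds` and `recluster`, and the parameters `k`, `iterations`, and `seed`, are hypothetical.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sample_seeds(pseudo_points, k, rng):
    """Pick k distinct pseudo-points, favouring those that summarise more data."""
    remaining = list(range(len(pseudo_points)))
    seeds = []
    for _ in range(k):
        weights = [pseudo_points[i][0] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)
        seeds.append(list(pseudo_points[pick][1]))
    return seeds


def assign(pseudo_points, centers):
    """Index of the nearest centre for every pseudo-point."""
    return [
        min(range(len(centers)), key=lambda c: distance(centers[c], centroid))
        for _, centroid in pseudo_points
    ]


def recluster(pseudo_points, k, iterations=10, seed=0):
    """Group pseudo-points (count, centroid) into k higher-level clusters."""
    rng = random.Random(seed)
    centers = sample_seeds(pseudo_points, k, rng)
    for _ in range(iterations):
        labels = assign(pseudo_points, centers)
        for c in range(k):
            members = [pp for pp, a in zip(pseudo_points, labels) if a == c]
            total = sum(n for n, _ in members)
            if total:  # keep the old centre if nothing was assigned to it
                dims = len(centers[c])
                centers[c] = [
                    sum(n * p[d] for n, p in members) / total for d in range(dims)
                ]
    return centers, assign(pseudo_points, centers)


# Four lower-level clusters: two near the origin, two near (10, 10).
pseudo_points = [(10, (0.0, 0.0)), (5, (0.0, 1.0)), (8, (10.0, 10.0)), (2, (10.0, 11.0))]
centers, labels = recluster(pseudo_points, k=2)
```

Weighting each pseudo-point by its count means a lower-level cluster that summarises many stream points pulls a higher-level centre harder than a sparse one, without revisiting the original data.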
Specification