Fast Grouping of Time Series
Abstract
In some examples, a time-series data set can be analyzed and grouped in a fast and efficient manner. For instance, fast grouping of multiple time-series into clusters can be implemented through data reduction, determining cluster population, and fast matching by locality sensitive hashing. In some situations, a user can select a level of granularity for grouping time-series into clusters, which can involve trade-offs between the number of clusters and the maximum distance between two time-series in a cluster.
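As an illustrative aside, the "fast matching by locality sensitive hashing" mentioned in the abstract can be sketched with random-hyperplane hashing. The abstract does not name a hash family, so the choice of sign-of-projection hashing (which favors vectors pointing in similar directions) and every name below (`make_hyperplanes`, `lsh_signature`, `bucket_by_signature`) are assumptions made for this sketch, not the patented method.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals (Gaussian entries) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0.0 else 0
        for plane in hyperplanes
    )


def bucket_by_signature(vectors, hyperplanes):
    """Group vector indices by signature; only same-bucket vectors need comparing."""
    buckets = {}
    for idx, vec in enumerate(vectors):
        buckets.setdefault(lsh_signature(vec, hyperplanes), []).append(idx)
    return buckets


if __name__ == "__main__":
    planes = make_hyperplanes(dim=3, n_bits=8, seed=1)
    feature_vectors = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-1.0, -2.0, -3.0]]
    for signature, members in bucket_by_signature(feature_vectors, planes).items():
        print(signature, members)
```

Vectors that share a bucket become candidates for the same cluster, so the expensive distance computation is limited to a small candidate set rather than every pair.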
10 Claims
1. A method comprising:
collecting a plurality of time-series, wherein each of the plurality of time-series comprises a series of numerical values;
generating a plurality of feature vectors, wherein each of the plurality of feature vectors corresponds to one of the plurality of time-series;
mapping between granularity levels and distance thresholds for clustering, based, at least in part, on a subset of the plurality of feature vectors;
generating a plurality of seeds that corresponds to one of the granularity levels, based, at least in part, on the mapping; and
assigning each of the plurality of time-series to one of the plurality of seeds.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
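The steps recited in claim 1 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the mean-removed feature vector, the quantile-based mapping from granularity levels to thresholds, the convention that a higher level means finer grouping, and the greedy seed selection are all hypothetical choices, as are the function names.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def feature_vector(series):
    """Hypothetical feature: the series with its mean removed."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def map_levels_to_thresholds(sample, n_levels):
    """Map levels 1..n_levels to distance thresholds taken from quantiles of
    the pairwise distances in a sample (assumed: higher level = smaller threshold)."""
    dists = sorted(
        euclidean(sample[i], sample[j])
        for i in range(len(sample))
        for j in range(i + 1, len(sample))
    )
    mapping = {}
    for level in range(1, n_levels + 1):
        q = (n_levels - level + 1) / (n_levels + 1)
        mapping[level] = dists[min(len(dists) - 1, int(q * len(dists)))]
    return mapping


def pick_seeds(features, threshold):
    """Greedy selection: a vector becomes a seed if it is farther than
    threshold from every seed chosen so far."""
    seeds = []
    for f in features:
        if all(euclidean(f, s) > threshold for s in seeds):
            seeds.append(f)
    return seeds


def assign(features, seeds):
    """Label each feature vector with the index of its nearest seed."""
    return [
        min(range(len(seeds)), key=lambda k: euclidean(f, seeds[k]))
        for f in features
    ]


if __name__ == "__main__":
    series = [[1, 2, 3], [1, 2, 3.1], [3, 2, 1], [3.1, 2, 1], [2, 2, 2]]
    features = [feature_vector(s) for s in series]
    thresholds = map_levels_to_thresholds(features, n_levels=3)
    for level, threshold in thresholds.items():
        seeds = pick_seeds(features, threshold)
        print(level, round(threshold, 3), assign(features, seeds))
```

A smaller threshold tends to produce more seeds and tighter groups, which is the trade-off between cluster count and within-cluster distance that the granularity level is meant to expose.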
9. A system comprising:
one or more processors;
a memory that includes a plurality of computer-executable components, the plurality of computer-executable components comprising a module to:
collect a plurality of time-series, wherein each of the plurality of time-series comprises a series of numerical values;
generate a plurality of feature vectors, wherein each of the plurality of feature vectors corresponds to one of the plurality of time-series;
generate a plurality of seeds that corresponds to a granularity level, based, at least in part, on at least a subset of the plurality of feature vectors; and
assigning each of the plurality of time-series to one of the plurality of seeds.
10. A computer-readable medium storing computer-executable instructions that, when executed, cause one or more processors to perform acts comprising:
collecting a plurality of time-series that corresponds to a plurality of computing devices, wherein each of the plurality of time-series comprises a series of numerical values that represents resource consumption by a respective one of the computing devices during a time period;
generating a plurality of feature vectors based, at least in part, on a power spectrum of each of the plurality of time-series, wherein each of the plurality of feature vectors corresponds to a respective one of the plurality of time-series;
identifying clusters by applying density-based clustering to at least a portion of the plurality of feature vectors;
mapping between granularity levels and distance thresholds for each of a plurality of the clusters, based, at least in part, on a subset of the plurality of feature vectors;
generating a plurality of seeds that corresponds to one of the granularity levels based, at least in part, on the mapping, wherein the one of the granularity levels is selected based, at least in part, on user input;
assigning each of the plurality of time-series to one of the plurality of seeds based, at least in part, on hash values; and
presenting a graphic representation of at least one of the plurality of the clusters to indicate resource consumption of at least one of the computing devices during the time period.
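Claim 10 bases the feature vectors on a power spectrum of each time-series. One way this could look is sketched below, assuming a naive discrete Fourier transform, removal of the mean before transforming, and normalization of the retained coefficients so they sum to 1. These choices and the names `power_spectrum` and `spectral_features` are illustrative assumptions; the claim does not specify them.

```python
import cmath
import math


def power_spectrum(series):
    """Naive O(n^2) DFT power |X_k|^2 / n for frequencies k = 0 .. n // 2."""
    n = len(series)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(series)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_features(series, n_coeffs):
    """Feature vector: the first n_coeffs non-DC power values of the
    mean-removed series, scaled to sum to 1 (all zeros for a flat series)."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    power = power_spectrum(centered)[1:n_coeffs + 1]
    total = sum(power)
    if total > 0:
        return [p / total for p in power]
    return [0.0] * len(power)


if __name__ == "__main__":
    n = 16
    # Hypothetical resource-consumption trace: two cycles over the period.
    usage = [50.0 + 10.0 * math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
    print([round(p, 3) for p in spectral_features(usage, n_coeffs=4)])
```

Because the power spectrum discards phase, two traces with the same periodic shape but different start times map to nearly the same feature vector, and keeping only a few low-frequency coefficients is one form of data reduction.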
Specification