Segment-based change detection method in multivariate data stream
Abstract
A method and framework are described for detecting changes in a multivariate data stream. A training set is formed by sampling time windows in a data stream containing data reflecting normal conditions. A histogram is created to summarize each window of data, and data within the histograms are clustered to form test distribution representatives to minimize the bulk of training data. Test data is then summarized using histograms representing time windows of data, and data within the test histograms are clustered. The test histograms are compared to the training histograms using nearest-neighbor techniques on the clustered data. Distances from the test histograms to the test distribution representatives are compared to a threshold to identify anomalies.
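The windowing and histogram summarization described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the helper names `segment_windows` and `histogram`, the fixed-length non-overlapping windows, and the uniform grid of width `bin_width` are all hypothetical choices made for brevity.

```python
from collections import Counter


def segment_windows(points, window_len):
    """Split a sequence of points into consecutive, non-overlapping windows.

    Assumption: the stream is uniformly sampled, so a fixed number of
    points stands in for a predefined time interval.
    """
    return [points[i:i + window_len]
            for i in range(0, len(points) - window_len + 1, window_len)]


def histogram(window, bin_width):
    """Summarize a window as a normalized histogram over grid cells.

    Each multidimensional point is mapped to the grid cell that contains it;
    the result maps cell index tuples to the fraction of points in that cell.
    """
    counts = Counter(tuple(int(x // bin_width) for x in p) for p in window)
    total = len(window)
    return {cell: n / total for cell, n in counts.items()}


# Hypothetical two-dimensional stream of six points, split into two windows.
stream = [(0.1, 0.2), (0.3, 0.1), (1.5, 0.4),
          (1.7, 0.6), (0.2, 0.9), (0.4, 0.8)]
windows = segment_windows(stream, 3)
first_hist = histogram(windows[0], 1.0)
```

Here each window becomes a small dictionary of occupied cells, which is far more compact than the raw points when windows are large.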
20 Claims
1. A method for detecting a change from a normal state in a multivariate data stream, the method comprising the steps of:
(A) receiving a multidimensional training data stream including a plurality of data points representing a normal state;
(B) sampling a plurality of m segment windows in the training data stream, each segment window representing a predefined time interval in the data stream;
(C) for each of said plurality of segment windows, summarizing a distribution of data points in the segment window by constructing a training histogram h;
(D) representing each training histogram h_i by a set of distribution representatives r_i using clustering;
(E) receiving a test data stream including a plurality of multidimensional data points;
(F) sampling data points in a plurality of segment windows in the test data stream, each segment window representing a predefined time interval in the data stream;
(G) for each segment window in the test data, summarizing a distribution of data points in the segment window by constructing a test histogram h′;
(H) representing each test histogram h′ by a set of distribution representatives r′ using clustering;
(I) comparing test histograms h′ with training histograms h_i using the distribution representatives to find closest matches using a similarity measure S; and
(J) transmitting an indication that the multivariate data stream contains a change from the normal state for those segments having a similarity measure S indicating a similarity lower than a decision threshold σ.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 15
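Steps (D) and (H) through (J) of claim 1 can be illustrated with a short sketch. The claim does not fix a clustering algorithm or a formula for S, so everything concrete here is an assumption: plain k-means with deterministic seeding as the clustering, a hypothetical similarity S = 1 / (1 + d) where d is the mean distance from each test representative to its nearest training representative, and the function names `kmeans`, `similarity`, and `detect_changes`. For brevity the sketch clusters the points of each window directly rather than the bins of a histogram.

```python
import math


def kmeans(points, k, iters=10):
    """Lloyd's k-means, seeded with the first k points (assumed clustering)."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group ended up empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers


def similarity(test_reps, train_reps):
    """Hypothetical S in (0, 1]; higher means the windows look more alike."""
    d = sum(min(math.dist(a, b) for b in train_reps)
            for a in test_reps) / len(test_reps)
    return 1.0 / (1.0 + d)


def detect_changes(train_windows, test_windows, k, sigma):
    """Flag each test window whose best match among training windows has S < sigma."""
    train_reps = [kmeans(w, k) for w in train_windows]      # step (D)
    flags = []
    for w in test_windows:
        reps = kmeans(w, k)                                  # step (H)
        best = max(similarity(reps, r) for r in train_reps)  # step (I)
        flags.append(best < sigma)                           # step (J)
    return flags


# Hypothetical data: the training windows lie near the origin.
train = [[(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (1.1, 0.9)],
         [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 1.1)]]
test = [[(0.05, 0.05), (0.1, 0.0), (1.0, 1.0), (0.95, 1.05)],
        [(5.0, 5.0), (5.1, 5.1), (6.0, 6.0), (6.1, 5.9)]]
flags = detect_changes(train, test, k=2, sigma=0.5)
```

With this data the first test window resembles the training windows and is not flagged, while the second, which lies far from every training representative, is flagged. Comparing a handful of representatives per window, rather than every point, is what keeps the nearest-neighbor search cheap.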
11. A computer-usable medium having computer readable instructions stored thereon for execution by a processor to perform a method for detecting a change from a normal state in a multivariate data stream, the method comprising the steps of:
(A) receiving a multidimensional training data stream including a plurality of data points representing a normal state;
(B) sampling a plurality of m segment windows in the training data stream, each segment window representing a predefined time interval in the data stream;
(C) for each of said plurality of segment windows, summarizing a distribution of data points in the segment window by constructing a training histogram h;
(D) representing each training histogram h_i by a set of distribution representatives r_i using clustering;
(E) receiving a test data stream including a plurality of multidimensional data points;
(F) sampling data points in a plurality of segment windows in the test data stream, each segment window representing a predefined time interval in the data stream;
(G) for each segment window in the test data, summarizing a distribution of data points in the segment window by constructing a test histogram h′;
(H) representing each test histogram h′ by a set of distribution representatives r′ using clustering;
(I) comparing test histograms h′ with training histograms h_i using the distribution representatives to find closest matches using a similarity measure S; and
(J) transmitting an indication that the multivariate data stream contains a change from the normal state for those segments having a similarity measure S indicating a similarity lower than a decision threshold σ.
Dependent claims: 12, 13, 14, 16, 17, 18, 19, 20
Specification