Strategies for identifying anomalies in time-series data
First Claim
1. A computerized method for detecting one or more anomalies in time-series data, comprising:
collecting time-series data from an environment to provide collected time-series data, the collected time-series data having a plurality of portions;
dividing the collected time-series data into a plurality of collected data segments;
fitting a plurality of local models to the respective plurality of collected data segments, the plurality of local models collectively forming a global model; and
determining whether there is at least one anomaly in the collected time-series data or no anomalies based on a comparison between the collected time-series data and the global model,
wherein the fitting selects a type of model-fitting paradigm to be applied to the collected time-series data to generate the plurality of local models on a portion-by-portion basis,
wherein the fitting selects the type of model-fitting paradigm based on an error value metric, the error value metric corresponding to a difference between a point in the collected time-series data and a corresponding model point,
wherein the fitting selects a first model-fitting paradigm that relies on an absolute value (L1) measure of the error value metric when a portion of the collected time-series data under consideration is considered anomalous,
wherein the fitting selects another model-fitting paradigm that relies on a squared-term (L2) measure of the error value metric when the portion under consideration is considered normal.
Abstract
A strategy is described for identifying anomalies in time-series data. The strategy involves dividing the time-series data into a plurality of collected data segments and then using a modeling technique to fit local models to the collected data segments. Large deviations of the time-series data from the local models are indicative of anomalies. In one approach, the modeling technique can use an absolute value (L1) measure of error value for all of the collected data segments. In another approach, the modeling technique can use the L1 measure for only those portions of the time-series data that are projected to be anomalous. The modeling technique can use a squared-term (L2) measure of error value for normal portions of the time-series data. In another approach, the modeling technique can use an iterative expectation-maximization strategy in applying the L1 and L2 measures.
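The first approach in the abstract (an L1 measure for every segment) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each local model is a constant level. For a constant, the median minimizes the summed absolute (L1) error and the mean minimizes the summed squared (L2) error. The function names, the fixed segment length, and the deviation threshold are hypothetical and are not taken from the patent.

```python
from statistics import mean, median

def split_segments(series, segment_len):
    """Divide the series into consecutive segments (the last may be shorter)."""
    return [series[i:i + segment_len] for i in range(0, len(series), segment_len)]

def fit_constant(segment, measure):
    """Constant local model: median for the L1 measure, mean for the L2 measure."""
    return median(segment) if measure == "L1" else mean(segment)

def global_model(series, segment_len, measure="L1"):
    """Concatenate the local models into one model point per data point."""
    model = []
    for segment in split_segments(series, segment_len):
        level = fit_constant(segment, measure)
        model.extend([level] * len(segment))
    return model

def find_anomalies(series, model, threshold):
    """Indices where the data deviates from the model by more than the threshold."""
    return [i for i, (x, m) in enumerate(zip(series, model)) if abs(x - m) > threshold]

# Illustrative data: a level shift at index 6 and a single spike at index 4.
series = [10, 10, 11, 10, 50, 10, 20, 21, 20, 20, 21, 20]
l1_flags = find_anomalies(series, global_model(series, 6, "L1"), threshold=5)
l2_flags = find_anomalies(series, global_model(series, 6, "L2"), threshold=5)
print(l1_flags)  # only the spike is flagged
print(l2_flags)  # the spike drags the L2 fit, so the whole first segment is flagged
```

In this toy data the L1 fit stays at the typical level of the first segment, so only the spike stands out, while the L2 fit is pulled toward the spike and every point in that segment exceeds the threshold.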
70 Citations
17 Claims
1. A computerized method for detecting one or more anomalies in time-series data, comprising:
collecting time-series data from an environment to provide collected time-series data, the collected time-series data having a plurality of portions;
dividing the collected time-series data into a plurality of collected data segments;
fitting a plurality of local models to the respective plurality of collected data segments, the plurality of local models collectively forming a global model; and
determining whether there is at least one anomaly in the collected time-series data or no anomalies based on a comparison between the collected time-series data and the global model,
wherein the fitting selects a type of model-fitting paradigm to be applied to the collected time-series data to generate the plurality of local models on a portion-by-portion basis,
wherein the fitting selects the type of model-fitting paradigm based on an error value metric, the error value metric corresponding to a difference between a point in the collected time-series data and a corresponding model point,
wherein the fitting selects a first model-fitting paradigm that relies on an absolute value (L1) measure of the error value metric when a portion of the collected time-series data under consideration is considered anomalous,
wherein the fitting selects another model-fitting paradigm that relies on a squared-term (L2) measure of the error value metric when the portion under consideration is considered normal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
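The portion-by-portion selection recited in claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: each portion is treated as one fixed-length segment, each local model is a constant level, and a portion is "considered anomalous" when the largest error of a preliminary L2 fit exceeds a hypothetical `error_limit`. The claim does not specify how that decision is made, so this rule and all names here are assumptions.

```python
from statistics import mean, median

def fit_portion(portion, error_limit):
    """Pick the fitting paradigm for one portion and return (paradigm, level)."""
    level = mean(portion)  # preliminary L2 fit: the mean minimizes squared error
    # Error value metric: difference between each data point and its model point.
    worst = max(abs(x - level) for x in portion)
    if worst > error_limit:
        # Portion considered anomalous: refit with the L1 measure (median).
        return "L1", median(portion)
    # Portion considered normal: keep the L2 fit.
    return "L2", level

def fit_global(series, portion_len, error_limit):
    """Fit every portion and concatenate the local models into a global model."""
    choices, model = [], []
    for start in range(0, len(series), portion_len):
        portion = series[start:start + portion_len]
        paradigm, level = fit_portion(portion, error_limit)
        choices.append(paradigm)
        model.extend([level] * len(portion))
    return choices, model

series = [10, 10, 11, 10, 50, 10, 20, 21, 20, 20, 21, 20]
choices, model = fit_global(series, portion_len=6, error_limit=5)
# Compare the data with the global model to decide whether anomalies exist.
flagged = [i for i, (x, m) in enumerate(zip(series, model)) if abs(x - m) > 5]
print(choices, flagged)
```

Reserving the L1 fit for portions that look anomalous keeps the cheaper closed-form L2 fit for the rest of the data, which is one plausible reading of why the paradigm is chosen per portion.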
13. A computerized method for detecting one or more anomalies in time-series data, comprising:
collecting time-series data from an environment to provide collected time-series data, the collected time-series data having a plurality of portions;
dividing the collected time-series data into a plurality of collected data segments;
labeling portions of the collected time-series data as either anomalous or normal;
fitting a plurality of local models to the respective plurality of collected data segments, the plurality of local models collectively forming a global model,
wherein the fitting uses a first model-fitting paradigm for any portion of the time-series data that is considered anomalous and a second model-fitting paradigm for any portion of the time-series data that is considered normal,
wherein the first model-fitting paradigm involves using an absolute value (L1) measure to represent an error value metric, and wherein the second model-fitting paradigm involves using a squared-term (L2) measure to represent the error value metric, the error value metric corresponding to a difference between a point in the collected time-series data and a corresponding model point; and
determining whether there is at least one anomaly in the collected time-series data or no anomalies based on a comparison between the collected time-series data and the global model.
Dependent claims: 14, 15, 16.
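The labeling and fitting steps of claim 13 can be sketched as a simple alternating loop: fit the model given the current labels, relabel the points given the model, and repeat until the labels stop changing. This is only loosely in the spirit of an iterative scheme and is not presented as the patent's procedure. The assumptions are a single segment, a constant local model, per-point labels, a hypothetical `threshold` for relabeling, and a ternary search (swapped in here for simplicity) to minimize the mixed cost, which is convex in the level.

```python
def mixed_cost(level, segment, anomalous):
    """Absolute (L1) error for points labeled anomalous, squared (L2) error otherwise."""
    return sum(abs(x - level) if flag else (x - level) ** 2
               for x, flag in zip(segment, anomalous))

def fit_level(segment, anomalous, steps=200):
    """Minimize the convex mixed cost over a constant level by ternary search."""
    lo, hi = min(segment), max(segment)
    for _ in range(steps):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if mixed_cost(a, segment, anomalous) < mixed_cost(b, segment, anomalous):
            hi = b  # the minimum lies to the left of b
        else:
            lo = a  # the minimum lies to the right of a
    return (lo + hi) / 2

def label_and_fit(segment, threshold, max_rounds=10):
    """Alternate between fitting the level and relabeling until the labels settle."""
    anomalous = [False] * len(segment)  # start by treating every point as normal
    level = fit_level(segment, anomalous)
    for _ in range(max_rounds):
        relabeled = [abs(x - level) > threshold for x in segment]
        if relabeled == anomalous:
            break  # labels are stable, so the current fit matches them
        anomalous = relabeled
        level = fit_level(segment, anomalous)
    return level, anomalous

level, labels = label_and_fit([10, 10, 11, 10, 50, 10], threshold=5)
print(round(level, 3), labels)
```

On this toy segment the first pure-L2 fit is pulled toward the spike, the relabeling then marks the spike as anomalous, and the final fit settles close to the typical level with only the spike labeled anomalous.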
17. An analysis system for detecting one or more anomalies in time-series data, comprising:
a data receiving module configured to collect time-series data from an environment to provide collected time-series data;
an anomaly analysis module configured to:
divide the collected time-series data into a plurality of collected data segments, the collected time-series data having a plurality of portions;
fit a plurality of local models to the respective plurality of collected data segments using a plurality of different model-fitting paradigms, the plurality of local models collectively forming a global model; and
identify at least one anomaly in the collected time-series data based on a comparison between the collected time-series data and the global model, to thereby provide an output result,
wherein the fitting selects the plurality of different model-fitting paradigms to achieve a desired combination of accuracy and computational processing speed,
wherein the fitting uses a first model-fitting paradigm for any portion of the time-series data that is considered anomalous and a second model-fitting paradigm for any portion of the time-series data that is considered normal,
wherein the first model-fitting paradigm involves using an absolute value (L1) measure to represent an error value metric, and wherein the second model-fitting paradigm involves using a squared-term (L2) measure to represent the error value metric, the error value metric corresponding to a difference between a point in the collected time-series data and a corresponding model point; and
an output module configured to provide the output result.
Specification