Method and system for detecting anomalies in time series data
Abstract
A server system stores time series data for a data source. The time series data comprises a plurality of time-value pairs, each pair including a value associated with an attribute of the data source and a time. For a particular attribute, the server system generates a plurality of forecasting models for characterizing the time-value pairs, each model including an estimated attribute value and an associated error-variance. For a time-value pair, the server system determines a plurality of differences between the value of the time-value pair and respective estimated attribute values of the plurality of forecasting models and tags the time-value pair as an anomaly if the differences for at least a first subset of the forecasting models are greater than the corresponding error variances. In response to a request from a client application, the server system returns at least a subset of the time-value pairs tagged as anomalies.
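The tagging rule described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Forecast` structure, the `is_anomaly` function, and the `min_models` parameter are hypothetical names, the difference is assumed to be an absolute difference, and "at least a first subset" is assumed to mean a minimum count of models.

```python
from dataclasses import dataclass


@dataclass
class Forecast:
    """Output of one forecasting model for a single time (hypothetical structure)."""
    estimate: float        # estimated attribute value
    error_variance: float  # error-variance associated with the estimate


def is_anomaly(value, forecasts, min_models):
    """Return True when at least `min_models` forecasts miss `value`
    by more than their own error-variance (assumed reading of the abstract)."""
    exceeding = sum(
        1 for f in forecasts if abs(value - f.estimate) > f.error_variance
    )
    return exceeding >= min_models


# Three illustrative models for the same attribute and time.
forecasts = [Forecast(10.0, 1.0), Forecast(11.0, 2.0), Forecast(9.5, 0.5)]

print(is_anomaly(10.4, forecasts, min_models=2))  # only one model is exceeded
print(is_anomaly(15.0, forecasts, min_models=2))  # all three models are exceeded
```

Requiring agreement among several models, rather than a single one, is what lets one poorly fitting model be outvoted by the others.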
20 Claims
1. A computer-implemented method for identifying significant events in time series data, the method comprising:
storing in a database time series data for a data source, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one or more attributes associated with the data source and a time associated with the value;
for a particular attribute, generating a plurality of forecasting models for characterizing the time-value pairs, each forecasting model including an estimated attribute value and a corresponding error-variance; and
for a time-value pair associated with the particular attribute:
determining a plurality of differences between the value of the time-value pair and the attribute values estimated by the plurality of forecasting models;
determining a significance factor such that each of the plurality of differences for at least a subset of the forecasting models is smaller than the corresponding error-variance multiplied by the significance factor; and
identifying the time-value pair as a significant event in response to a determination that the significance factor exceeds a significance threshold for the particular attribute.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system for identifying significant events in time series data, the system comprising:
a processing circuit comprising one or more processors and one or more memory devices, wherein the processing circuit is configured to:
store in a database time series data for a data source, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one or more attributes associated with the data source and a time associated with the value;
for a particular attribute, generate a plurality of forecasting models for characterizing the time-value pairs, each forecasting model including an estimated attribute value and a corresponding error-variance; and
for a time-value pair associated with the particular attribute:
determine a plurality of differences between the value of the time-value pair and the attribute values estimated by the plurality of forecasting models;
determine a significance factor such that each of the plurality of differences for at least a subset of the forecasting models is smaller than the corresponding error-variance multiplied by the significance factor; and
identify the time-value pair as a significant event in response to a determination that the significance factor exceeds a significance threshold for the particular attribute.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable storage medium storing one or more programs for execution by one or more processors of a system for identifying significant events in time series data, the one or more programs comprising instructions for:
storing in a database time series data for a data source, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one or more attributes associated with the data source and a time associated with the value;
for a particular attribute, generating a plurality of forecasting models for characterizing the time-value pairs, each forecasting model including an estimated attribute value and a corresponding error-variance; and
for a time-value pair associated with the particular attribute:
determining a plurality of differences between the value of the time-value pair and the attribute values estimated by the plurality of forecasting models;
determining a significance factor such that each of the plurality of differences for at least a subset of the forecasting models is smaller than the corresponding error-variance multiplied by the significance factor; and
identifying the time-value pair as a significant event in response to a determination that the significance factor exceeds a significance threshold for the particular attribute.
Dependent claims: 16, 17, 18, 19, 20
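The significance-factor step recited in the independent claims can be read in more than one way; the sketch below shows one possible interpretation and is not taken from the specification. It assumes absolute differences, assumes the "subset" is a fixed number of models (`subset_size`, a hypothetical parameter), and takes the significance factor to be the boundary value at which that many models have a difference no larger than their error-variance times the factor. The function names are illustrative only.

```python
def significance_factor(value, estimates, error_variances, subset_size):
    """Boundary factor k such that `subset_size` models satisfy
    |value - estimate| <= error_variance * k (assumed interpretation).

    Any factor strictly above the returned value makes those differences
    strictly smaller than error_variance * factor, as the claim words it.
    """
    if not 1 <= subset_size <= len(estimates):
        raise ValueError("subset_size must be between 1 and the number of models")
    if any(var <= 0 for var in error_variances):
        raise ValueError("error-variances must be positive")
    # Per-model ratio of the difference to the model's error-variance.
    ratios = sorted(
        abs(value - est) / var for est, var in zip(estimates, error_variances)
    )
    # The subset_size-th smallest ratio bounds the best-fitting subset.
    return ratios[subset_size - 1]


def is_significant_event(value, estimates, error_variances, subset_size, threshold):
    """Flag the time-value pair when the factor exceeds the attribute's threshold."""
    factor = significance_factor(value, estimates, error_variances, subset_size)
    return factor > threshold


estimates = [10.0, 11.0, 9.5]
error_variances = [1.0, 2.0, 0.5]

print(significance_factor(15.0, estimates, error_variances, subset_size=2))
print(is_significant_event(15.0, estimates, error_variances, 2, threshold=3.0))
print(is_significant_event(10.4, estimates, error_variances, 2, threshold=3.0))
```

Under this reading, a value that most models predict well yields a small factor, while a value that even the best-fitting subset misses by a wide margin yields a large one, so a single per-attribute threshold controls how surprising a value must be before it is flagged.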
Specification