Method and System for Detecting Anomalies in Time Series Data
Abstract
A server system stores time series data for a data source. The time series data comprises a plurality of time-value pairs, each pair including a value associated with an attribute of the data source and a time. For a particular attribute, the server system generates a plurality of forecasting models for characterizing the time-value pairs, each model including an estimated attribute value and an associated error-variance. For a time-value pair, the server system determines a plurality of differences between the value of the time-value pair and respective estimated attribute values of the plurality of forecasting models and tags the time-value pair as an anomaly if the differences for at least a first subset of the forecasting models are greater than the corresponding error variances. In response to a request from a client application, the server system returns at least a subset of the time-value pairs tagged as anomalies.
32 Claims
1. A computer-implemented method for identifying anomalies in time series data, comprising:
  storing in a database time series data for a data source, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one or more attributes associated with the data source and a time associated with the value;
  for a particular attribute, generating a plurality of forecasting models for characterizing the time-value pairs in a respective subset of the time series data, each forecasting model including an estimated attribute value and an associated error-variance;
  for a respective time-value pair associated with the particular attribute:
    determining a plurality of differences between the value of the time-value pair and respective estimated attribute values of the plurality of forecasting models; and
    tagging the time-value pair as an anomaly if the differences for at least a first subset of the forecasting models are greater than the corresponding error variances; and
  in response to a request from a client application for analytics information for the data source, reporting to the client application at least a subset of the time-value pairs tagged as anomalies for one or more of the attributes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8

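The tagging step of claim 1 can be illustrated with a minimal Python sketch. The claim does not specify the type of forecasting model, so the sketch assumes the simplest possible one: each model is the mean and population variance of a trailing window of the series, with a different window length per model. The names `ForecastModel`, `fit_models`, `is_anomaly`, `windows`, and `min_votes` are hypothetical and do not appear in the patent; `min_votes` stands in for the size of the "first subset" of models.

```python
from dataclasses import dataclass
from statistics import mean, pvariance
from typing import List, Sequence, Tuple

TimeValue = Tuple[int, float]  # (time, value) pair for one attribute


@dataclass
class ForecastModel:
    estimate: float        # estimated attribute value
    error_variance: float  # associated error variance


def fit_models(series: Sequence[TimeValue], windows: Sequence[int]) -> List[ForecastModel]:
    """Fit one model per window; each window is a different trailing subset.

    Assumption: a moving-window mean with population variance stands in
    for whatever forecasting model an implementation would really use.
    """
    models = []
    for w in windows:
        values = [v for _, v in series[-w:]]
        models.append(ForecastModel(mean(values), pvariance(values)))
    return models


def is_anomaly(value: float, models: Sequence[ForecastModel], min_votes: int) -> bool:
    """Tag the value when its difference from the estimate exceeds the
    error variance for at least `min_votes` of the models."""
    votes = sum(1 for m in models if abs(value - m.estimate) > m.error_variance)
    return votes >= min_votes


# Illustrative data only.
history = list(enumerate([10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0]))
models = fit_models(history, windows=[4, 8])
spike_tagged = is_anomaly(25.0, models, min_votes=2)
typical_tagged = is_anomaly(10.5, models, min_votes=2)
```

The comparison follows the claim wording literally (difference against error variance). An implementation might instead compare against a multiple of the standard deviation, but the claim text does not say so.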
9. A server system for identifying anomalies in time series data, comprising:
  one or more processors for executing programs; and
  memory to store data and to store one or more programs to be executed by the one or more processors, the one or more programs including instructions for:
    storing in a database time series data for a data source, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one or more attributes associated with the data source and a time associated with the value;
    for a particular attribute, generating a plurality of forecasting models for characterizing the time-value pairs in a respective subset of the time series data, each forecasting model including an estimated attribute value and an associated error-variance;
    for a respective time-value pair associated with the particular attribute:
      determining a plurality of differences between the value of the time-value pair and respective estimated attribute values of the plurality of forecasting models; and
      tagging the time-value pair as an anomaly if the differences for at least a first subset of the forecasting models are greater than the corresponding error variances; and
    in response to a request from a client application for analytics information for the data source, reporting to the client application at least a subset of the time-value pairs tagged as anomalies for one or more of the attributes.
Dependent claims: 10, 11, 12, 13, 14, 15

16. A computer-readable storage medium storing one or more programs for execution by one or more processors of a server system for identifying anomalies in time series data, the one or more programs comprising instructions for:
  storing in a database time series data for a data source, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one or more attributes associated with the data source and a time associated with the value;
  for a particular attribute, generating a plurality of forecasting models for characterizing the time-value pairs in a respective subset of the time series data, each forecasting model including an estimated attribute value and an associated error-variance;
  for a respective time-value pair associated with the particular attribute:
    determining a plurality of differences between the value of the time-value pair and respective estimated attribute values of the plurality of forecasting models; and
    tagging the time-value pair as an anomaly if the differences for at least a first subset of the forecasting models are greater than the corresponding error variances; and
  in response to a request from a client application for analytics information for the data source, reporting to the client application at least a subset of the time-value pairs tagged as anomalies for one or more of the attributes.
Dependent claims: 17, 18, 19, 20, 21, 22

23. A computer system for detecting events of interest in time series data, comprising:
  one or more processors for executing programs; and
  memory to store data and to store one or more programs to be executed by the one or more processors, the one or more programs including:
    a time series data collection module configured to collect time series data at one or more predefined time intervals from a plurality of data sources, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one attribute associated with the data sources and a time during which the value was collected;
    a time series storage module configured to store in the memory the collected time series data such that a time-value pair is added to the stored time series data for a respective collection of time series data without disturbing the stored time series data for the respective collection;
    an event detection module configured to determine whether a new time-value pair is an event of interest with reference to its associated collection of time series data, including one or more sub-modules for:
      generating a plurality of forecasting models for characterizing different subsets of the collection of time series data, each forecasting model including an estimated attribute value and an associated error-variance;
      determining whether the new time-value pair is within a scope defined by the estimated attribute value and the error-variance for each of the plurality of forecasting models; and
      tagging the new time-value pair if the new time-value pair is outside the respective scopes for at least a first subset of the forecasting models; and
    an event storage module configured to store the tagged time-value pairs such that the tagged time-value pairs are ready to be served in response to a request for events of interest from a client application.
Dependent claims: 24, 25, 26

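The event detection and event storage modules of claim 23 can be sketched in Python as follows. The claim says only that the scope is "defined by the estimated attribute value and the error-variance", so the sketch assumes the interval from estimate minus error variance to estimate plus error variance. The names `Model`, `EventStore`, `detect`, and `min_out_of_scope` are hypothetical, and the model parameters are illustrative values rather than fitted ones.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

TimeValue = Tuple[int, float]  # (time, value)


@dataclass(frozen=True)
class Model:
    estimate: float
    error_variance: float

    def in_scope(self, value: float) -> bool:
        # Assumed scope: [estimate - error_variance, estimate + error_variance].
        return abs(value - self.estimate) <= self.error_variance


@dataclass
class EventStore:
    """Holds tagged pairs per collection so they are ready to be served."""
    events: Dict[str, List[TimeValue]] = field(default_factory=dict)

    def add(self, collection: str, pair: TimeValue) -> None:
        self.events.setdefault(collection, []).append(pair)

    def serve(self, collection: str) -> List[TimeValue]:
        # Return a copy so callers cannot mutate the stored events.
        return list(self.events.get(collection, []))


def detect(collection: str, pair: TimeValue, models: Sequence[Model],
           store: EventStore, min_out_of_scope: int) -> bool:
    """Tag and store the pair when it falls outside the scope of at least
    `min_out_of_scope` models."""
    outside = sum(1 for m in models if not m.in_scope(pair[1]))
    tagged = outside >= min_out_of_scope
    if tagged:
        store.add(collection, pair)
    return tagged


# Illustrative models for one collection; in practice each would be fitted
# to a different subset of that collection's history.
models = [Model(100.0, 5.0), Model(102.0, 4.0), Model(98.0, 6.0)]
store = EventStore()
normal = detect("cpu_load", (1, 101.0), models, store, min_out_of_scope=2)
event = detect("cpu_load", (2, 120.0), models, store, min_out_of_scope=2)
served = store.serve("cpu_load")
```

Keeping detection separate from the event store mirrors the module split in the claim: tagged pairs are written once at detection time, and a client request only reads them.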
27. A computer-implemented method for detecting events of interest in time series data, comprising:
  collecting time series data at one or more predefined time intervals from a plurality of data sources, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one attribute associated with the data sources and a time during which the value was collected;
  storing in memory the collected time series data such that a time-value pair is added to the stored time series data for a respective collection of time series data without disturbing the stored time series data for the respective collection;
  determining whether a new time-value pair is an event of interest with reference to its associated collection of time series data, further including:
    generating a plurality of forecasting models characterizing different subsets of the associated collection of time series data, each forecasting model including an estimated attribute value and an associated error-variance;
    determining whether the particular new time-value pair is within the associated error-variance for each of the plurality of forecasting models; and
    tagging the particular time-value pair as an anomaly when the value of the particular time-value pair is outside the error-variance for at least a first subset of the forecasting models; and
  storing the time-value pairs tagged as anomalies such that the stored time-value pairs are ready to be served to a user at a client application in response to a user request for the anomalies.
Dependent claims: 28, 29, 30

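The collecting and storing steps of claim 27 can be sketched as below. This is a simplified illustration, not the patented implementation: data sources are modeled as plain functions of a simulated clock, one collection is kept per source, and "without disturbing the stored time series data" is interpreted as append-only writes. The names `TimeSeriesStore`, `collect`, and the sample sources are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

TimeValue = Tuple[int, float]  # (collection time, value)


class TimeSeriesStore:
    """Append-only store: existing pairs are never rewritten or reordered."""

    def __init__(self) -> None:
        self._collections: Dict[str, List[TimeValue]] = {}

    def append(self, name: str, pair: TimeValue) -> None:
        self._collections.setdefault(name, []).append(pair)

    def read(self, name: str) -> Tuple[TimeValue, ...]:
        # An immutable snapshot, so readers cannot alter stored data.
        return tuple(self._collections.get(name, []))


def collect(sources: Dict[str, Callable[[int], float]], store: TimeSeriesStore,
            start: int, stop: int, interval: int) -> None:
    """Sample every source once per interval on a simulated clock."""
    for t in range(start, stop, interval):
        for name, read_value in sources.items():
            store.append(name, (t, read_value(t)))


# Two made-up sources, sampled every 10 time units.
sources = {
    "pageviews": lambda t: float(t * 2),
    "errors": lambda t: 1.0,
}
store = TimeSeriesStore()
collect(sources, store, start=0, stop=30, interval=10)

before = store.read("pageviews")
store.append("pageviews", (30, 60.0))  # a new pair arrives later
after = store.read("pageviews")
```

After the later append, the first three entries of `after` are the same pairs as `before`, which is the property the storing step describes.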
31. A computer-readable storage medium storing one or more programs for execution by one or more processors of a server system for identifying anomalies in time series data, the one or more programs comprising instructions for:
  collecting time series data at one or more predefined time intervals from a plurality of data sources, wherein the time series data comprises a plurality of time-value pairs, each pair including a value of one attribute associated with the data sources and a time when the value was collected;
  storing in a computer memory the collected time series data such that, when a new time-value pair is collected by the time series data collector, the new time-value pair is added to the stored time series data for a respective collection of time series data without disturbing the previously stored time series data for the respective collection;
  determining for a particular new time-value pair whether the particular new time-value pair is an anomaly with reference to its associated collection of time series data, including:
    generating a plurality of forecasting models characterizing different subsets of the associated collection of time series data, each forecasting model including an estimated attribute value and an associated error-variance;
    determining whether the particular new time-value pair is within the associated error-variance for each of the plurality of forecasting models; and
    tagging the particular time-value pair as an anomaly when the value of the particular time-value pair is outside the error-variance for at least a first subset of the forecasting models; and
  storing the time-value pairs tagged as anomalies such that the stored time-value pairs are ready to be served to a user at a client application in response to a user request for the anomalies.
Dependent claims: 32

Specification