Anomaly detection and classification using telemetry data
Abstract
Historical telemetry data can be used to generate predictions for various classes of data at various aggregates of a system that implements an online service. An anomaly detection process can then be utilized to detect anomalies for a class of data at a selected aggregate. An example anomaly detection process includes receiving telemetry data originating from a plurality of client devices, selecting a class of data from the telemetry data, converting the class of data to a set of metrics, aggregating the set of metrics according to a component of interest to obtain values of aggregated metrics over time for the component of interest, determining a prediction error by comparing the values of the aggregated metrics to a prediction, detecting an anomaly based at least in part on the prediction error, and transmitting an alert message of the anomaly to a receiving entity.
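The steps of the example process can be sketched in a few lines of Python. This is a minimal illustration, not the claimed implementation: the event fields (`class`, `component`, `bucket`, `ok`), the success-rate metric, the use of the historical mean as the prediction, and the three-standard-deviation threshold are all assumptions chosen for the sketch, since the abstract does not fix any of them.

```python
from collections import defaultdict
from statistics import mean, pstdev


def to_metrics(events, data_class):
    """Convert one class of telemetry data to (component, bucket, value) metrics.

    Assumed event shape: {"class", "component", "bucket", "ok"} (hypothetical).
    """
    return [
        (e["component"], e["bucket"], 1.0 if e["ok"] else 0.0)
        for e in events
        if e["class"] == data_class
    ]


def aggregate(metrics, component):
    """Average the metric values per time bucket for one component of interest."""
    buckets = defaultdict(list)
    for comp, bucket, value in metrics:
        if comp == component:
            buckets[bucket].append(value)
    return {bucket: mean(values) for bucket, values in sorted(buckets.items())}


def detect_anomalies(aggregated, history, threshold=3.0):
    """Compare aggregated values to a prediction derived from historical values.

    Assumption: the prediction is the historical mean, and a bucket is anomalous
    when its prediction error exceeds `threshold` historical standard deviations.
    """
    prediction = mean(history)
    spread = pstdev(history) or 1e-9  # avoid a zero-width band
    anomalies = []
    for bucket, value in aggregated.items():
        error = value - prediction
        if abs(error) > threshold * spread:
            anomalies.append((bucket, value, error))
    return anomalies


def format_alert(component, anomaly):
    """Build the alert message that would be sent to a receiving entity."""
    bucket, value, error = anomaly
    return (f"Anomaly for {component} at bucket {bucket}: "
            f"value={value:.2f}, prediction error={error:+.2f}")
```

In use, a caller would pass the events for one class of data (for example, login attempts) through `to_metrics` and `aggregate`, then hand the result and the historical values for the same class and component to `detect_anomalies`. A production system would likely replace the historical mean with a seasonal or learned forecast.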
20 Claims
1. One or more computer-readable storage media storing instructions that, when executed by one or more processors, perform operations comprising:
receiving telemetry data originating from a plurality of client devices;
converting a class of data from the telemetry data to a set of metrics;
aggregating the set of metrics according to a component of interest to obtain values of aggregated metrics over time for the component of interest;
determining a prediction error by comparing the values of the aggregated metrics to a prediction, the prediction being based at least in part on historical telemetry data pertaining to the class of data and the component of interest;
detecting an anomaly based at least in part on the prediction error; and
transmitting an alert message of the anomaly.
Dependent claims: 2, 3, 4, 5
6. A computer-implemented method comprising:
receiving telemetry data originating from a plurality of client devices, wherein the telemetry data was generated in response to the plurality of client devices accessing an online service provided by a service provider;
generating a set of metrics based on the telemetry data;
aggregating the set of metrics according to a component that is used to implement the online service;
obtaining values of aggregated metrics over time for the component based at least in part on aggregating the set of metrics;
determining a prediction error by comparing the values of the aggregated metrics to a prediction, the prediction being based at least in part on historical telemetry data pertaining to the component;
detecting an anomaly based at least in part on the prediction error; and
transmitting an alert message of the anomaly.
Dependent claims: 7, 8, 9, 10, 11, 12
13. A system comprising:
one or more processors;
memory storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
receiving telemetry data originating from a plurality of client devices;
generating a set of metrics based on the telemetry data;
aggregating the set of metrics according to a component of interest to obtain values of aggregated metrics over time for the component of interest;
determining a prediction error by comparing the values of the aggregated metrics to a prediction, the prediction being based at least in part on historical telemetry data pertaining to the component of interest;
detecting an anomaly based at least in part on the prediction error; and
initiating an automatic recovery action with respect to the component of interest.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
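Claim 13 ends with an automatic recovery action instead of an alert. One way to picture that final step is a lookup from component to recovery routine, as in the sketch below. Every name here (`restart_component`, `reroute_traffic`, `RECOVERY_ACTIONS`, `initiate_recovery`) and the fixed error threshold are hypothetical; the claim does not say which actions exist or how one is selected.

```python
from typing import Callable, Dict, Optional

# A recovery action takes the component name and returns a status string.
RecoveryAction = Callable[[str], str]


def restart_component(component: str) -> str:
    """Hypothetical action: request a restart of the affected component."""
    return f"restart requested for {component}"


def reroute_traffic(component: str) -> str:
    """Hypothetical action: move traffic away from the affected component."""
    return f"traffic rerouted away from {component}"


# Assumed mapping from component of interest to its recovery routine.
RECOVERY_ACTIONS: Dict[str, RecoveryAction] = {
    "auth-server": restart_component,
    "mail-gateway": reroute_traffic,
}


def initiate_recovery(
    component: str,
    prediction_error: float,
    threshold: float,
    actions: Dict[str, RecoveryAction] = RECOVERY_ACTIONS,
) -> Optional[str]:
    """Run the component's recovery action when the prediction error is anomalous.

    Returns None when no anomaly is detected; components without a registered
    action fall back to a restart (an assumption of this sketch).
    """
    if abs(prediction_error) <= threshold:
        return None
    action = actions.get(component, restart_component)
    return action(component)
```

For example, `initiate_recovery("mail-gateway", -0.4, 0.1)` would invoke the reroute routine, while an error inside the threshold returns `None` and leaves the component untouched.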
Specification