System and method for statistical application-agnostic fault detection
First Claim
1. A system for providing statistical fault detection for multi-process applications, the system comprising:
one or more memory locations configured to store said applications executing on a host with a host operating system;
one or more interceptors configured to intercept calls to the host operating system and shared libraries, and configured to generate one or more statistical events based on said intercepted calls;
a statistical fault detector configured to calculate one or more distributions for said one or more statistical events;
one or more additional memory locations configured to store the one or more statistical distributions for each one or more statistical events;
wherein fault detection for said applications is performed by detection of statistically significant deviation of recent events from the corresponding one or more distributions;
wherein the said one or more distributions is one or more of Raw Event Distribution, Temporal Event Distribution, or Spatial Event Distribution;
wherein said Temporal Event Distribution is comprised of a subset of events from the corresponding interceptor subject to a temporal filter; and
wherein said Spatial Event Distribution is comprised of a subset of events from the corresponding interceptor subject to a spatial filter.
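The claim elements above can be illustrated with a minimal sketch. It is not the patented implementation: the `Event` fields, the reading of a "spatial filter" as selection by event source, the use of mean and standard deviation as the stored distribution, and the z-score threshold are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Event:
    """One statistical event from a hypothetical interceptor."""
    timestamp: float  # seconds since observation started
    source: str       # assumed: process or host that produced the event
    value: float      # measured quantity, e.g. call duration in ms


@dataclass(frozen=True)
class Distribution:
    """Summary statistics stored for one event stream (assumed form)."""
    count: int
    mean: float
    stddev: float


def build_distribution(events: Iterable[Event],
                       keep: Callable[[Event], bool] = lambda e: True) -> Distribution:
    """Raw distribution by default; a filter yields a temporal or spatial one."""
    values = [e.value for e in events if keep(e)]
    n = len(values)
    if n == 0:
        return Distribution(0, 0.0, 0.0)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return Distribution(n, mean, math.sqrt(variance))


def temporal_filter(start: float, end: float) -> Callable[[Event], bool]:
    """Keep events whose timestamp falls in [start, end)."""
    return lambda e: start <= e.timestamp < end


def spatial_filter(source: str) -> Callable[[Event], bool]:
    """Keep events from one source (an assumed meaning of 'spatial')."""
    return lambda e: e.source == source


def is_deviation(recent: List[Event], dist: Distribution,
                 threshold: float = 3.0) -> bool:
    """Flag a fault when the recent mean lies far from the stored distribution."""
    if dist.count == 0 or not recent:
        return False
    recent_mean = sum(e.value for e in recent) / len(recent)
    if dist.stddev == 0:
        return recent_mean != dist.mean
    # z-score of the recent sample mean against the historical distribution
    z = abs(recent_mean - dist.mean) / (dist.stddev / math.sqrt(len(recent)))
    return z > threshold
```

In this sketch, one event stream can feed several distributions at once: `build_distribution(history)` for the raw case, and the same call with `temporal_filter(...)` or `spatial_filter(...)` for the filtered cases. Recent events are then tested against whichever distribution applies.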
Abstract
A system, method, and computer readable medium for statistical application-agnostic fault detection of multi-process applications. The computer readable medium includes computer-executable instructions for execution by a processing system. A multi-process application runs on a host. Interceptors collect statistical events and send said events to a statistical fault detector. The statistical fault detector creates one or more distributions, compares recent statistical event data to historical statistical event data, and uses deviation from the historical norm for fault detection. The present invention detects faults both within the application and within the environment wherein the application executes, if conditions within the environment cause impaired application performance. The invention also teaches consensus fault detection and elimination of cascading fault notifications based on a hierarchy of events and event groups. Interception and fault detection are transparent to the application, operating system, networking stack and libraries.
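As a rough analogy for the interception step described in the abstract, the sketch below wraps a function so that each call produces a statistical event without any change to the function body. The patent describes interception of operating system and shared library calls. The Python decorator, the `EventCollector` class, and the event fields shown here are hypothetical stand-ins and are not taken from the specification.

```python
import functools
import time
from typing import Any, Callable, Dict, List


class EventCollector:
    """Hypothetical intake for events sent to a statistical fault detector."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    def record(self, name: str, duration_s: float, failed: bool) -> None:
        self.events.append(
            {"call": name, "duration_s": duration_s, "failed": failed}
        )


def intercept(collector: EventCollector) -> Callable:
    """Return a decorator that reports one event per call to the collector."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            failed = False
            try:
                # The wrapped call runs unchanged; the caller sees the
                # same return value or the same exception.
                return func(*args, **kwargs)
            except Exception:
                failed = True
                raise
            finally:
                collector.record(
                    func.__name__, time.perf_counter() - start, failed
                )

        return wrapper

    return decorator
```

The wrapped function keeps its name, arguments, and return value, which mirrors the transparency the abstract describes: the collector only observes duration and failure status.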
20 Claims
Specification