Fingerprinting event logs for system management troubleshooting
First Claim
1. A computer-implemented method for troubleshooting configuration errors, comprising:
obtaining a log of events which are generated as a program executes in a learning process;
analyzing events of the learning process to identify recurring event sequences, each recurring event sequence includes a time sequence of multiple events and occurs a number of times above a predefined threshold;
generating rules based on the recurring event sequences;
obtaining a log of events which are generated as the program executes in a detection process;
applying the rules to events of the detection process to determine if a sequence of events of the detection process violates at least one of the rules; and
determining that an error has been detected in the program, and reporting an alarm, when the sequence of the events of the detection process violates at least one of the rules.
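The learning and detection steps recited in this claim can be illustrated with a minimal sketch. It is not the patented implementation: it assumes fixed-length contiguous windows stand in for "recurring event sequences", and that a rule says "this prefix must be followed by one of the events seen after it during learning". The function names, the event names, and the `length` and `threshold` parameters are hypothetical.

```python
from collections import Counter

def find_recurring(events, length, threshold):
    """Return contiguous sequences of `length` events seen more than `threshold` times."""
    windows = (tuple(events[i:i + length]) for i in range(len(events) - length + 1))
    counts = Counter(windows)
    return {seq for seq, n in counts.items() if n > threshold}

def generate_rules(recurring):
    """Assumed rule form: map each sequence prefix to the events allowed to follow it."""
    rules = {}
    for seq in recurring:
        rules.setdefault(seq[:-1], set()).add(seq[-1])
    return rules

def detect_violations(events, rules, length):
    """Return (index, prefix, actual_event) wherever a known prefix is followed
    by an event that no rule allows."""
    violations = []
    k = length - 1  # prefix length
    for i in range(len(events) - k):
        prefix = tuple(events[i:i + k])
        actual = events[i + k]
        if prefix in rules and actual not in rules[prefix]:
            violations.append((i + k, prefix, actual))
    return violations

# Learning process: a hypothetical log in which open/read/close recurs.
learning_log = ["open", "read", "close"] * 5
rules = generate_rules(find_recurring(learning_log, length=3, threshold=2))

# Detection process: the last event breaks the learned pattern.
detection_log = ["open", "read", "close", "open", "read", "delete"]
alarms = detect_violations(detection_log, rules, length=3)
for index, prefix, actual in alarms:
    print(f"ALARM at event {index}: {prefix} followed by unexpected {actual!r}")
```

A non-empty `alarms` list corresponds to the claim's final step, in which an error is determined and an alarm is reported.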
Abstract
A technique for automatically detecting and correcting configuration errors in a computing system. In a learning process, recurring event sequences, including, e.g., registry access events, are identified from event logs, and corresponding rules are developed. In a detection process, the rules are applied to detected event sequences to identify violations and to recover from failures. Event sequences across multiple hosts can be analyzed. The recurring event sequences are identified efficiently by flattening a hierarchical sequence of the events, such as is obtained from the Sequitur algorithm. A trie is generated from the recurring event sequences, and edges of nodes of the trie are marked as rule edges or non-rule edges. A rule is formed from a set of nodes connected by rule edges. The rules can be updated as additional event sequences are analyzed. False positive suppression policies include a violation-consistency policy and an expected event disappearance policy.
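The trie step in the abstract can be sketched as follows. The abstract does not say how an edge is classified, so this sketch assumes an edge is a rule edge when the child event follows the parent event in at least a `min_confidence` fraction of the sequences passing through the parent. That criterion, the class and function names, and the sample events are all hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.count = 0       # number of sequences passing through this node
        self.children = {}   # event -> TrieNode

def build_trie(sequences):
    """Insert every recurring event sequence into a trie, counting traversals."""
    root = TrieNode()
    for seq in sequences:
        root.count += 1
        node = root
        for event in seq:
            node = node.children.setdefault(event, TrieNode())
            node.count += 1
    return root

def extract_rules(root, min_confidence=1.0):
    """Return maximal runs of two or more nodes connected by rule edges.

    Assumption: parent -> child is a rule edge when
    child.count / parent.count >= min_confidence.
    """
    rules = []

    def walk(node, path):
        extended = False
        for event, child in node.children.items():
            if child.count / node.count >= min_confidence:
                extended = True
                walk(child, path + [event])   # rule edge: extend the current run
            else:
                walk(child, [event])          # non-rule edge: start a new run
        if not extended and len(path) >= 2:
            rules.append(tuple(path))

    # Edges leaving the root are not classified; each first event starts a run.
    for event, child in root.children.items():
        walk(child, [event])
    return rules

sequences = [("open", "query", "close")] * 3 + [("open", "query", "set")]
trie = build_trie(sequences)
strict_rules = extract_rules(trie, min_confidence=1.0)
loose_rules = extract_rules(trie, min_confidence=0.7)
print(strict_rules)
print(loose_rules)
```

Under the strict setting only the edge from "open" to "query" qualifies, because "query" is followed by "close" in three of four sequences. Lowering the assumed confidence to 0.7 lets that edge join the rule as well.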
53 Citations
18 Claims
13. A computer-implemented method for troubleshooting configuration errors, comprising:
obtaining a log of events which are generated as a program executes, the log of events includes registry events regarding registry key accesses;
applying rules to the events of the log to determine if a sequence of events violates at least one of the rules;
when the sequence of events violates at least one of the rules, determining if the registry key involved in a violating event or an expected event has been one of: (a) deleted and (b) modified; and
restoring the registry key to an expected key if the registry key has been one of: (a) deleted and (b) modified, the rules identify the expected key.
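The recovery step of claim 13 can be sketched with an in-memory dictionary standing in for the registry, so that the example runs on any platform. It assumes that "restoring the registry key to an expected key" means writing back the value that the violated rule expects. The key paths, values, and function names are hypothetical, and no real registry API is called.

```python
def diagnose_key(registry, key, expected_value):
    """Classify a key as 'deleted', 'modified', or 'ok' relative to the expected value."""
    if key not in registry:
        return "deleted"
    if registry[key] != expected_value:
        return "modified"
    return "ok"

def restore_keys(registry, expected_by_rule):
    """Restore every deleted or modified key named by a violated rule.

    `expected_by_rule` maps a key path to the value the rule expects.
    Returns a dict of restored key -> what was wrong with it.
    """
    restored = {}
    for key, expected in expected_by_rule.items():
        status = diagnose_key(registry, key, expected)
        if status != "ok":
            registry[key] = expected   # write back the expected value
            restored[key] = status
    return restored

# Simulated registry state after a misconfiguration (hypothetical keys).
registry = {r"HKCU\Software\Example\Proxy": "bad-host"}
expected_by_rule = {
    r"HKCU\Software\Example\Proxy": "proxy.example",
    r"HKCU\Software\Example\Port": "8080",
}
restored = restore_keys(registry, expected_by_rule)
print(restored)
```

A real implementation would read and write the keys through the platform's registry interface; the dictionary only keeps the sketch self-contained.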
16. Computer storage media having computer-readable software embodied thereon for programming at least one processor to perform a method for anomaly detection, the method comprising:
obtaining a log of events which are generated in a learning process;
analyzing events of the learning process to identify recurring event sequences, each recurring event sequence includes a time sequence of multiple events;
generating rules based on the recurring event sequences;
obtaining a log of events which are generated in a detection process, the log of events of the learning process and the log of events of the detection process comprise registry events regarding registry key accesses;
applying the rules to events of the detection process to determine if a sequence of events of the detection process violates at least one of the rules;
determining that an error has been detected in the program, and reporting an alarm, when the sequence of the events of the detection process violates at least one of the rules;
when the sequence of the events of the detection process violates at least one of the rules, determining if the registry key involved in either a violating event or an expected event has been one of: (a) deleted and (b) modified; and
restoring the registry key of either the violating event or expected event to an expected key if the registry key has been one of: (a) deleted and (b) modified, the rules identify the expected key.
Specification