Apparatus and method for event correlation and problem reporting
Abstract
An apparatus and method is provided for efficiently determining the source of problems in a complex system based on observable events. The problem identification process is split into two separate activities of (1) generating efficient codes for problem identification and (2) decoding the problems at runtime. Various embodiments of the invention contemplate creating a causality matrix which relates observable symptoms to likely problems in the system, reducing the causality matrix into a minimal codebook by eliminating redundant or unnecessary information, monitoring the observable symptoms, and decoding problems by comparing the observable symptoms against the minimal codebook using various best-fit approaches. The minimal codebook also identifies those observable symptoms for which the greatest benefit will be gained if they were monitored as compared to others. By defining a distance measure between symptoms and codes in the codebook, the invention can tolerate a loss of symptoms or spurious symptoms without failure. Changing the radius of the codebook allows the ambiguity of problem identification to be adjusted easily. The invention also allows probabilistic and temporal correlations to be monitored.
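The decoding step described in the abstract can be illustrated with a minimal sketch. It assumes binary symptom values and uses Hamming distance as the distance measure; the abstract only says that a distance measure is defined, so that choice, the example codebook, and the names `CODEBOOK`, `hamming`, and `decode` are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical codebook: each likely problem maps to a 0/1 code over five
# monitored symptoms. Any two codes differ in at least three positions.
CODEBOOK: Dict[str, Tuple[int, ...]] = {
    "p1": (1, 1, 0, 0, 1),
    "p2": (0, 1, 1, 1, 0),
    "p3": (1, 0, 1, 0, 0),
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def decode(observed: Sequence[int],
           codebook: Dict[str, Tuple[int, ...]]) -> Tuple[str, int]:
    """Return the problem whose code has the smallest mismatch, and that mismatch.

    Ties are broken by codebook insertion order (an arbitrary choice here).
    """
    best = min(codebook, key=lambda problem: hamming(codebook[problem], observed))
    return best, hamming(codebook[best], observed)


if __name__ == "__main__":
    # The last symptom of p1 was lost; p1 is still the closest code.
    print(decode((1, 1, 0, 0, 0), CODEBOOK))
    # A spurious fourth symptom was observed; p1 is still the closest code.
    print(decode((1, 1, 0, 1, 1), CODEBOOK))
```

Because the codes in this example are at least three positions apart, a single lost or spurious symptom still decodes to the intended problem, which is the tolerance property the abstract attributes to the distance measure.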
80 Claims
1. A method for detecting problems in a system which generates a plurality of symptoms, the method comprising the steps of:
(1) providing a computer-accessible codebook comprising a matrix of values each corresponding to a mapping between one of said plurality of symptoms and one of a plurality of likely problems in said system;
(2) monitoring a plurality of symptom data values representing said plurality of symptoms generated by said system over time;
(3) determining a mismatch measure between each of a plurality of groups of said values in said codebook and said plurality of symptom data values through the use of a computer, and selecting one of said plurality of likely problems corresponding to one of said plurality of groups having the smallest mismatch measure; and
(4) generating a report comprising said one selected likely problem from said codebook.
Dependent claims: 2-21

22. A method for detecting problems in a system which generates a plurality of symptoms, the method comprising the steps of:
(1) generating a causality matrix comprising a first matrix of values each corresponding to a mapping between one of said plurality of symptoms and one of a plurality of likely problems in said system;
(2) reducing said causality matrix into a codebook comprising a second matrix of values fewer in number than said first matrix of values by eliminating duplicative sets of values from said first matrix of values;
(3) monitoring a plurality of symptom data values representing said plurality of symptoms generated by said system over time;
(4) determining a mismatch measure between each of a plurality of groups of said values in said codebook and said plurality of symptom data values through the use of a computer, and selecting one of said plurality of likely problems corresponding to one of said plurality of groups having the smallest mismatch measure; and
(5) reporting said one selected likely problem from said codebook.
Dependent claims: 23-28

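As an illustration of the reduction step recited in claim 22, the sketch below removes duplicative sets of values from a small causality matrix. It reads "duplicative sets of values" as symptom rows that are identical across all problems, which is only one possible interpretation; the matrix layout and the name `drop_duplicate_symptoms` are hypothetical.

```python
from typing import Dict, Tuple


def drop_duplicate_symptoms(
    matrix: Dict[str, Tuple[int, ...]]
) -> Dict[str, Tuple[int, ...]]:
    """Keep the first symptom for each distinct row of problem indicators.

    `matrix[symptom]` is a tuple with one 0/1 entry per likely problem.
    Two symptoms with identical rows distinguish exactly the same problems,
    so monitoring both adds no information under this simple model.
    """
    seen = set()
    reduced: Dict[str, Tuple[int, ...]] = {}
    for symptom, row in matrix.items():
        if row not in seen:
            seen.add(row)
            reduced[symptom] = row
    return reduced


if __name__ == "__main__":
    causality = {
        "s1": (1, 0, 1),
        "s2": (1, 0, 1),  # same row as s1, so it is dropped
        "s3": (0, 1, 1),
    }
    print(drop_duplicate_symptoms(causality))
```

The result has fewer values than the input, matching the claim's requirement that the second matrix be smaller than the first.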
29. A method of generating a codebook for use in a process of detecting problems in a system which generates a plurality of symptoms, the method comprising the steps of:
(1) preparing a causality matrix comprising a matrix of values each corresponding to a mapping between one of said plurality of symptoms and one of a plurality of likely problems in said system;
(2) making said causality matrix well-formed by deleting redundant sets of values from said matrix of values;
(3) selecting a desired degree of distinction between groups of said plurality of symptoms, each group corresponding to a different likely problem;
(4) generating, through the use of a computer, an optimal codebook from said well-formed causality matrix by selecting minimal groups of symptoms from said well-formed causality matrix such that selected groups of symptoms corresponding to any two likely problems satisfy the desired degree of distinction; and
(5) storing said optimal codebook in a computer storage device.
Dependent claims: 30-34

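The selection step of claim 29 can be sketched as a search for the smallest symptom subset whose codes keep every pair of problems at least a chosen distance apart. The sketch assumes binary values and Hamming distance as the "degree of distinction", and it uses exhaustive search, which is practical only for small matrices; none of this is stated in the claim, and the name `smallest_codebook` is hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional, Tuple


def smallest_codebook(
    matrix: Dict[str, Dict[str, int]], d: int
) -> Optional[Tuple[str, ...]]:
    """Return a smallest set of symptoms that separates all problems by >= d.

    `matrix[problem][symptom]` is 1 if the problem causes the symptom, else 0.
    Subsets are tried in order of increasing size, so the first hit is minimal.
    Returns None if even the full symptom set cannot reach distance d.
    """
    problems = list(matrix)
    symptoms = sorted({s for row in matrix.values() for s in row})
    for size in range(1, len(symptoms) + 1):
        for subset in combinations(symptoms, size):
            codes = [tuple(matrix[p].get(s, 0) for s in subset) for p in problems]
            if all(
                sum(x != y for x, y in zip(a, b)) >= d
                for a, b in combinations(codes, 2)
            ):
                return subset
    return None


if __name__ == "__main__":
    causality = {
        "p1": {"s1": 1, "s2": 0, "s3": 1, "s4": 0},
        "p2": {"s1": 0, "s2": 1, "s3": 1, "s4": 0},
        "p3": {"s1": 1, "s2": 1, "s3": 0, "s4": 0},
    }
    print(smallest_codebook(causality, 1))
    print(smallest_codebook(causality, 2))
```

Raising `d` trades a larger codebook for more tolerance to lost or spurious symptoms; a greedy or heuristic search would be needed for matrices of realistic size.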
35. A method of generating a codebook for use in a process of detecting problems in a system which generates a plurality of symptoms, the method comprising steps of:
(1) preparing a causality graph comprising a plurality of nodes each corresponding to a problem or a symptom, and a plurality of directed edges each pointing from one of the plurality of nodes to another of the plurality of nodes, and corresponding to a causal relation between two or more of said plurality of nodes;
(2) making said causality graph well-formed by deleting redundant nodes;
(3) selecting a desired degree of distinction between groups of said plurality of symptoms, each group corresponding to a different likely problem;
(4) generating, through the use of a computer, an optimal codebook from said well-formed causality graph by selecting a minimal set of symptom nodes such that the minimal set of symptom nodes caused by any two problem nodes satisfy the desired degree of distinction; and
(5) storing said optimal codebook in a computer storage device.
Dependent claims: 36-39

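To make the causality graph of claim 35 concrete, the sketch below derives, for each problem node, the set of symptom nodes it can cause by following directed edges. Treating "caused by" as graph reachability is an assumption, and the adjacency-list representation and the name `symptoms_caused_by` are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def symptoms_caused_by(
    graph: Dict[str, List[str]],
    problems: Iterable[str],
    symptoms: Set[str],
) -> Dict[str, Set[str]]:
    """Map each problem to the symptom nodes reachable along causal edges.

    `graph[node]` lists the nodes that `node` directly causes. A visited set
    keeps the traversal finite even if the graph contains cycles.
    """
    result: Dict[str, Set[str]] = {}
    for problem in problems:
        seen: Set[str] = set()
        stack = [problem]
        while stack:
            node = stack.pop()
            for nxt in graph.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        result[problem] = seen & symptoms
    return result


if __name__ == "__main__":
    edges = {"p1": ["s1"], "s1": ["s2"], "p2": ["s2"], "s2": ["s3"]}
    print(symptoms_caused_by(edges, ["p1", "p2"], {"s1", "s2", "s3"}))
```

Each resulting set is one row of a causality matrix, so the graph form and the matrix form of the claims describe the same information under this reading.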
40. Apparatus for detecting problems in a system which generates a plurality of symptoms, the apparatus comprising:
a storage device for storing a codebook comprising a matrix of values each corresponding to a mapping between one of said plurality of symptoms and one of a plurality of likely problems in said system;
monitoring means for monitoring a plurality of symptom data values representing said plurality of symptoms generated by said system over time;
means for determining a mismatch measure between each of a plurality of groups of said values in said codebook and said plurality of symptom data values, and selecting one of said plurality of likely problems corresponding to one of said plurality of groups having the smallest mismatch measure; and
generating means for generating a report comprising said one selected likely problem.
Dependent claims: 41-60

61. Apparatus for detecting problems in a system which generates a plurality of symptoms, the apparatus comprising:
generating means for generating a causality matrix comprising a first matrix of values each corresponding to a mapping between one of said plurality of symptoms and one of a plurality of likely problems in said system;
reducing means for reducing said causality matrix into a computer-accessible codebook comprising a second matrix of values fewer in number than said first matrix of values by eliminating duplicative sets of values from said first matrix;
monitoring means for monitoring, through the use of a computer, a plurality of symptom data values representing said plurality of symptoms generated by said system over time;
means for determining a mismatch measure between each of a plurality of groups of said values in said codebook and said plurality of symptom data values, and selecting one of said plurality of likely problems corresponding to one of said plurality of groups having the smallest mismatch measure; and
a report generator for reporting said one selected likely problem.
Dependent claims: 62-67

68. Apparatus for generating a codebook for use in detecting problems in a system which generates a plurality of symptoms, the apparatus comprising:
preparing means for preparing a causality matrix comprising a matrix of values each corresponding to a mapping between one of said plurality of symptoms and one of a plurality of likely problems in said system;
means for making said causality matrix well-formed by deleting redundant sets of values from said matrix of values;
inputting means for inputting a desired degree of distinction between groups of said plurality of symptoms, each group corresponding to a different likely problem;
generating means for generating a computer-accessible optimal codebook from said well-formed causality matrix by selecting minimal groups of symptoms from said well-formed causality matrix such that selected groups of symptoms corresponding to any two likely problems satisfy the desired degree of distinction; and
a storage device for storing said computer-accessible optimal codebook.
Dependent claims: 69-73

74. Apparatus for generating a codebook for use in detecting problems in a system which generates a plurality of symptoms, the apparatus comprising:
preparing means for preparing a causality graph comprising a plurality of nodes each corresponding to a problem or a symptom, and a plurality of directed edges each pointing from one of the plurality of nodes to another of the plurality of nodes and corresponding to a causal relation between two or more of said plurality of nodes;
means for making said causality graph well-formed by deleting redundant nodes;
specifying means for specifying a desired degree of distinction between groups of said plurality of symptoms, each group corresponding to a different likely problem;
generating means for generating, through the use of a computer, an optimal codebook from said well-formed causality graph by selecting a minimal group of symptom nodes such that selected groups satisfy the degree of distinction; and
a computer storage device for storing said optimal codebook.
Dependent claims: 75-79

80. A method of preparing a data structure for use in detecting or isolating problems in a system having a plurality of components of various classes, said system generating a plurality of observable events, the method comprising the steps of:
(1) preparing compilable statements which define, for each class of component in said system,
    (a) the relationships the component can participate in with respect to other classes of components;
    (b) problems associated with the component and events caused therefrom;
    (c) a causal propagation of the problems and the events to other components along the relationships;
(2) preparing a configuration specification which defines component instances in the system, their classes and their relationships to other components;
(3) translating, through the use of a computer, the compilable statements and the configuration specification into the data structure by determining a causality closure of the events in the system; and
(4) storing the data structure in a computer storage device.

Specification