Causal ladder mechanism for proactive problem determination, avoidance and recovery
Abstract
A plurality of causal ladders is assembled in advance from component system events taken from previous system failures. The ladders classify the various transitions the system goes through from one set of observed states to another in multiple stages representing issues of differing urgency, importance and need for remediation. These stages are used at runtime to determine the criticality of any abnormal system activity and to accurately predict the component failure prior to the system crashing. Each ladder comprises a plurality of elevated stages representing criticality of the problem. At runtime, the causal ladder engine correlates real-time events received from the system to stages of one or more pre-constructed causal ladders and identifies a probable problem (and/or the faulty component) from the corresponding causal ladder. The causal ladder engine also determines the stage of the problem from event occurrences. At each stage, a different potential solution is identified for the problem.
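The runtime correlation described in the abstract can be illustrated with a minimal sketch. Every name here (`Stage`, `CausalLadder`, `correlate`, the event identifiers and the remedies) is hypothetical and chosen only for illustration; the abstract does not prescribe a data structure, and this sketch assumes each stage simply records the set of event identifiers observed at that stage in earlier failures.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """One rung of a causal ladder (hypothetical structure)."""
    level: int              # higher level = more critical
    events: frozenset       # event identifiers seen at this stage in past failures
    remedies: tuple = ()    # candidate solutions for this stage


@dataclass
class CausalLadder:
    problem: str            # component problem the ladder represents
    stages: list = field(default_factory=list)

    def stage_for(self, event):
        """Return the stage whose recorded events include `event`, or None."""
        for stage in self.stages:
            if event in stage.events:
                return stage
        return None


def correlate(ladders, event):
    """Match a runtime event against every ladder.

    Returns (problem, stage) pairs, most critical stage first.
    """
    hits = []
    for ladder in ladders:
        stage = ladder.stage_for(event)
        if stage is not None:
            hits.append((ladder.problem, stage))
    return sorted(hits, key=lambda hit: hit[1].level, reverse=True)


# Illustrative ladder; the events and remedies are made up for this example.
disk_ladder = CausalLadder(
    problem="disk degradation",
    stages=[
        Stage(1, frozenset({"io_latency_warning"}), ("monitor",)),
        Stage(2, frozenset({"smart_reallocated_sectors"}), ("migrate data",)),
        Stage(3, frozenset({"io_error"}), ("fail over to replica",)),
    ],
)

matches = correlate([disk_ladder], "smart_reallocated_sectors")
problem, stage = matches[0]
print(problem, stage.level, stage.remedies)
```

In this sketch a single event maps to at most one stage per ladder; a fuller engine would presumably track event sequences over time, which the abstract leaves open.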
33 Claims
1-3. (canceled)
4. A method for proactive problem determination and component failure avoidance comprising:
assembling a plurality of causal ladders, each of the plurality of causal ladders representing a plurality of sequential stages of criticality of a component problem that resulted in a system failure, wherein assembling a plurality of causal ladders comprises:
    receiving a plurality of failure events that resulted in a previous failure of a component;
    identifying a component problem related to an occurrence of each of the plurality of failure events;
    correlating each of the plurality of failure events to a discrete sequential stage of criticality in a causal ladder representing a respective identified component problem; and
    assigning at least one remedy to each discrete sequential stage of criticality of the causal ladder representing the respective identified component problem;
for each of the plurality of sequential stages of criticality in each of the respective causal ladders, associating at least one remedy for a component problem;
identifying a potential component failure from one of the plurality of causal ladders representing a component problem relating to the potential component failure, wherein identifying a potential component failure from one of the plurality of causal ladders comprises:
    receiving a runtime system event;
    correlating the received runtime system event to the at least one of the plurality of failure events correlated to a discrete sequential stage of criticality in one of the plurality of causal ladders representing the identified component problem; and
    determining a criticality of the identified component problem based on the correlated discrete sequential stage of criticality in the causal ladder; and
determining a problem solution for the component problem based on the one of the plurality of causal ladders representing the component problem, wherein determining a problem solution for the component problem comprises:
    accessing policy associated with the correlated discrete sequential stage of criticality in the causal ladder;
    ranking the at least one remedy for the correlated discrete sequential stage of criticality in the causal ladder representing the identified component problem based on policy associated with the correlated discrete sequential stage; and
    selecting a highest ranking solution for the identified component problem.
Dependent claims: 5, 6, 7, 8, 9, 10, 21
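The final step of claim 4 (accessing policy for the correlated stage, ranking that stage's remedies, and selecting the highest-ranking one) can be sketched as follows. The claim does not say what a policy contains, so this example assumes, purely for illustration, that a policy is a pair of weights applied to hypothetical `cost` and `disruption` attributes of each remedy, with the lowest weighted penalty ranking highest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Remedy:
    name: str
    cost: float          # hypothetical attribute: relative cost of the remedy
    disruption: float    # hypothetical attribute: relative service disruption


def rank_remedies(remedies, policy):
    """Order remedies best-first; a lower weighted penalty ranks higher."""
    def penalty(remedy):
        return (policy["cost_weight"] * remedy.cost
                + policy["disruption_weight"] * remedy.disruption)
    return sorted(remedies, key=penalty)


def select_solution(remedies, policy):
    """Return the highest-ranking remedy, or None if the stage has none."""
    ranked = rank_remedies(remedies, policy)
    return ranked[0] if ranked else None


stage_remedies = [
    Remedy("restart service", cost=1.0, disruption=5.0),
    Remedy("add memory", cost=4.0, disruption=1.0),
]

# Assumed policies: one stage weights cost heavily, another weights disruption.
cost_sensitive_policy = {"cost_weight": 1.0, "disruption_weight": 0.2}
disruption_sensitive_policy = {"cost_weight": 0.2, "disruption_weight": 1.0}

print(select_solution(stage_remedies, cost_sensitive_policy).name)
print(select_solution(stage_remedies, disruption_sensitive_policy).name)
```

Because the policy is attached to the stage rather than to the remedy, the same remedy list can yield a different selection at a different stage, which is the behavior the claim describes.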
11-20. (canceled)
22. A method for proactive problem determination and component failure avoidance comprising:
assembling a plurality of causal ladders, each of the plurality of causal ladders representing a plurality of sequential stages of criticality of a component problem that resulted in a system failure, wherein assembling a plurality of causal ladders comprises:
    receiving a plurality of failure events that resulted in a previous failure of a component;
    identifying a component problem related to an occurrence of each of the plurality of failure events;
    correlating each of the plurality of failure events to a discrete sequential stage of criticality in a causal ladder representing a respective identified component problem; and
    assigning at least one remedy to each discrete sequential stage of criticality of the causal ladder representing the respective identified component problem;
for each of the plurality of sequential stages of criticality in each of the respective causal ladders, associating at least one remedy for a component problem;
identifying a potential component failure from one of the plurality of causal ladders representing a component problem relating to the potential component failure, wherein identifying a potential component failure from one of the plurality of causal ladders comprises:
    receiving a runtime system event;
    correlating the received runtime system event to the at least one of the plurality of failure events correlated to a discrete sequential stage of criticality in one of the plurality of causal ladders representing the identified component problem; and
    determining a criticality of the identified component problem based on the correlated discrete sequential stage of criticality in the causal ladder; and
determining a problem solution for the component problem based on the one of the plurality of causal ladders representing the component problem, wherein determining a problem solution for the component problem comprises:
    for each remedy assigned to the correlated discrete sequential stage of criticality of the causal ladder:
        identifying all remedies having an estimated repair time that is greater than an estimated time until failure; and
        excluding the identified remedies from ranking;
    accessing policy associated with the correlated discrete sequential stage of criticality in the causal ladder;
    ranking the at least one remedy for the correlated discrete sequential stage of criticality in the causal ladder representing the identified component problem based on policy associated with the correlated discrete sequential stage; and
    selecting a highest ranking solution for the identified component problem.
Dependent claims: 23, 24, 25, 26, 27
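What claim 22 adds over claim 4 is a feasibility filter: remedies that would take longer to apply than the component is expected to survive are excluded before ranking. A minimal sketch follows, assuming remedies are plain dictionaries with hypothetical `repair_time` and `policy_rank` fields, both times are in the same unit, and a lower `policy_rank` means a higher-ranking remedy.

```python
def feasible_remedies(remedies, time_until_failure):
    """Drop remedies whose estimated repair time exceeds the time until failure."""
    return [r for r in remedies if r["repair_time"] <= time_until_failure]


def select_remedy(remedies, time_until_failure, rank_key):
    """Filter first, then rank the survivors and return the best one (or None)."""
    candidates = feasible_remedies(remedies, time_until_failure)
    ranked = sorted(candidates, key=rank_key)
    return ranked[0] if ranked else None


# Illustrative data; names, times (minutes) and ranks are made up.
remedies = [
    {"name": "replace disk", "repair_time": 120, "policy_rank": 1},
    {"name": "fail over to replica", "repair_time": 5, "policy_rank": 2},
    {"name": "throttle writes", "repair_time": 1, "policy_rank": 3},
]

choice = select_remedy(
    remedies,
    time_until_failure=30,
    rank_key=lambda r: r["policy_rank"],
)
print(choice["name"])
```

Here the policy's first choice, "replace disk", is never ranked because it cannot complete within the 30 minutes assumed to remain, so the next-best feasible remedy is selected instead.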
28. A method for proactive problem determination and component failure avoidance comprising:
assembling a plurality of causal ladders, each of the plurality of causal ladders representing a plurality of sequential stages of criticality of a component problem that resulted in a system failure, wherein assembling a plurality of causal ladders comprises:
    receiving a plurality of failure events that resulted in a previous failure of a component;
    identifying a component problem related to an occurrence of each of the plurality of failure events;
    correlating each of the plurality of failure events to a discrete sequential stage of criticality in a causal ladder representing a respective identified component problem; and
    assigning at least one remedy to each discrete sequential stage of criticality of the causal ladder representing the respective identified component problem;
for each of the plurality of sequential stages of criticality in each of the respective causal ladders, associating at least one remedy for a component problem;
identifying a potential component failure from one of the plurality of causal ladders representing a component problem relating to the potential component failure, wherein identifying a potential component failure from one of the plurality of causal ladders comprises:
    receiving a runtime system event;
    correlating the received runtime system event to the at least one of the plurality of failure events correlated to a discrete sequential stage of criticality in one of the plurality of causal ladders representing the identified component problem, wherein correlating the received runtime system event to the at least one of the plurality of failure events correlated to a discrete sequential stage of criticality in one of the plurality of causal ladders representing the identified component problem further comprises:
        determining an estimated time until failure for each of the plurality of sequential discrete stages of criticality in each of the plurality of causal ladders from a previous component problem; and
        associating the respective estimated time until failure with each of the plurality of sequential discrete stages of criticality in each of the plurality of causal ladders; and
    determining a criticality of the identified component problem based on the correlated discrete sequential stage of criticality in the causal ladder; and
determining a problem solution for the component problem based on the one of the plurality of causal ladders representing the component problem, wherein determining a problem solution for the component problem comprises:
    accessing policy associated with the correlated discrete sequential stage of criticality in the causal ladder;
    ranking the at least one remedy for the correlated discrete sequential stage of criticality in the causal ladder representing the identified component problem based on policy associated with the correlated discrete sequential stage; and
    selecting a highest ranking solution for the identified component problem.
Dependent claims: 29, 30, 31, 32, 33
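Claim 28 derives an estimated time until failure for each stage from a previous component problem and attaches it to the stage. The claim does not state how the estimate is computed; the sketch below assumes one simple possibility, namely the interval between the first event recorded at a stage and the moment of the previous failure. The function name, the input layout and the timestamps are all hypothetical.

```python
def estimate_time_until_failure(stage_event_times, failure_time):
    """Estimate, per stage, how long the component lasted after entering it.

    stage_event_times maps a stage level to the timestamps (seconds) of the
    events observed at that stage during one previous failure. The estimate
    is measured from the earliest event of each stage; stages with no
    recorded events are skipped.
    """
    return {
        level: failure_time - min(times)
        for level, times in stage_event_times.items()
        if times
    }


# Illustrative history of a single previous failure that occurred at t = 600 s.
previous_failure = {
    1: [100, 160],   # early warnings
    2: [400],        # degraded operation
    3: [580],        # imminent failure
}

estimates = estimate_time_until_failure(previous_failure, failure_time=600)
print(estimates)
```

With several previous failures on record, the per-stage values could be averaged or taken as a conservative minimum; which aggregation is appropriate is not specified in the claim.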
Specification