Adaptive problem determination and recovery in a computer system
First Claim
1. A computer-based method for providing problem determination and error recovery features to a computing environment, the method comprising:
receiving information regarding a status of the computing environment;
identifying at least three applicable rules from a knowledge base of rules, wherein the at least three applicable rules are applicable to the status of the computing environment; and
applying the at least three applicable rules to obtain a result, wherein the at least three applicable rules from the knowledge base of rules includes a logging logic rule used by a logging logic module and associated inference engine in specifying that particular events should be logged by system components under particular circumstances, a problem determination logic rule used by a problem determination logic module and associated inference engine in specifying that a presence of particular information contained within event logs indicates a particular problem, and an error recovery logic rule used by an error recovery logic module and associated inference engine in specifying that a particular problem implies a particular solution to the particular problem should be followed.
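The three steps of the claim (receive status, identify applicable rules, apply them) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `Rule` class, the `identify` and `apply_rules` helpers, the status fields, and the sample rules. For brevity the recovery rule tests the event log directly rather than the output of the diagnosis rule, and the separate logic modules and inference engines are collapsed into plain functions.

```python
from dataclasses import dataclass
from typing import Callable

Status = dict  # hypothetical snapshot of the computing environment


def log_mentions(status: Status, token: str) -> bool:
    """True if any event-log line contains the token."""
    return any(token in line for line in status["event_log"])


@dataclass(frozen=True)
class Rule:
    kind: str                          # "logging", "diagnosis", or "recovery"
    applies: Callable[[Status], bool]  # is the rule applicable to this status?
    conclude: Callable[[Status], str]  # what the rule yields when applied


# Toy knowledge base with one rule of each kind named in the claim.
KNOWLEDGE_BASE = [
    Rule("logging",
         lambda s: s["disk_used_pct"] > 80,
         lambda s: "log disk I/O events"),
    Rule("diagnosis",
         lambda s: log_mentions(s, "ENOSPC"),
         lambda s: "disk full"),
    Rule("recovery",
         lambda s: log_mentions(s, "ENOSPC"),
         lambda s: "purge temporary files"),
]


def identify(status: Status, knowledge_base: list) -> list:
    """Select the rules that are applicable to the received status."""
    return [rule for rule in knowledge_base if rule.applies(status)]


def apply_rules(status: Status, rules: list) -> dict:
    """Apply each selected rule and collect its conclusion by rule kind."""
    return {rule.kind: rule.conclude(status) for rule in rules}


status = {"disk_used_pct": 93, "event_log": ["ENOSPC on /var"]}
applicable = identify(status, KNOWLEDGE_BASE)
result = apply_rules(status, applicable)
```

With the sample status all three rules are selected, and `result` maps each rule kind to its conclusion: what to log, which problem is indicated, and which solution to follow.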
3 Assignments
0 Petitions
Abstract
A method, computer program product, and data processing system for recognizing, tracing, diagnosing, and repairing problems in an autonomic computing system is disclosed. Rules and courses of actions to follow in logging data, in diagnosing faults (or threats of faults), and in treating faults (or threats of faults) are formulated using an adaptive inference and action system. The adaptive inference and action system includes techniques for conflict resolution that generate, prioritize, modify, and remove rules based on environment-specific information, accumulated time-sensitive data, actions taken, and the effectiveness of those actions. Thus, the present invention enables a dynamic, autonomic computing system to formulate its own strategy for self-administration, even in the face of changes in the configuration of the system.
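One way to picture prioritizing and removing rules based on the effectiveness of past actions is sketched below. This is an illustration under stated assumptions, not the mechanism disclosed in the specification: the `RecoveryRule` class, the smoothed success-rate score, and the `prune` thresholds are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class RecoveryRule:
    problem: str            # the diagnosed problem this rule addresses
    action: str             # the course of action the rule recommends
    attempts: int = 0       # how many times the action has been followed
    successes: float = 0.0  # accumulated degree of success, each in [0, 1]

    @property
    def priority(self) -> float:
        # Assumed scoring: smoothed success rate, so an untried rule scores 0.5.
        return (self.successes + 1) / (self.attempts + 2)

    def record(self, degree_of_success: float) -> None:
        """Feed the observed degree of success back into the rule."""
        self.attempts += 1
        self.successes += degree_of_success


def choose(rules: list, problem: str) -> RecoveryRule:
    """Resolve a conflict between rules for one problem by highest priority."""
    candidates = [rule for rule in rules if rule.problem == problem]
    return max(candidates, key=lambda rule: rule.priority)


def prune(rules: list, min_attempts: int = 3, floor: float = 0.25) -> list:
    """Drop rules that were tried often enough and still score below the floor."""
    return [rule for rule in rules
            if rule.attempts < min_attempts or rule.priority >= floor]


rules = [
    RecoveryRule("disk full", "restart service"),
    RecoveryRule("disk full", "purge temporary files"),
]
```

Calling `record` after each attempted action shifts later `choose` calls toward the actions that have worked in this environment, and `prune` eventually removes rules that keep failing.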
132 Citations
63 Claims
1. A computer-based method for providing problem determination and error recovery features to a computing environment, the method comprising:
receiving information regarding a status of the computing environment;
identifying at least three applicable rules from a knowledge base of rules, wherein the at least three applicable rules are applicable to the status of the computing environment; and
applying the at least three applicable rules to obtain a result, wherein the at least three applicable rules from the knowledge base of rules includes a logging logic rule used by a logging logic module and associated inference engine in specifying that particular events should be logged by system components under particular circumstances, a problem determination logic rule used by a problem determination logic module and associated inference engine in specifying that a presence of particular information contained within event logs indicates a particular problem, and an error recovery logic rule used by an error recovery logic module and associated inference engine in specifying that a particular problem implies a particular solution to the particular problem should be followed.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
10. A computer-based method for providing problem determination and error recovery features to a computing environment, the method comprising:
receiving information regarding a status of the computing environment;
identifying at least three applicable rules from a knowledge base of rules, wherein the at least three applicable rules are applicable to the status of the computing environment;
applying the at least three applicable rules to obtain a result, wherein the at least three applicable rules, utilized by corresponding inference engines includes a logging logic rule specifying that particular events should be logged by system components under particular circumstances, a problem determination logic rule specifying that a presence of particular information contained within event logs indicates a particular problem, and an error recovery logic rule specifying that a particular problem implies a particular solution to the particular problem should be followed, wherein result of the error recovery logic rule is a course of action to follow in resolving a problem;
following the course of action to resolve the problem; and
in response to following the course of action, determining a degree of success of the course of action.
Dependent claims: 11
22. A computer program product in a computer-readable medium comprising functional descriptive material that, when executed by a computer, enables the computer to perform acts including:
receiving information regarding a status of the computing environment;
identifying at least three applicable rules from a knowledge base of rules, wherein the at least three applicable rules are applicable to the status of the computing environment; and
applying the at least three applicable rules to obtain a result, wherein the at least three applicable rules from the knowledge base of rules includes a logging logic rule used by a logging logic module and associated inference engine in specifying that particular events should be logged by system components under particular circumstances, a problem determination logic rule used by a problem determination logic module and associated inference engine in specifying that a presence of particular information contained within event logs indicates a particular problem, and an error recovery logic rule used by an error recovery logic module and associated inference engine in specifying that a particular problem implies a particular solution to the particular problem should be followed.
Dependent claims: 23, 24, 25, 26, 27, 28, 29, 30, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42
31. A computer program product in a computer-readable medium comprising functional descriptive material that, when executed by a computer, enables the computer to perform acts including:
receiving information regarding a status of the computing environment;
identifying at least three applicable rules from a knowledge base of rules, wherein the at least three applicable rules are applicable to the status of the computing environment;
applying the at least three applicable rules to obtain a result, wherein the at least three applicable rules from the knowledge base of rules, utilized by corresponding inference engines includes a logging logic rule specifying that particular events should be logged by system components under particular circumstances, a problem determination logic rule specifying that a presence of particular information contained within event logs indicates a particular problem, and an error recovery logic rule specifying that a particular problem implies a particular solution to the particular problem should be followed, wherein a result of the error recovery logic rule is a course of action to follow in resolving a problem;
following the course of action to resolve the problem; and
in response to following the course of action, determining a degree of success of the course of action.
Dependent claims: 32
43. A data processing machine comprising:
means for receiving information regarding a status of the computing environment;
means for identifying at least three applicable rules from a knowledge base of rules, wherein the at least three applicable rules are applicable to the status of the computing environment; and
means for applying the at least three applicable rules to obtain a result, wherein the at least three applicable rules from the knowledge base of rules includes a logging logic rule used by a logging logic module and associated inference engine in specifying that particular events should be logged by system components under particular circumstances, a problem determination logic rule used a problem determination logic module and associated inference engine in specifying that a presence of particular information contained within event logs indicates a particular problem, and an error recovery logic rule used by an error recovery logic module and associated inference engine in specifying that a particular problem implies a particular solution to the particular problem should be followed.
Dependent claims: 44, 45, 46, 47, 48, 49, 50, 51, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
52. A data processing machine comprising:
means for receiving information regarding a status of the computing environment;
means for identifying at least three applicable rules from a knowledge base of rules, wherein the at least three applicable rules are applicable to the status of the computing environment;
means for applying the at least three applicable rules to obtain a result, wherein the at least three applicable rules from the knowledge base of rules, utilized by corresponding inference engines includes a logging logic rule specifying that particular events should be logged by system components under particular circumstances, a problem determination logic rule specifying that a presence of particular information contained within event logs indicates a particular problem, and an error recovery logic rule specifying that a particular problem implies a particular solution to the particular problem should be followed, wherein a result of the error recovery logic rule is a course of action to follow in resolving a problem;
means for following the course of action to resolve the problem; and
means, responsive to following the course of action, for determining a degree of success of the course of action.
Dependent claims: 53
Specification