Active probing for real-time diagnosis
Abstract
Improved problem diagnosis techniques for use in accordance with computing systems, e.g., distributed computing systems, are disclosed. In one aspect of the invention, a technique for diagnosing a problem associated with a computing system comprises the following steps/operations. One or more probes are executed in accordance with at least a portion of a previously selected probe schedule. When a result of one or more of the probes of the previously selected probe schedule indicates, at least, a potential problem associated with the computing system, one or more probes which optimize at least one criterion are selected in real-time. The one or more selected probes are executed so as to diagnose the potential problem.
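The control flow summarized in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `diagnose` function, the modeling of a probe as a callable returning `True` (OK) or `False` (failed), the `select_next` callback, and the `max_rounds` cap are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical model: a probe is a callable returning True (OK) or False (failed).
Probe = Callable[[], bool]


def diagnose(schedule: Dict[str, Probe],
             select_next: Callable[[Dict[str, bool]], List[str]],
             extra_probes: Dict[str, Probe],
             max_rounds: int = 10) -> Dict[str, bool]:
    """Run the scheduled probes; if any fails, select and run further probes."""
    # Step 1: execute the previously selected probe schedule.
    results = {name: probe() for name, probe in schedule.items()}
    if all(results.values()):
        return results  # no potential problem indicated

    # Steps 2 and 3: select additional probes at run time and execute them.
    for _ in range(max_rounds):
        chosen = [n for n in select_next(results) if n not in results]
        if not chosen:
            break  # the selector has nothing new to ask for
        for name in chosen:
            results[name] = extra_probes[name]()
    return results
```

The selection criterion is deliberately left to the caller here; any rule that maps the results seen so far to a list of probe names can be plugged in as `select_next`.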
22 Claims
-
1. A method for diagnosing a problem associated with a computing system, the method comprising the steps of:
-
executing one or more probes in accordance with at least a portion of a previously selected probe schedule;
when a result of one or more of the probes of the previously selected probe schedule indicates, at least, a potential problem associated with the computing system, selecting in real-time one or more probes which optimize at least one criterion; and
executing the one or more selected probes so as to diagnose the potential problem.
-
-
2. The method of claim 1, wherein the step of selecting in real-time one or more probes which optimize at least one criterion further comprises the step of selecting in real-time one or more probes which maximize information gain relating to the potential problem.
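One way to read "maximize information gain" is as the expected reduction in entropy of the belief over which component is faulty. The sketch below illustrates that reading under simplifying assumptions that are not stated in the claims: exactly one component is faulty, probes are noise-free, and a probe fails if and only if the faulty component lies on its path. The function names and the representation of a probe as a set of component names are hypothetical.

```python
import math
from typing import Dict, FrozenSet


def entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def information_gain(belief: Dict[str, float], path: FrozenSet[str]) -> float:
    """Expected entropy reduction from running a probe that covers `path`.

    Assumes a single fault and a noise-free probe that fails exactly when
    the faulty component is on its path.
    """
    p_fail = sum(p for c, p in belief.items() if c in path)
    p_ok = 1.0 - p_fail
    gain = entropy(belief)
    for p_outcome, inside in ((p_fail, True), (p_ok, False)):
        if p_outcome > 0:
            # Posterior belief given this outcome, restricted to consistent components.
            posterior = {c: p / p_outcome
                         for c, p in belief.items() if (c in path) == inside}
            gain -= p_outcome * entropy(posterior)
    return gain


def select_probe(belief: Dict[str, float],
                 probes: Dict[str, FrozenSet[str]]) -> str:
    """Pick the probe name with the largest expected information gain."""
    return max(probes, key=lambda name: information_gain(belief, probes[name]))
```

Under these assumptions, a probe that splits the probability mass evenly is the most informative, while a probe that covers every candidate component yields no gain because its outcome is already known.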
-
3. The method of claim 1, further comprising the step of analyzing results of the execution of the one or more selected probes using a probabilistic inference.
-
4. The method of claim 3, wherein the step of analyzing results of the execution of the one or more selected probes using a probabilistic inference further comprises the step of analyzing results of the execution of the one or more selected probes using a Bayesian network.
-
5. The method of claim 3, wherein the step of analyzing results of the execution of the one or more selected probes using a probabilistic inference further comprises the step of analyzing results of the execution of the one or more selected probes using one or more prior fault probabilities for one or more system components.
-
6. The method of claim 3, further comprising the step of repeating the step of selecting in real-time one or more probes which optimize at least one criterion and the step of analyzing results of the execution of the one or more selected probes until a particular level of diagnostic confidence is reached.
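The select, execute, and analyze loop of claims 3 through 6 might look something like the following. This is a sketch under assumptions the claims do not make: a single faulty component, noise-free probes, a plain Bayes-rule update over per-component prior fault probabilities rather than a full Bayesian network, and a confidence test on the largest posterior. All names (`bayes_update`, `localize`, `run_probe`, the `confidence` threshold) are hypothetical.

```python
import math
from typing import Callable, Dict, FrozenSet


def bayes_update(belief: Dict[str, float], path: FrozenSet[str],
                 failed: bool) -> Dict[str, float]:
    """Posterior over the single faulty component after one probe result."""
    # Noise-free likelihood: 1 if the outcome is consistent with the hypothesis, else 0.
    unnorm = {c: (p if (c in path) == failed else 0.0) for c, p in belief.items()}
    total = sum(unnorm.values())
    if total == 0:
        raise ValueError("probe result contradicts the fault model")
    return {c: p / total for c, p in unnorm.items()}


def binary_entropy(q: float) -> float:
    """Entropy (bits) of a yes/no outcome with probability q."""
    if q <= 0 or q >= 1:
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))


def localize(priors: Dict[str, float],
             probes: Dict[str, FrozenSet[str]],
             run_probe: Callable[[str], bool],
             confidence: float = 0.95) -> Dict[str, float]:
    """Repeat select / execute / analyze until one component is likely enough."""
    belief = dict(priors)
    remaining = dict(probes)

    def gain(name: str) -> float:
        # For a noise-free probe, information gain equals the entropy of its outcome.
        return binary_entropy(sum(belief.get(c, 0.0) for c in remaining[name]))

    while max(belief.values()) < confidence and remaining:
        name = max(remaining, key=gain)
        if gain(name) == 0.0:
            break  # no remaining probe can change the belief
        path = remaining.pop(name)
        belief = bayes_update(belief, path, failed=not run_probe(name))
    return belief
```

Here `run_probe` is expected to return `True` when the named probe succeeds. The loop stops early when no remaining probe would be informative, so it terminates even if the confidence level cannot be reached.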
-
7. The method of claim 1, further comprising the step of preselecting sets of probes to be executed.
-
8. The method of claim 7, wherein the step of preselecting sets of probes to be executed further comprises the step of preselecting a problem detection probe set (DPS) and a problem localization probe set (LPS) to be executed, wherein probes of the DPS are intended to cover any problem and probes of the LPS are intended to localize a problem detected by a probe of the DPS.
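As a rough illustration of preselecting the two probe sets, the sketch below uses two greedy heuristics: a set-cover pass to build a DPS that touches every component, and a partition-refinement pass to build an LPS whose probes give each component a distinct failure signature. Both heuristics, the single-fault reading behind the LPS, and the names `select_dps` and `select_lps` are assumptions for the example; the claims do not prescribe a particular selection algorithm.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def select_dps(probes: Dict[str, FrozenSet[str]],
               components: Iterable[str]) -> List[str]:
    """Greedy set cover: pick probes until every component is on some path."""
    uncovered: Set[str] = set(components)
    chosen: List[str] = []
    while uncovered:
        best = max(probes, key=lambda n: len(probes[n] & uncovered))
        if not probes[best] & uncovered:
            raise ValueError("some components are not covered by any probe")
        chosen.append(best)
        uncovered -= probes[best]
    return chosen


def _split(groups: List[Set[str]], path: FrozenSet[str]) -> List[Set[str]]:
    """Refine groups of indistinguishable components using one probe path."""
    out: List[Set[str]] = []
    for g in groups:
        for part in (g & path, g - path):
            if part:
                out.append(part)
    return out


def select_lps(probes: Dict[str, FrozenSet[str]],
               components: Iterable[str]) -> List[str]:
    """Greedily add probes until each component has a unique failure signature."""
    groups: List[Set[str]] = [set(components)]
    chosen: List[str] = []
    while any(len(g) > 1 for g in groups):
        best = max(probes, key=lambda n: len(_split(groups, probes[n])))
        refined = _split(groups, probes[best])
        if len(refined) == len(groups):
            break  # the available probes cannot separate the rest
        chosen.append(best)
        groups = refined
    return chosen
```

In this sketch a single end-to-end probe can suffice for detection, while localization needs narrower probes that separate the components sharing that path.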
-
9. A method for diagnosing a problem associated with a computing system, the method comprising the steps of:
-
selecting online one or more probes which optimize at least one criterion, when a result of an execution of one or more probes of at least a portion of a previously selected probe schedule indicates, at least, a potential problem associated with the computing system; and
executing the one or more selected probes so as to diagnose the potential problem.
-
-
10. Apparatus for diagnosing a problem associated with a computing system, the apparatus comprising:
-
a memory; and
at least one processor coupled to the memory and operative to:
(i) execute one or more probes in accordance with at least a portion of a previously selected probe schedule;
(ii) when a result of one or more of the probes of the previously selected probe schedule indicates, at least, a potential problem associated with the computing system, select in real-time one or more probes which optimize at least one criterion; and
(iii) execute the one or more selected probes so as to diagnose the potential problem.
-
-
11. The apparatus of claim 10, wherein the operation of selecting in real-time one or more probes which optimize at least one criterion further comprises the operation of selecting in real-time one or more probes which maximize information gain relating to the potential problem.
-
12. The apparatus of claim 10, wherein the at least one processor is further operative to analyze results of the execution of the one or more selected probes using a probabilistic inference.
-
13. The apparatus of claim 12, wherein the operation of analyzing results of the execution of the one or more selected probes using a probabilistic inference further comprises the operation of analyzing results of the execution of the one or more selected probes using a Bayesian network.
-
14. The apparatus of claim 12, wherein the operation of analyzing results of the execution of the one or more selected probes using a probabilistic inference further comprises the operation of analyzing results of the execution of the one or more selected probes using one or more prior fault probabilities for one or more system components.
-
15. The apparatus of claim 12, wherein the at least one processor is further operative to repeat the operation of selecting in real-time one or more probes which optimize at least one criterion and the operation of analyzing results of the execution of the one or more selected probes until a particular level of diagnostic confidence is reached.
-
16. The apparatus of claim 10, wherein the at least one processor is further operative to preselect sets of probes to be executed.
-
17. The apparatus of claim 16, wherein the operation of preselecting sets of probes to be executed further comprises the operation of preselecting a problem detection probe set (DPS) and a problem localization probe set (LPS) to be executed, wherein probes of the DPS are intended to cover any problem and probes of the LPS are intended to localize a problem detected by a probe of the DPS.
-
18. Apparatus for diagnosing a problem associated with a computing system, the apparatus comprising:
-
a memory; and
at least one processor coupled to the memory and operative to:
(i) select online one or more probes which optimize at least one criterion, when a result of an execution of one or more probes of at least a portion of a previously selected probe schedule indicates, at least, a potential problem associated with the computing system; and
(ii) execute the one or more selected probes so as to diagnose the potential problem.
-
-
19. An article of manufacture for diagnosing a problem associated with a computing system, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
-
executing one or more probes in accordance with at least a portion of a previously selected probe schedule;
when a result of one or more of the probes of the previously selected probe schedule indicates, at least, a potential problem associated with the computing system, selecting in real-time one or more probes which optimize at least one criterion; and
executing the one or more selected probes so as to diagnose the potential problem.
-
-
20. An article of manufacture for diagnosing a problem associated with a computing system, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
-
selecting online one or more probes which optimize at least one criterion, when a result of an execution of one or more probes of at least a portion of a previously selected probe schedule indicates, at least, a potential problem associated with the computing system; and
executing the one or more selected probes so as to diagnose the potential problem.
-
-
21. A method of providing a problem diagnosis service in accordance with a computing system, comprising the step of:
a service provider providing a problem diagnosis system operative to:
(i) execute one or more probes in accordance with at least a portion of a previously selected probe schedule;
(ii) when a result of one or more of the probes of the previously selected probe schedule indicates, at least, a potential problem associated with the computing system, select in real-time one or more probes which optimize at least one criterion; and
(iii) execute the one or more selected probes so as to diagnose the potential problem.
-
22. A method of providing a problem diagnosis service in accordance with a computing system, comprising the step of:
a service provider providing a problem diagnosis system operative to:
(i) select online one or more probes which optimize at least one criterion, when a result of an execution of one or more probes of at least a portion of a previously selected probe schedule indicates, at least, a potential problem associated with the computing system; and
(ii) execute the one or more selected probes so as to diagnose the potential problem.
Specification