Automated root cause analysis of problems associated with software application deployments
First Claim
1. A computer-implemented method of finding a root cause of one or more problems with a deployment of a software application within a managed computer system, comprising:
discovering, after installation of the deployment, information about the deployment and software services that are not provided by the deployment, by sending software probes onto a communication network;
using the discovered information to create an application model describing the deployment, the application model comprising:
an object graph representing hardware and software elements of the managed computer system, the elements represented as objects of the graph;
configuration data about the elements; and
information about relationships between the elements;
dynamically modifying the application model on an ongoing basis in response to detected changes of configuration of the deployment;
accessing a repository of logic rules generated from analyses of reference materials for the software application and empirical analyses done on a plurality of different deployments of the software application, each rule corresponding to a known problem associated with the software application, each rule configured to be used to test for satisfaction of one or more possible conditions of the deployment;
using at least one of the rules to determine that one or more of the possible conditions are satisfied by the deployment;
marking objects of the application model that are associated with the satisfied conditions; and
using pattern-recognition on a portion of the application model that includes some of the marked objects to identify one or more root cause candidates associated with the marked objects, the root cause candidates comprising elements of the application model, at least some of the root cause candidates not being marked objects;
wherein said discovering, using the discovered information, dynamically modifying, accessing, using at least one of the rules, marking, and using pattern-recognition are executed based on code that is separate from code of the software application of the managed computer system.
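The marking and pattern-recognition steps of claim 1 can be illustrated with a minimal sketch. Everything named here (`ModelObject`, `ApplicationModel`, `Rule`, `mark_objects`, `root_cause_candidates`, the `queue_length` property, and the shared-neighbor heuristic) is a hypothetical assumption chosen for illustration and is not taken from the patent. The sketch shows only one possible form of pattern-recognition: proposing unmarked objects that are linked to several marked objects.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ModelObject:
    """One hardware or software element of the managed system (hypothetical)."""
    name: str
    config: Dict[str, object] = field(default_factory=dict)
    marked: bool = False


@dataclass
class ApplicationModel:
    """Object graph plus configuration data and relationships (hypothetical)."""
    objects: Dict[str, ModelObject] = field(default_factory=dict)
    links: Dict[str, Set[str]] = field(default_factory=dict)

    def add(self, name: str, **config: object) -> None:
        self.objects[name] = ModelObject(name, dict(config))
        self.links.setdefault(name, set())

    def relate(self, a: str, b: str) -> None:
        # Relationships are stored as undirected links in this sketch.
        self.links[a].add(b)
        self.links[b].add(a)


@dataclass
class Rule:
    """A logic rule tied to a known problem; the condition tests one object."""
    problem: str
    condition: Callable[[ModelObject], bool]


def mark_objects(model: ApplicationModel, rules: List[Rule]) -> List[str]:
    """Mark every object that satisfies at least one rule condition."""
    marked = []
    for obj in model.objects.values():
        if any(rule.condition(obj) for rule in rules):
            obj.marked = True
            marked.append(obj.name)
    return marked


def root_cause_candidates(model: ApplicationModel, min_marked_neighbors: int = 2) -> List[str]:
    """Return objects linked to at least `min_marked_neighbors` marked objects.

    Candidates may themselves be unmarked, as claim 1 allows.
    """
    counts: Counter = Counter()
    for name, obj in model.objects.items():
        if obj.marked:
            for neighbor in model.links[name]:
                counts[neighbor] += 1
    return sorted(n for n, c in counts.items() if c >= min_marked_neighbors)


# Usage: two mail servers with long queues share one DNS server.
model = ApplicationModel()
model.add("mail1", queue_length=500)
model.add("mail2", queue_length=750)
model.add("dns1", forwarder="10.0.0.1")
model.add("disk1", free_gb=120)
model.relate("mail1", "dns1")
model.relate("mail2", "dns1")
model.relate("mail1", "disk1")

rules = [Rule("mail queue backlog", lambda o: o.config.get("queue_length", 0) > 100)]
marked = mark_objects(model, rules)
candidates = root_cause_candidates(model)
```

In this usage both mail servers are marked by the rule, and the shared, unmarked DNS server is the only object linked to two marked objects, so it is the only candidate returned.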
Abstract
Computer systems and methods are disclosed for managing a deployment of a software application. One system includes an application model describing the deployment, the application model comprising a representation of physical and logical objects in a domain of the deployment, configuration data about the objects, and information about relationships between the objects. The system also includes a root cause analysis module configured to identify one or more problematic objects of the application model, and to use pattern-recognition on the application model to find root cause candidates that may be a root cause of one or more problems associated with the problematic objects. The root cause analysis module can be further configured to apply diagnostic unit tests on one or more objects associated with the root cause candidates, the diagnostic unit tests configured to narrow down a list of possible root causes of the problems. Various pattern-recognition techniques are disclosed, including looking for recent property or configurational changes in the application model, clustering of problematic objects, examining links between objects in the application model, comparisons between pairs of non-problematic objects as well as between problematic objects and non-problematic objects, temporal comparisons of the state of the application model, and examining objects that are near the problematic objects in the application model.
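One of the techniques listed in the abstract, temporal comparison of the state of the application model to find recent property or configuration changes, might be sketched as a diff between two snapshots. The snapshot layout (object name mapped to a dictionary of properties) and the function name `config_changes` are assumptions made for this example; the abstract does not specify how model state is stored or compared.

```python
from typing import Any, Dict, Tuple

# Hypothetical snapshot format: object name -> {property name: value}.
Snapshot = Dict[str, Dict[str, Any]]

_MISSING = object()  # sentinel so an absent property differs from a None value


def config_changes(before: Snapshot, after: Snapshot) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """Return, per object present in both snapshots, the properties that changed.

    Each changed property maps to an (old, new) pair; a property missing on
    one side is reported with None on that side. Objects added or removed
    between snapshots are ignored in this sketch.
    """
    changes: Dict[str, Dict[str, Tuple[Any, Any]]] = {}
    for name in before.keys() & after.keys():
        old, new = before[name], after[name]
        diff = {
            key: (old.get(key), new.get(key))
            for key in old.keys() | new.keys()
            if old.get(key, _MISSING) != new.get(key, _MISSING)
        }
        if diff:
            changes[name] = diff
    return changes


# Usage: only the DNS forwarder changed between the two snapshots.
before = {"dns1": {"forwarder": "10.0.0.1"}, "mail1": {"port": 25}}
after = {"dns1": {"forwarder": "10.0.0.9"}, "mail1": {"port": 25}}
changes = config_changes(before, after)
```

Objects that appear in the result are natural root cause candidates to examine first when problems began after the earlier snapshot was taken.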
35 Claims
Dependent claims of claim 1: 2-17
18. A meta-application system for managing a computer system including a deployment of a software application, comprising:
a repository of logic rules generated from analyses of reference materials for the software application and empirical analyses done on a plurality of different deployments of the software application, each rule corresponding to a known problem associated with the software application, each rule configured to be used to test for satisfaction of one or more possible conditions of the deployment;
an application model describing the deployment, the application model comprising:
an object graph representing hardware and software elements of the managed computer system, the elements represented as objects of the object graph;
configuration data about the elements; and
information about relationships between the elements;
a discovery component configured to conduct, after installation of the deployment, automated discovery of information about at least the deployment, by sending software probes onto a communication network, the discovery component configured to discover software services that are not provided by the deployment or the meta-application system, the discovery component configured to use the discovered information to create the application model, wherein the discovery component is configured to, after creating the application model, dynamically modify the application model on an ongoing basis in response to detected changes of configuration of the deployment;
an analysis subsystem comprising:
a problem detector configured to use one or more of the rules to determine that one or more of the possible conditions of the rules is satisfied by the deployment; and
a root cause analysis module configured to:
mark application model objects that (1) are associated with the satisfied conditions of the rules, and/or (2) correspond to elements in the deployment whose behavior has deviated from mathematical models created by the meta-application system; and
use pattern-recognition on a portion of the application model that includes the marked objects to find a root cause candidate that may be a root cause of one or more problems associated with one or more of the marked objects of the object graph, at least some of the root cause candidates not being marked objects; and
one or more computer systems operative to implement the application model, the discovery component, and the root cause analysis module;
wherein the application model, discovery component, and root cause analysis module comprise code that is separate from code of the software application of the managed computer system.
Dependent claims of claim 18: 19-35
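Claim 18 also allows objects to be marked when an element's behavior has deviated from mathematical models created by the meta-application system. The claim does not say what those models are. As a rough sketch under that caveat, the example below assumes the simplest possible model, a mean and sample standard deviation over past readings of a metric, and flags a new reading whose z-score exceeds a threshold. The function name `deviates` and the default threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def deviates(history: Sequence[float], latest: float, threshold: float = 3.0) -> bool:
    """Return True if `latest` lies more than `threshold` sample standard
    deviations from the mean of `history`.

    This z-score baseline is an assumed stand-in for the unspecified
    "mathematical models" of claim 18.
    """
    if len(history) < 2:
        return False  # not enough data to build a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # any change from a constant baseline is a deviation
    return abs(latest - mu) / sigma > threshold


# Usage: response times in milliseconds for one monitored element.
baseline = [10.0, 11.0, 9.0, 10.0, 10.0]
spike_flagged = deviates(baseline, 30.0)
normal_flagged = deviates(baseline, 10.5)
```

An object whose metric is flagged this way would be marked in the application model alongside objects marked by rule conditions, and both kinds of marked object would feed the same pattern-recognition step.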
Specification