Method and system for monitoring distributed applications on-demand
First Claim
1. In a data processing system including a plurality of processing entities, a method of monitoring a distributed application suitable to run on at least one of the processing entities, the method including the steps of:
surveying the processing entities to detect a change between a running condition and a non-running condition of the distributed application on each processing entity, responsive to detecting a change from the non-running condition to the running condition, enabling a monitoring application for monitoring performance of the distributed application on each of the processing entities where the change to the running condition has been detected, responsive to detecting a change from the running condition to the non-running condition, disabling the monitoring application on each of the processing entities where the change to the non-running condition has been detected, wherein the processing entities are grouped into a cluster being controlled by a controller entity that automatically performs the surveying, enabling and disabling steps for each of the processing entities without user intervention, wherein the controller entity installs the distributed application on the processing entities that are grouped into the cluster;
an authority entity publishing a plurality of rules each one defining a target state for a category of subjects, wherein the plurality of rules are published to a common rule repository, wherein;
(i) the authority entity publishing a first rule to the common rule repository for a first category defined by the subjects having the distributed application in the running condition, the target state of the first rule specifying the enabling of the monitoring application, and a second rule to the common rule repository for a second category defined by the subjects having the distributed application in the non-running condition, the target state of the second rule specifying the disabling of the monitoring application, each processing entity having the distributed application in the running condition self-applies the first rule to itself to enable the monitoring application, wherein the first rule is self-applied in response to the distributed application being placed in the running condition by the controller entity, and each processing entity having the distributed application in the non-running condition self-applies the second rule to itself to disable the monitoring application, wherein the second rule is self-applied in response to the distributed application being placed in the non-running condition by the controller entity;
(ii) the authority entity publishing a third rule to the common rule repository for a third category defined by the subjects classified as processing entities, the target state of the third rule specifying the addition of the processing entity to the cluster, each enabled monitoring application sending monitoring information to a collector entity, the collector entity detecting a critical condition according to the monitoring information, the collector entity causing at least one new subject to be automatically classified without user intervention as a new processing entity in response to the critical condition, and each new processing entity applying the third rule in order to be added to the cluster, wherein the third rule includes a formal parameter defining a correlation with a fourth category defined by the subjects classified as controller entities, the step of applying the third rule including;
retrieving the third rule from the common rule repository, identifying the subject belonging to the fourth category, and resolving the formal parameter into the identified subject to thereby identify the controller entity that controls the cluster that the each new processing entity is added to;
(iii) the authority entity publishing a fourth rule to the common rule repository for the fourth category, the target state of the fourth rule specifying the reconfiguration of the cluster in response to the addition of each new processing entity to the cluster, and the controller entity applying the fourth rule to reconfigure the cluster in order to account for the addition of each new processing entity to the cluster; and
each subject, which is a given processing entity of the processing entities, reading directly from the common rule repository and applying each rule for the corresponding category to configure the subject according to the target state defined in the rule.
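The resolution of the formal parameter in the third rule can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Subject` and `Rule` classes, the `$controller` placeholder syntax, and the subject names are assumptions chosen only to show how a parameter correlated with the controller category might be resolved into a concrete subject.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    name: str
    categories: set = field(default_factory=set)


@dataclass
class Rule:
    category: str        # category of subjects the rule applies to
    target_state: str    # desired state; may contain formal parameters ($name)
    correlations: dict = field(default_factory=dict)  # parameter -> category


def resolve_rule(rule, subjects):
    """Replace each formal parameter with a subject from the correlated category."""
    state = rule.target_state
    for param, category in rule.correlations.items():
        matches = [s for s in subjects if category in s.categories]
        if not matches:
            raise LookupError(f"no subject found in category {category!r}")
        # Assumption: the first matching subject is taken as the controller.
        state = state.replace("$" + param, matches[0].name)
    return state


# Hypothetical subjects: one controller entity and one new processing entity.
subjects = [
    Subject("dm01", {"controller"}),
    Subject("node07", {"processing_entity"}),
]

# Hypothetical third rule: a processing entity must join the cluster of the
# subject that belongs to the controller category.
third_rule = Rule(
    category="processing_entity",
    target_state="joined_cluster_of:$controller",
    correlations={"controller": "controller"},
)

resolved = resolve_rule(third_rule, subjects)
print(resolved)  # joined_cluster_of:dm01
```

The rule itself never names a specific controller; the new processing entity discovers it at the time the rule is applied, which is what lets the same published rule serve any cluster.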
Abstract
A method (300; 600) and system for monitoring distributed applications (for example, running on multiple WAS nodes of a cluster) are proposed. The solution of the invention is based on a self-adaptive resource management infrastructure. In particular, an authority publishes (306-312) a plurality of rules, each one defining a desired target configuration for a category of subjects. A membership controller on each subject is responsible for assigning (315-333) the subject to the respective category; a compliance engine then retrieves and applies (336-351) the rules corresponding to the category of the subject. The resource management infrastructure is used to implement on-demand monitoring of the distributed application. For this purpose, two rules are defined (603-606) for the WAS nodes having the distributed application in a running condition and in a non-running condition, respectively. Each WAS node having the distributed application in the running condition applies (639-645) the first rule, so as to start the monitoring application; as soon as the distributed application switches to the non-running condition, the WAS node applies (654-657) the second rule, so as to stop the monitoring application automatically.
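The interplay between the membership controller and the compliance engine described in the abstract can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the patented implementation: the `RuleRepository` and `Node` classes, the category names, and the boolean representation of the target state are all hypothetical.

```python
class RuleRepository:
    """Hypothetical shared store mapping a category to its target state."""

    def __init__(self):
        self._rules = {}

    def publish(self, category, monitor_enabled):
        # The authority publishes one rule per category.
        self._rules[category] = monitor_enabled

    def rule_for(self, category):
        return self._rules[category]


class Node:
    """Hypothetical subject hosting the distributed application."""

    def __init__(self, name, repository):
        self.name = name
        self.repository = repository
        self.app_running = False
        self.monitor_enabled = False

    def category(self):
        # Membership controller: the node classifies itself by its own state.
        return "app_running" if self.app_running else "app_not_running"

    def comply(self):
        # Compliance engine: fetch the rule for the current category and
        # bring the node to the target state that the rule defines.
        self.monitor_enabled = self.repository.rule_for(self.category())


repo = RuleRepository()
repo.publish("app_running", True)        # first rule: start monitoring
repo.publish("app_not_running", False)   # second rule: stop monitoring

node = Node("was-node-1", repo)

node.app_running = True    # the application is started on the node
node.comply()
started = node.monitor_enabled

node.app_running = False   # the application is stopped on the node
node.comply()
stopped = node.monitor_enabled

print(started, stopped)  # True False
```

In this sketch the node pulls its rule from the repository rather than being told what to do, so the monitor follows the application's state without any per-node instruction from the authority.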
5 Claims
3. In a data processing system including a plurality of processing entities, a computer program including program code means directly loadable into a working memory of the system for performing a method of monitoring a distributed application suitable to run on at least one of the processing entities when the program is run on the system, the method including the steps of:
surveying the processing entities to detect a change between a running condition and a non-running condition of the distributed application on each processing entity, responsive to detecting a change from the non-running condition to the running condition, enabling a monitoring application for monitoring performance of the distributed application on each of the processing entities where the change to the running condition has been detected, and responsive to detecting a change from the running condition to the non-running condition, disabling the monitoring application on each of the processing entities where the change to the non-running condition has been detected, wherein the processing entities are grouped into a cluster being controlled by a controller entity that automatically performs the surveying, enabling and disabling steps for each of the processing entities without user intervention, wherein the controller entity installs the distributed application on the processing entities that are grouped into the cluster;
an authority entity publishing a plurality of rules each one defining a target state for a category of subjects, wherein the plurality of rules are published to a common rule repository, wherein;
(i) the authority entity publishing a first rule to the common rule repository for a first category defined by the subjects having the distributed application in the running condition, the target state of the first rule specifying the enabling of the monitoring application, and a second rule to the common rule repository for a second category defined by the subjects having the distributed application in the non-running condition, the target state of the second rule specifying the disabling of the monitoring application, each processing entity having the distributed application in the running condition self-applies the first rule to itself to enable the monitoring application, wherein the first rule is self-applied in response to the distributed application being placed in the running condition by the controller entity, and each processing entity having the distributed application in the non-running condition self-applies the second rule to itself to disable the monitoring application, wherein the second rule is self-applied in response to the distributed application being placed in the non-running condition by the controller entity;
(ii) the authority entity publishing a third rule to the common rule repository for a third category defined by the subjects classified as processing entities, the target state of the third rule specifying the addition of the processing entity to the cluster, each enabled monitoring application sending monitoring information to a collector entity, the collector entity detecting a critical condition according to the monitoring information, the collector entity causing at least one new subject to be automatically classified without user intervention as a new processing entity in response to the critical condition, and each new processing entity applying the third rule in order to be added to the cluster, wherein the third rule includes a formal parameter defining a correlation with a fourth category defined by the subjects classified as controller entities, the step of applying the third rule including;
retrieving the third rule from the common rule repository, identifying the subject belonging to the fourth category, and resolving the formal parameter into the identified subject to thereby identify the controller entity that controls the cluster that the each new processing entity is added to;
(iii) the authority entity publishing a fourth rule to the common rule repository for the fourth category, the target state of the fourth rule specifying the reconfiguration of the cluster in response to the addition of each new processing entity to the cluster, and the controller entity applying the fourth rule to reconfigure the cluster in order to account for the addition of each new processing entity to the cluster; and
each subject, which is a given processing entity of the processing entities, reading directly from the common rule repository and applying each rule for the corresponding category to configure the subject according to the target state defined in the rule.
4. In a data processing system including a plurality of processing entities, a program product including a computer readable medium embodying a computer program, the computer program being directly loadable into a working memory of the system for performing a method of monitoring a distributed application suitable to run on at least one of the processing entities when the program is run on the system, the method including the steps of:
surveying the processing entities to detect a change between a running condition and a non-running condition of the distributed application on each processing entity, responsive to detecting a change from the non-running condition to the running condition, enabling a monitoring application for monitoring performance of the distributed application on each of the processing entities where the change to the running condition has been detected, and responsive to detecting a change from the running condition to the non-running condition, disabling the monitoring application on each of the processing entities where the change to the non-running condition has been detected, wherein the processing entities are grouped into a cluster being controlled by a controller entity that automatically performs the surveying, enabling and disabling steps for each of the processing entities without user intervention, wherein the controller entity installs the distributed application on the processing entities that are grouped into the cluster;
an authority entity publishing a plurality of rules each one defining a target state for a category of subjects, wherein the plurality of rules are published to a common rule repository, wherein;
(i) the authority entity publishing a first rule to the common rule repository for a first category defined by the subjects having the distributed application in the running condition, the target state of the first rule specifying the enabling of the monitoring application, and a second rule to the common rule repository for a second category defined by the subjects having the distributed application in the non-running condition, the target state of the second rule specifying the disabling of the monitoring application, each processing entity having the distributed application in the running condition self-applies the first rule to itself to enable the monitoring application, wherein the first rule is self-applied in response to the distributed application being placed in the running condition by the controller entity, and each processing entity having the distributed application in the non-running condition self-applies the second rule to itself to disable the monitoring application, wherein the second rule is self-applied in response to the distributed application being placed in the non-running condition by the controller entity;
(ii) the authority entity publishing a third rule to the common rule repository for a third category defined by the subjects classified as processing entities, the target state of the third rule specifying the addition of the processing entity to the cluster, each enabled monitoring application sending monitoring information to a collector entity, the collector entity detecting a critical condition according to the monitoring information, the collector entity causing at least one new subject to be automatically classified without user intervention as a new processing entity in response to the critical condition, and each new processing entity applying the third rule in order to be added to the cluster, wherein the third rule includes a formal parameter defining a correlation with a fourth category defined by the subjects classified as controller entities, the step of applying the third rule including;
retrieving the third rule from the common rule repository, identifying the subject belonging to the fourth category, and resolving the formal parameter into the identified subject to thereby identify the controller entity that controls the cluster that the each new processing entity is added to;
(iii) the authority entity publishing a fourth rule to the common rule repository for the fourth category, the target state of the fourth rule specifying the reconfiguration of the cluster in response to the addition of each new processing entity to the cluster, and the controller entity applying the fourth rule to reconfigure the cluster in order to account for the addition of each new processing entity to the cluster; and
each subject, which is a given processing entity of the processing entities, reading directly from the common rule repository and applying each rule for the corresponding category to configure the subject according to the target state defined in the rule.
5. In a data processing system including a plurality of processing entities, a system for monitoring a distributed application suitable to run on at least one of the processing entities, the system including:
means for surveying the processing entities to detect a change between a running condition and a non-running condition of the distributed application on each processing entity, means, responsive to detecting a change from the non-running condition to the running condition, for enabling a monitoring application for monitoring performance of the distributed application on each of the processing entities where the change to the running condition has been detected, and means, responsive to detecting a change from the running condition to the non-running condition, for disabling the monitoring application on each of the processing entities where the change to the non-running condition has been detected, wherein the processing entities are grouped into a cluster being controlled by a controller entity that includes the means for surveying, means for enabling and means for disabling for each of the processing entities, wherein the controller entity installs the distributed application on the processing entities that are grouped into the cluster;
an authority entity publishing a plurality of rules each one defining a target state for a category of subjects, wherein the plurality of rules are published to a common rule repository, wherein;
(i) the authority entity publishing a first rule to the common rule repository for a first category defined by the subjects having the distributed application in the running condition, the target state of the first rule specifying the enabling of the monitoring application, and a second rule to the common rule repository for a second category defined by the subjects having the distributed application in the non-running condition, the target state of the second rule specifying the disabling of the monitoring application, each processing entity having the distributed application in the running condition self-applies the first rule to itself to enable the monitoring application, wherein the first rule is self-applied in response to the distributed application being placed in the running condition by the controller entity, and each processing entity having the distributed application in the non-running condition self-applies the second rule to itself to disable the monitoring application, wherein the second rule is self-applied in response to the distributed application being placed in the non-running condition by the controller entity;
(ii) the authority entity publishing a third rule to the common rule repository for a third category defined by the subjects classified as processing entities, the target state of the third rule specifying the addition of the processing entity to the cluster, each enabled monitoring application sending monitoring information to a collector entity, the collector entity detecting a critical condition according to the monitoring information, the collector entity causing at least one new subject to be automatically classified without user intervention as a new processing entity in response to the critical condition, and each new processing entity applying the third rule in order to be added to the cluster, wherein the third rule includes a formal parameter defining a correlation with a fourth category defined by the subjects classified as controller entities, the step of applying the third rule including;
retrieving the third rule from the common rule repository, identifying the subject belonging to the fourth category, and resolving the formal parameter into the identified subject to thereby identify the controller entity that controls the cluster that the each new processing entity is added to;
(iii) the authority entity publishing a fourth rule to the common rule repository for the fourth category, the target state of the fourth rule specifying the reconfiguration of the cluster in response to the addition of each new processing entity to the cluster, and the controller entity applying the fourth rule to reconfigure the cluster in order to account for the addition of each new processing entity to the cluster; and
each subject, which is a given processing entity of the processing entities, comprising means for reading directly from the common rule repository and applying each rule for the corresponding category to configure the subject according to the target state defined in the rule.
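The scale-out path recited in elements (ii) and (iii) of the claims, in which a collector detects a critical condition, a new subject becomes a processing entity, and the controller reconfigures the cluster, can be sketched in Python as follows. The claims do not define the critical condition; the average response-time threshold used here is an assumption, as are the `Cluster` and `Collector` classes, the spare pool, and all names and values.

```python
class Cluster:
    """Hypothetical cluster managed by a controller entity."""

    def __init__(self, members):
        self.members = list(members)
        self.reconfigurations = 0

    def add(self, node):
        # Effect of the third rule: the new processing entity joins the cluster.
        self.members.append(node)
        # Effect of the fourth rule: the controller reconfigures the cluster
        # to account for the new member.
        self.reconfigurations += 1


class Collector:
    """Hypothetical collector receiving data from enabled monitors."""

    def __init__(self, cluster, spare_pool, threshold_ms):
        self.cluster = cluster
        self.spare_pool = list(spare_pool)
        self.threshold_ms = threshold_ms
        self.latest = {}

    def report(self, node, response_ms):
        """Record a sample; scale out if the assumed critical condition holds."""
        self.latest[node] = response_ms
        average = sum(self.latest.values()) / len(self.latest)
        # Assumed critical condition: average response time above a threshold.
        if average > self.threshold_ms and self.spare_pool:
            new_node = self.spare_pool.pop(0)  # reclassified as processing entity
            self.cluster.add(new_node)
            return new_node
        return None


cluster = Cluster(["node1", "node2"])
collector = Collector(cluster, spare_pool=["spare1"], threshold_ms=500.0)

first = collector.report("node1", 200.0)    # average 200 ms: no action
second = collector.report("node2", 1000.0)  # average 600 ms: scale out

print(first, second, cluster.members)
```

Once the spare pool is empty the sketch simply stops adding nodes; how a real deployment should behave in that case is outside what the claims specify.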
Specification