Method and system for a network management framework with redundant failover methodology
First Claim
1. A method for management of a distributed data processing system, the method comprising:
representing the distributed data processing system as a set of scopes, wherein a scope comprises a logical organization of network-related objects;
monitoring, by a computer, resources within the distributed data processing system using a set of distributed monitor controllers, wherein each distributed monitor controller is uniquely responsible for monitoring resources within different scopes;
in response to monitoring a set of resources, generating topology information associated with the set of resources by a first instance of a distributed monitor controller in the set of distributed monitor controllers;
in response to detecting a potential failure of the first instance of the distributed monitor controller, starting a second instance of the distributed monitor controller;
in response to monitoring the set of resources, generating topology information associated with the set of resources by the second instance of the distributed monitor controller; and
in response to a determination that generated topology information indicates assignment of overlapping scopes between the first instance of the distributed monitor controller and the second instance of the distributed monitor controller, determining a failure of the first instance of the distributed monitor controller based on a communication test.
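The overlap check and communication test recited in the last limitation can be illustrated with a minimal sketch. The class names, the resource-to-instance topology map, and the three return strings are hypothetical and chosen only for illustration; the claim does not prescribe any data model, and the communication test is passed in as a callable because the claim does not say how it is performed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Scope:
    """Hypothetical model of a scope: a named, logical grouping of network objects."""
    name: str
    members: FrozenSet[str]


@dataclass
class MonitorInstance:
    """Hypothetical model of one running instance of a distributed monitor controller."""
    instance_id: str
    scopes: List[Scope] = field(default_factory=list)

    def generate_topology(self) -> Dict[str, str]:
        """Map every monitored resource to the instance that reports it."""
        return {res: self.instance_id for s in self.scopes for res in s.members}


def overlapping_resources(first: MonitorInstance, second: MonitorInstance) -> Set[str]:
    """Resources that appear in the topology information of both instances."""
    return set(first.generate_topology()) & set(second.generate_topology())


def resolve_overlap(
    first: MonitorInstance,
    second: MonitorInstance,
    communication_test: Callable[[MonitorInstance], bool],
) -> str:
    """Decide what overlapping scopes imply about the first instance.

    Returns "no_overlap" when the topologies are disjoint, "first_failed"
    when the first instance does not answer the communication test, and
    "duplicate_active" when it does answer (both instances are alive).
    """
    if not overlapping_resources(first, second):
        return "no_overlap"
    if not communication_test(first):
        return "first_failed"
    return "duplicate_active"
```

In this sketch, a "duplicate_active" result corresponds to the situation the abstract describes, in which two instances run concurrently and one of them is then shut down; which one to stop is left to the caller.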
Abstract
A method, system, apparatus, and computer program product are presented for management of a distributed data processing system. Resources within the distributed data processing system are dynamically discovered, and the discovered resources are adaptively monitored using the network management framework. When the network management framework detects that certain components within the network management framework may have failed, new instances of these components are started. If duplicate components are later determined to be active concurrently, then a duplicate component is shut down, thereby ensuring that at least one instance of these components is active at any given time. After certain failover events, a resource rediscovery process may occur, and a topology database containing previously stored information about discovered resources is resynchronized with resource information about rediscovered resources.
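The resynchronization step mentioned at the end of the abstract can be pictured as a diff between stored and rediscovered resource records. The following is only a sketch under stated assumptions: the topology database is modeled as a plain dictionary keyed by a resource identifier, and the policy of overwriting stored records with rediscovered ones is an assumption, not something the abstract specifies. The function name `resynchronize` is hypothetical.

```python
from typing import Dict, List


def resynchronize(stored: Dict[str, dict], rediscovered: Dict[str, dict]) -> Dict[str, List[str]]:
    """Align a stored topology with rediscovered resources and report the changes.

    `stored` is updated in place so that it matches `rediscovered`
    (assumed policy: the rediscovered view wins). The returned report
    lists resource ids that were added, removed, or changed.
    """
    added = sorted(rid for rid in rediscovered if rid not in stored)
    removed = sorted(rid for rid in stored if rid not in rediscovered)
    changed = sorted(
        rid for rid, info in rediscovered.items()
        if rid in stored and stored[rid] != info
    )

    # Apply the assumed policy: replace the stored view with the rediscovered one.
    stored.clear()
    stored.update(rediscovered)

    return {"added": added, "removed": removed, "changed": changed}
```

Returning the three lists separately keeps the sketch easy to audit after a failover, since an operator can see which records were touched rather than only the final state.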
45 Citations
27 Claims
1. A method for management of a distributed data processing system, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. An apparatus for management of a distributed data processing system, the apparatus comprising:
means for representing the distributed data processing system as a set of scopes, wherein a scope comprises a logical organization of network-related objects;
means for monitoring resources within the distributed data processing system using a set of distributed monitor controllers, wherein each distributed monitor controller is uniquely responsible for monitoring resources within different scopes;
means for generating topology information associated with a set of resources by a first instance of a distributed monitor controller in the set of distributed monitor controllers in response to monitoring the set of resources;
means for starting a second instance of the distributed monitor controller in response to detecting a potential failure of the first instance of the distributed monitor controller;
means for generating topology information associated with the set of resources by the second instance of the distributed monitor controller in response to monitoring the set of resources; and
means for determining a failure of the first instance of the distributed monitor controller based on a communication test in response to a determination that generated topology information indicates assignment of overlapping scopes between the first instance of the distributed monitor controller and the second instance of the distributed monitor controller.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer program product on a non-transitory computer readable medium for use in managing a distributed data processing system, the computer program product comprising:
instructions for representing the distributed data processing system as a set of scopes, wherein a scope comprises a logical organization of network-related objects;
instructions for monitoring resources within the distributed data processing system using a set of distributed monitor controllers, wherein each distributed monitor controller is uniquely responsible for monitoring resources within different scopes;
instructions for generating topology information associated with a set of resources by a first instance of a distributed monitor controller in the set of distributed monitor controllers in response to monitoring the set of resources;
instructions for starting a second instance of the distributed monitor controller in response to detecting a potential failure of the first instance of the distributed monitor controller;
instructions for generating topology information associated with the set of resources by the second instance of the distributed monitor controller in response to monitoring the set of resources; and
instructions for determining a failure of the first instance of the distributed monitor controller based on a communication test in response to a determination that generated topology information indicates assignment of overlapping scopes between the first instance of the distributed monitor controller and the second instance of the distributed monitor controller.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification