AUTOMATED FAILURE RECOVERY OF SUBSYSTEMS IN A MANAGEMENT SYSTEM
First Claim
1. A system comprising:
at least one memory storage device to store a plurality of microkernel controllers and a plurality of pre-defined topologies; and
a service manager, comprising one or more processors, configured to perform operations comprising:
parsing a first pre-defined topology from the plurality of pre-defined topologies, the first pre-defined topology specifying a plurality of logical types of resources to be used by a management service;
loading one or more controllers corresponding to the management service;
deploying the management service using the first pre-defined topology;
monitoring operation of the management service; and
dynamically re-deploying the management service using the first pre-defined topology responsive to a failure of the management service.
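The operations recited in the claim (parse, load controllers, deploy, monitor, re-deploy) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the JSON topology format, the `Topology` and `ServiceManager` classes, the controller callables, and the `is_healthy` probe are all hypothetical names introduced here for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Topology:
    """Hypothetical pre-defined topology: a service name plus logical resource types."""
    service: str
    resource_types: list


@dataclass
class ServiceManager:
    """Illustrative service manager; every name here is an assumption."""
    controllers: dict  # logical resource type -> controller callable
    deployed: dict = field(default_factory=dict)
    deploy_count: dict = field(default_factory=dict)

    def parse(self, text):
        # Assumes the topology is stored as JSON; the source does not specify a format.
        raw = json.loads(text)
        return Topology(raw["service"], list(raw["resource_types"]))

    def load_controllers(self, topology):
        # Pick one controller per logical resource type named in the topology.
        return [self.controllers[t] for t in topology.resource_types]

    def deploy(self, topology):
        # Each controller "allocates" a resource for the service (a string in this sketch).
        resources = [ctl(topology.service) for ctl in self.load_controllers(topology)]
        self.deployed[topology.service] = resources
        count = self.deploy_count.get(topology.service, 0)
        self.deploy_count[topology.service] = count + 1
        return resources

    def monitor(self, topology, is_healthy):
        # One monitoring pass: on failure, re-deploy from the same topology.
        if not is_healthy(topology.service):
            self.deploy(topology)
            return "redeployed"
        return "healthy"


manager = ServiceManager(controllers={
    "compute": lambda svc: f"{svc}-compute",
    "load_balancer": lambda svc: f"{svc}-lb",
})
topology = manager.parse(
    '{"service": "monitoring", "resource_types": ["compute", "load_balancer"]}'
)
manager.deploy(topology)
# Simulate a failed health check to trigger the re-deployment path.
status = manager.monitor(topology, is_healthy=lambda svc: False)
```

Re-using the same parsed topology for recovery means the re-deployed service asks for the same logical resource types as the original deployment, which is the property the claim emphasizes.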
Abstract
Systems and methods for automated failure recovery of subsystems of a management system are described. The subsystems are built and modeled as services, and their management, specifically their failure recovery, is done in a manner similar to that of services and resources managed by the management system. The management system consists of a microkernel, service managers, and management services. Each service, whether a managed service or a management service, is managed by a service manager. The service manager itself is a service and so is in turn managed by the microkernel. Both managed services and management services are monitored via in-band and out-of-band mechanisms, and the performance metrics and alerts are transported through an event system to the appropriate service manager. If a service fails, the service manager takes policy-based remedial steps including, for example, restarting the failed service.
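The event transport and policy-based remediation described above might look something like the sketch below. It is a simplified illustration under stated assumptions: the `Event`, `EventSystem`, and `PolicyServiceManager` classes, the `"metric"`/`"alert"` event kinds, the `"process_down"` alert name, and the `"escalate"` fallback action are all hypothetical and do not come from the source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    service: str  # service the event is about
    kind: str     # "metric" or "alert" (hypothetical event kinds)
    detail: str


class EventSystem:
    """Routes each event to the handlers registered for that event's service."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, service, handler):
        self._handlers[service].append(handler)

    def publish(self, event):
        # Deliver the event and collect whatever each handler returns.
        return [handler(event) for handler in self._handlers[event.service]]


class PolicyServiceManager:
    """Maps alert details to remedial actions through a policy table (assumed shape)."""

    def __init__(self, policy):
        self.policy = policy
        self.actions = []

    def handle(self, event):
        if event.kind != "alert":
            return None  # metrics are only observed in this sketch
        # Unknown alerts fall back to a hypothetical "escalate" action.
        action = self.policy.get(event.detail, "escalate")
        self.actions.append((event.service, action))
        return action


events = EventSystem()
manager = PolicyServiceManager(policy={"process_down": "restart"})
events.subscribe("provisioning", manager.handle)

events.publish(Event("provisioning", "metric", "cpu=0.42"))
results = events.publish(Event("provisioning", "alert", "process_down"))
```

Keeping the policy as a plain lookup table separates the decision of what to do on failure from the mechanics of event delivery, so the same routing can serve managed services and management services alike.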
20 Claims
1. A system comprising:
at least one memory storage device to store a plurality of microkernel controllers and a plurality of pre-defined topologies; and
a service manager, comprising one or more processors, configured to perform operations comprising:
parsing a first pre-defined topology from the plurality of pre-defined topologies, the first pre-defined topology specifying a plurality of logical types of resources to be used by a management service;
loading one or more controllers corresponding to the management service;
deploying the management service using the first pre-defined topology;
monitoring operation of the management service; and
dynamically re-deploying the management service using the first pre-defined topology responsive to a failure of the management service.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method comprising:
receiving a pre-defined topology that specifies a plurality of logical types of resources to be used by a management service, the management service being one of a plurality of management services configured to collectively manage a plurality of domain services, the plurality of management services configured to collectively monitor the domain services and dynamically allocate resources to the plurality of domain services;
parsing the pre-defined topology;
loading one or more controllers corresponding to the management service;
deploying the management service using the pre-defined topology;
monitoring operation of the management service; and
dynamically re-deploying the management service using the pre-defined topology responsive to a failure of the management service.
Dependent claims: 9, 10, 11, 12
13. A non-transitory computer readable storage medium having instructions embodied thereon, the instructions executable by a processor for performing operations for managing one or more components of a management system, the operations comprising:
parsing, at a service manager, a pre-defined topology that specifies a plurality of logical types of resources to be used by a management service, the management service being one of a plurality of management services configured to collectively manage a plurality of domain services, the plurality of management services configured to collectively monitor the domain services and dynamically allocate resources to the plurality of domain services;
loading, at the service manager, one or more controllers corresponding to the management service;
deploying the management service using the pre-defined topology;
detecting a failure of the management service; and
dynamically re-deploying the management service using the pre-defined topology responsive to the failure of the management service.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification