Automated failure recovery of subsystems in a management system
Abstract
Systems and methods for automated failure recovery of subsystems of a management system are described. The subsystems are built and modeled as services, and their management, specifically their failure recovery, is done in a manner similar to that of services and resources managed by the management system. The management system consists of a microkernel, service managers, and management services. Each service, whether a managed service or a management service, is managed by a service manager. The service manager itself is a service and so is in turn managed by the microkernel. Both managed services and management services are monitored via in-band and out-of-band mechanisms, and the performance metrics and alerts are transported through an event system to the appropriate service manager. If a service fails, the service manager takes policy-based remedial steps including, for example, restarting the failed service.
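The routing described in the abstract can be illustrated with a minimal sketch: an event system hands each alert to the manager responsible for the affected service, and that manager looks up a remedial action in a policy table. The class names, alert kinds, and policy table below are hypothetical and chosen only for illustration; the abstract does not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alert:
    service: str  # name of the service that raised the alert
    kind: str     # hypothetical alert kinds, e.g. "failed" or "degraded"


@dataclass
class ServiceManager:
    name: str
    # Hypothetical policy table: alert kind -> remedial action.
    policy: Dict[str, str] = field(default_factory=lambda: {"failed": "restart"})
    actions_taken: List[str] = field(default_factory=list)

    def handle(self, alert: Alert) -> str:
        # Unknown alert kinds fall back to "ignore" in this sketch.
        action = self.policy.get(alert.kind, "ignore")
        self.actions_taken.append(f"{action}:{alert.service}")
        return action


class EventSystem:
    """Routes each alert to the manager responsible for the service."""

    def __init__(self) -> None:
        self._owners: Dict[str, ServiceManager] = {}

    def register(self, service: str, manager: ServiceManager) -> None:
        self._owners[service] = manager

    def publish(self, alert: Alert) -> str:
        return self._owners[alert.service].handle(alert)


if __name__ == "__main__":
    events = EventSystem()
    manager = ServiceManager("service-manager")
    # The service manager is itself a service, so the microkernel
    # plays the manager role for it.
    microkernel = ServiceManager("microkernel")
    events.register("monitoring-service", manager)
    events.register("service-manager", microkernel)
    print(events.publish(Alert("monitoring-service", "failed")))
    print(events.publish(Alert("service-manager", "failed")))
```

Modeling the microkernel with the same class as the service manager mirrors the abstract's point that management components are recovered in the same way as the services they manage.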
16 Claims
1. A system comprising:
at least one memory storage device to store a plurality of microkernel controllers and a plurality of pre-defined topologies; and
a service manager comprising one or more hardware processors, the plurality of microkernel controllers including at least one microkernel controller configured to deploy the service manager, the service manager configured to perform operations comprising:
parsing a first pre-defined topology from the plurality of pre-defined topologies, the first pre-defined topology specifying a plurality of logical types of resources to be used by a management service;
loading one or more controllers corresponding to the management service;
deploying the management service using the first pre-defined topology;
monitoring operation of the management service; and
dynamically re-deploying the management service using the first pre-defined topology responsive to a failure of the management service.
Dependent claims: 2, 3, 4, 5
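As an informal illustration of the operations recited in claim 1 (parse a topology, load controllers, deploy, monitor, re-deploy from the same topology on failure), the following sketch may help. It assumes a JSON topology document and stand-in `Controller` objects; the claims specify neither a topology format nor a controller interface, so every name here is hypothetical.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical topology document; the claims do not specify a format.
TOPOLOGY_JSON = '{"service": "monitoring", "resources": ["load_balancer", "app_server"]}'


def parse_topology(text: str) -> Dict[str, Any]:
    """Parse the topology and check that it names logical resource types."""
    topology = json.loads(text)
    if not topology.get("resources"):
        raise ValueError("topology must name at least one logical resource type")
    return topology


class Controller:
    """Stand-in controller that 'deploys' one logical resource type."""

    def __init__(self, resource_type: str) -> None:
        self.resource_type = resource_type

    def deploy(self) -> str:
        return f"{self.resource_type}:up"


class ServiceManager:
    def __init__(self, topology_text: str) -> None:
        self.topology = parse_topology(topology_text)
        # Load one controller per logical resource type in the topology.
        self.controllers = [Controller(r) for r in self.topology["resources"]]
        self.deploy_count = 0
        self.state: List[str] = []

    def deploy(self) -> None:
        self.state = [c.deploy() for c in self.controllers]
        self.deploy_count += 1

    def monitor_once(self, is_healthy: Callable[[], bool]) -> bool:
        """Run one health check; re-deploy from the same topology on failure.

        Returns True if a re-deployment was triggered.
        """
        if is_healthy():
            return False
        self.deploy()
        return True


if __name__ == "__main__":
    manager = ServiceManager(TOPOLOGY_JSON)
    manager.deploy()
    manager.monitor_once(lambda: False)  # simulated failure
    print(manager.deploy_count)
```

The point of the sketch is that recovery reuses the stored topology rather than any state captured from the failed instance.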
6. A computer-implemented method comprising:
receiving a pre-defined computing topology that specifies a plurality of logical types of computing resources to be used by a management service, the management service being one of a plurality of management services configured to collectively manage a plurality of domain services, the plurality of management services configured to collectively monitor the domain services and dynamically allocate computing resources to the plurality of domain services;
parsing the pre-defined computing topology;
loading one or more computing controllers corresponding to the management service;
deploying the management service using the pre-defined computing topology;
monitoring operation of the management service; and
dynamically re-deploying the management service using the pre-defined computing topology responsive to a failure of the management service, the dynamically re-deploying of the management service including:
allocating a replacement node; and
moving the failed management service to the replacement node.
Dependent claims: 7, 8, 9, 10
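The re-deployment step of claim 6 (allocate a replacement node, then move the failed management service to it) could be sketched roughly as follows. The `NodePool` class, its first-free-node allocation rule, and the node names are assumptions made for illustration; the claim does not say how a replacement node is chosen.

```python
from typing import Dict, List, Optional, Set


class NodePool:
    """Tracks which service, if any, occupies each node."""

    def __init__(self, nodes: List[str]) -> None:
        self.assignments: Dict[str, Optional[str]] = {n: None for n in nodes}
        self.failed: Set[str] = set()

    def place(self, service: str, node: str) -> None:
        self.assignments[node] = service

    def allocate_replacement(self) -> str:
        # Assumed rule: pick the first node that is free and not marked failed.
        for node, service in self.assignments.items():
            if service is None and node not in self.failed:
                return node
        raise RuntimeError("no free node available")

    def move_failed_service(self, service: str, failed_node: str) -> str:
        """Mark the node as failed and move the service to a replacement."""
        self.failed.add(failed_node)
        replacement = self.allocate_replacement()
        self.assignments[failed_node] = None
        self.assignments[replacement] = service
        return replacement


if __name__ == "__main__":
    pool = NodePool(["node-a", "node-b", "node-c"])
    pool.place("monitoring", "node-a")
    pool.place("provisioning", "node-b")
    print(pool.move_failed_service("monitoring", "node-a"))
```

Marking the failed node before allocating keeps it from being handed out again as its own replacement.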
11. A non-transitory computer-readable storage medium having instructions embodied thereon, the instructions executable by a processor for performing operations for managing one or more components of a management system, the operations comprising:
initiating a microkernel;
initiating a microkernel controller;
deploying, at the microkernel, a service manager using the microkernel controller;
parsing, at the service manager, a pre-defined topology that specifies a plurality of logical types of resources to be used by a management service, the management service being one of a plurality of management services configured to collectively manage a plurality of domain services, the plurality of management services configured to collectively monitor the domain services and dynamically allocate resources to the plurality of domain services;
loading, at the service manager, one or more controllers corresponding to the management service;
deploying the management service using the pre-defined topology;
detecting a failure of the management service; and
dynamically re-deploying the management service using the pre-defined topology responsive to the failure of the management service.
Dependent claims: 12, 13, 14, 15, 16
Specification