Methods and systems that diagnose and manage undesirable operational states of computing facilities
Abstract
The current document is directed to automatically, semi-automatically, and/or manually monitoring a computing facility, including a large distributed computing system, to detect and address undesirable operational states. The currently disclosed monitoring methods and systems employ case-based inference to diagnose and ameliorate undesirable operational states. In disclosed implementations, a database is maintained to store and provide access to records of previously handled undesirable operational states and the actions taken to remediate them, in order to facilitate case-based inference.
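As an illustration of the case-based inference described above, the following minimal sketch retrieves the stored case most similar to a current state and returns the steps recorded for it. The record layout (`Case`), the use of Euclidean distance as the similarity measure, and the example metric values are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass
class Case:
    """One previously handled undesirable state (hypothetical record layout)."""
    state: Tuple[float, ...]   # metric vector observed when the case occurred
    steps: List[str]           # administrative steps taken to remediate it


def most_similar_cases(current, cases, k=1):
    """Return the k stored cases whose state vectors lie closest to `current`.

    Euclidean distance is an assumed similarity measure, not one named
    in the source.
    """
    return sorted(cases, key=lambda c: dist(current, c.state))[:k]


# Hypothetical case base: (cpu_utilization, disk_utilization) vectors.
case_base = [
    Case((0.95, 0.40), ["migrate VMs off host"]),
    Case((0.30, 0.98), ["expand datastore", "purge old snapshots"]),
]

nearest = most_similar_cases((0.35, 0.92), case_base)
print(nearest[0].steps)
```

A real case base would likely normalize metrics before comparing them, since raw metrics on different scales would otherwise dominate the distance.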
20 Claims
1. A computing-facility monitoring-and-management system comprising:
one or more processors;
one or more memories;
one or more mass-storage devices;
operational state data stored in one or more of the one or more memories, the operational state data comprising previously encountered undesirable computing-facility operational states and administrative steps taken to address the previously encountered undesirable computing-facility operational states; and
computer instructions, stored in one or more of the one or more memories, wherein the instructions, when executed by one or more of the one or more processors, cause the computing-facility monitoring-and-management system to at least:
determine that a current operational state of the computing-facility monitoring-and-management system is undesirable, based on the current operational state being a threshold distance outside of an n-dimensional volume of desirable operational states for the computing-facility monitoring-and-management system;
identify, within the operational state data, information associated with one or more previously determined undesirable operational states similar to the current operational state of the computing-facility;
use the identified information to determine a set of one or more administrative steps to address the determined current undesirable operational state; and
carry out the identified administrative steps to drive the computing facility into a desirable operational state.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
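For illustration only, the determination that a state lies a threshold distance outside an n-dimensional volume can be sketched as follows. The sketch assumes the volume is an axis-aligned box given by per-metric lower and upper bounds and that distance is Euclidean; the claim does not fix the shape of the volume or the metric, and the function names are hypothetical.

```python
from math import sqrt


def distance_outside_box(state, lower, upper):
    """Euclidean distance from `state` to an axis-aligned box (0.0 if inside).

    The box is an assumed stand-in for the n-dimensional volume of
    desirable operational states.
    """
    # Per-dimension overshoot: how far the value falls below `lo` or above `hi`.
    excess = [max(lo - x, 0.0, x - hi) for x, lo, hi in zip(state, lower, upper)]
    return sqrt(sum(e * e for e in excess))


def is_undesirable(state, lower, upper, threshold):
    """True when the state is at least `threshold` away from the box."""
    return distance_outside_box(state, lower, upper) >= threshold


# Hypothetical 3-metric facility: each metric is desirable in [0.0, 0.8].
lower = (0.0, 0.0, 0.0)
upper = (0.8, 0.8, 0.8)

print(is_undesirable((0.5, 0.6, 0.2), lower, upper, threshold=0.05))  # inside the box
print(is_undesirable((1.1, 1.2, 0.5), lower, upper, threshold=0.05))  # well outside
```

The threshold keeps states that sit only marginally outside the volume from triggering remediation.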
16. A method that carries out administrative steps to drive a computing facility into a desirable operational state, the method comprising:
determining, by a computing-facility monitoring-and-management system, that a current operational state of the computing facility is undesirable, based on the current operational state being a threshold distance outside of an n-dimensional volume of desirable operational states for the computing-facility monitoring-and-management system;
identifying, by the computing-facility monitoring-and-management system, one or more information sets stored by the computing-facility monitoring-and-management system that contain operational state data associated with one or more previously determined undesirable operational states similar to the current undesirable operational state of the computing-facility;
using, by the computing-facility monitoring-and-management system, the identified one or more information sets to determine a set of one or more administrative steps to address the determined current undesirable operational state; and
carrying out, by the computing-facility monitoring-and-management system, the identified administrative steps to drive the computing facility into a desirable operational state.
Dependent claims: 17, 18, 19
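For illustration only, the four operations of the method (determining, identifying, using, carrying out) can be arranged as a single control flow. The function `handle_state`, its callable parameters, and the dictionary layout of an information set are hypothetical; merging the steps of all similar cases while dropping repeats is one assumed way to derive the set of steps, not a rule stated in the claim.

```python
def handle_state(state, is_undesirable, find_similar, execute_step):
    """Run the four operations once and return the steps carried out.

    All three callables are injected so the sketch makes no assumption
    about how states are judged, matched, or remediated.
    """
    # Determining: stop early when the state is already desirable.
    if not is_undesirable(state):
        return []

    # Identifying: fetch information sets for similar past states.
    similar = find_similar(state)

    # Using: merge the recorded steps in order, dropping repeats (assumed policy).
    steps = []
    for info in similar:
        for step in info["steps"]:
            if step not in steps:
                steps.append(step)

    # Carrying out: apply each step in turn.
    for step in steps:
        execute_step(step)
    return steps


# Usage with stub callables standing in for a real monitoring system.
log = []
done = handle_state(
    state=(0.97,),
    is_undesirable=lambda s: s[0] > 0.9,
    find_similar=lambda s: [
        {"steps": ["restart service", "add node"]},
        {"steps": ["add node"]},
    ],
    execute_step=log.append,
)
print(done)
```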
20. Computer instructions encoded in a physical data-storage device within a computing-facility monitoring-and-management system having one or more processors, one or more memories, one or more mass-storage devices, and operational state data stored in the one or more memories, the operational state data comprising previously encountered undesirable computing-facility operational states and administrative steps taken to address the previously encountered undesirable computing-facility operational states, wherein the instructions, when executed by one or more of the one or more processors, cause the computing-facility monitoring-and-management system to at least:
determine that a current operational state of the computing facility is undesirable, based on the current operational state being a threshold distance outside of an n-dimensional volume of desirable operational states for the computing-facility monitoring-and-management system;
identify one or more information sets stored by the computing-facility monitoring-and-management system that contain operational state data, the information being associated with one or more previously determined undesirable operational states similar to the current undesirable operational state of the computing-facility;
use the identified one or more information sets to determine a set of one or more administrative steps to address the determined current undesirable operational state; and
carry out the identified administrative steps to drive the computing facility into a desirable operational state.
Specification