Alert management
First Claim
1. A computer implemented method of handling alerts in a data center that includes multiple components in which a fault in one of the components can result in a cascade of faults in other components, the method comprising:
receiving, at one or more processing devices, a first alert that indicates a first fault related to a first component of the multiple components;
receiving, at the one or more processing devices, a second alert that indicates a second fault related to a second component of the multiple components, wherein the first component affects the second component such that the first fault caused the second fault;
determining, using the one or more processing devices, a correlation between the first alert and the second alert using a set of rules that is based on a directed graph that reflects dependencies associated with the multiple components, including a dependency of the second component on the first component;
based on the determined correlation, determining that the first fault is a root cause of the first alert and the second alert;
providing an indication that the first fault is the root cause of the first alert and second alert; and
predicting, based on the directed graph, triggering of at least a third alert that indicates a third fault in one of the multiple components wherein the third fault occurs due to the second fault.
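The steps of this claim can be illustrated with a minimal sketch. It is not the patented implementation: the component names, the `DEPENDENTS` mapping, and the functions `downstream`, `root_cause`, and `predict_alerts` are all hypothetical, and the "set of rules" is reduced to simple reachability in an acyclic directed graph. An edge from A to B is assumed to mean that B depends on A, so a fault in A can cascade to B.

```python
from collections import deque

# Hypothetical dependency graph: an edge A -> B means "B depends on A",
# so a fault in A can cascade to B. Names are illustrative only.
DEPENDENTS = {
    "power_unit": ["cooling_unit"],
    "cooling_unit": ["server_rack"],
    "server_rack": [],
}

def downstream(graph, component):
    """Return every component reachable from `component`, in BFS order."""
    seen, order, queue = {component}, [], deque([component])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order

def root_cause(graph, alerted):
    """Return alerted components that no other alerted component can reach.

    Assumes the graph is acyclic; in a cycle every alerted node would be
    "caused" by another and no root would be reported.
    """
    alerted = set(alerted)
    caused = set()
    for comp in alerted:
        caused.update(c for c in downstream(graph, comp) if c in alerted)
    return sorted(alerted - caused)

def predict_alerts(graph, alerted):
    """Return components not yet alerting that lie downstream of an alert."""
    alerted = set(alerted)
    predicted = []
    for comp in sorted(alerted):
        for c in downstream(graph, comp):
            if c not in alerted and c not in predicted:
                predicted.append(c)
    return predicted

alerts = ["cooling_unit", "power_unit"]    # second and first alerts
print(root_cause(DEPENDENTS, alerts))      # ['power_unit']
print(predict_alerts(DEPENDENTS, alerts))  # ['server_rack']
```

In this sketch the two received alerts correlate because `cooling_unit` is reachable from `power_unit`, so the power fault is reported as the root cause, and `server_rack` is predicted as the source of a third alert because it depends on the already faulted cooling unit.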
Abstract
A first alert and a second alert are received. The first alert indicates a first fault related to a first component of the multiple components, and the second alert indicates a second fault related to a second component of the multiple components. The first component affects the second component such that the first fault caused the second fault. A correlation between the first alert and the second alert is determined and, based on the determined correlation, a determination is made that the first fault is a root cause of the first alert and the second alert. An indication that the first fault is the root cause of the first alert and the second alert is provided.
18 Claims
7. A data center comprising:
multiple components in which a fault in one of the components can result in a cascade of faults in other components;
a plurality of monitoring devices configured to monitor the multiple components of the data center and trigger an alert on occurrence of a fault related to a component; and
an alarm unit configured to:
receive a first alert that indicates a first fault related to a first component of the multiple components,
receive a second alert that indicates a second fault related to a second component of the multiple components, wherein the first fault resulted in the second fault,
determine a correlation between the first alert and the second alert using a set of rules that is based on a directed graph that reflects dependencies associated with the multiple components, including a dependency of the second component on the first component,
based on the determined correlation, determine that the first fault is a root cause of the first and second alerts,
provide an indication that the first fault is the root cause of the first alert and the second alert, and
predict, based on the directed graph, triggering of at least a third alert that indicates a third fault in one of the multiple components wherein the third fault occurs due to the second fault.
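The arrangement of monitoring devices and an alarm unit in this claim can be pictured with a rough sketch. The classes `Alert`, `AlarmUnit`, and `Monitor`, the `is_healthy` callback, and the component names are assumptions made for illustration; the claim does not specify how a monitoring device detects a fault or how it delivers an alert. The sketch covers only the monitoring and receiving steps, not the correlation or prediction.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass(frozen=True)
class Alert:
    component: str
    fault: str

@dataclass
class AlarmUnit:
    """Collects alerts from all monitoring devices (hypothetical interface)."""
    received: List[Alert] = field(default_factory=list)

    def receive(self, alert: Alert) -> None:
        self.received.append(alert)

@dataclass
class Monitor:
    """Watches one component and triggers an alert when its check fails."""
    component: str
    is_healthy: Callable[[], bool]  # assumed health probe, illustrative only
    alarm_unit: AlarmUnit

    def poll(self) -> None:
        if not self.is_healthy():
            self.alarm_unit.receive(Alert(self.component, "health check failed"))

unit = AlarmUnit()
monitors = [
    Monitor("power_unit", lambda: False, unit),    # faulted
    Monitor("cooling_unit", lambda: False, unit),  # faulted
    Monitor("server_rack", lambda: True, unit),    # healthy
]
for m in monitors:
    m.poll()
print([a.component for a in unit.received])  # ['power_unit', 'cooling_unit']
```

Keeping the monitors unaware of the dependency graph means that only the alarm unit needs to know how components relate, which matches the claim's placement of the correlation step in the alarm unit.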
13. A computer program product, encoded on a computer readable storage device, operable to cause a processing device to perform operations comprising:
receiving a first alert that indicates a first fault related to a first component of the multiple components;
receiving a second alert that indicates a second fault related to a second component of the multiple components, wherein the first component affects the second component such that the first fault caused the second fault;
determining a correlation between the first alert and the second alert using a set of rules that is based on a directed graph that reflects dependencies of the multiple components, including a dependency of the second component on the first component;
based on the determined correlation, determining that the first fault is a root cause of the first alert and the second alert;
providing an indication that the first fault is the root cause of the first alert and second alert; and
predicting, based on the directed graph, triggering of at least a third alert that indicates a third fault in one of the multiple components wherein the third fault occurs due to the second fault.
Specification