Root cause analysis in a distributed network management architecture
First Claim
1. A method of determining the root cause of an event in a computer network having a distributed network management architecture, the method comprising:
detecting an event at at least one device component (DC) in said network;
for each source DC at which an event is detected, finding a data path within said network from the source DC's underlying network element to that of its acquaintance DC where present; and
identifying as the root cause any of said source DCs and any subject DCs in said data path that have detected an event and either of a) do not have an acquaintance and b) do not have a valid operational state with respect to its acquaintance whereas all other DCs along the data path at lower network layers than the source or subject DC have valid operational states with respect to their acquaintances.
Abstract
A method of determining the root cause of an event in a computer network having a distributed network management architecture, including: detecting an event at at least one device component (DC) in the network; for each source DC at which an event is detected, finding a data path within the network from the source DC's underlying network element to that of its acquaintance DC, where present; and identifying as the root cause any of the source DCs and the subject DCs in the data path that have detected an event and either do not have an acquaintance or do not have a valid operational state with respect to their acquaintance, whereas all other DCs along the data path at lower network layers than the source or subject DC have valid operational states with respect to their acquaintances.
11 Claims
1. A method of determining the root cause of an event in a computer network having a distributed network management architecture, the method comprising:
detecting an event at at least one device component (DC) in said network;
for each source DC at which an event is detected, finding a data path within said network from the source DC's underlying network element to that of its acquaintance DC where present; and
identifying as the root cause any of said source DCs and any subject DCs in said data path that have detected an event and either of a) do not have an acquaintance and b) do not have a valid operational state with respect to its acquaintance whereas all other DCs along the data path at lower network layers than the source or subject DC have valid operational states with respect to their acquaintances.
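The identification step of claim 1 can be read as a predicate over the DCs on a data path. The following is a minimal sketch under stated assumptions, not the patented implementation: the `DC` record, its field names, the numeric `layer` ordering (lower number meaning lower network layer), and the choice to skip DCs without an acquaintance when checking lower layers are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DC:
    """Hypothetical device-component record; all field names are illustrative."""
    name: str
    layer: int                          # assumption: lower number = lower network layer
    event_detected: bool = False
    acquaintance: Optional[str] = None  # name of the peer DC, if any
    state_valid: bool = True            # operational state with respect to the acquaintance


def lower_layers_valid(dc: DC, path: List[DC]) -> bool:
    """True if every other DC on the path at a lower layer is in a valid state.

    Assumption: DCs without an acquaintance have no state to check and are skipped.
    """
    return all(
        other.state_valid
        for other in path
        if other is not dc
        and other.layer < dc.layer
        and other.acquaintance is not None
    )


def root_causes(path: List[DC]) -> List[str]:
    """Names of DCs on the path that satisfy the root-cause conditions of claim 1."""
    causes = []
    for dc in path:
        if not dc.event_detected:
            continue
        if dc.acquaintance is None:
            # condition a): an event was detected and there is no acquaintance
            causes.append(dc.name)
        elif not dc.state_valid and lower_layers_valid(dc, path):
            # condition b): invalid state while all lower layers are valid
            causes.append(dc.name)
    return causes


# A failed port (layer 1) also breaks the IP interface above it (layer 3).
example_path = [
    DC("port", layer=1, event_detected=True, acquaintance="peer-port", state_valid=False),
    DC("ip-if", layer=3, event_detected=True, acquaintance="peer-ip", state_valid=False),
]
print(root_causes(example_path))
```

In this example only the port qualifies: the IP interface also reports an invalid state, but a DC at a lower layer is itself invalid, so the higher-layer symptom is not treated as the root cause.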
2. In a computer network comprising a plurality of network elements and a network management architecture comprising a plurality of agents, each of the agents corresponding to a different one of the network elements, and a plurality of device components (DC), each of the device components modeling at least one aspect of one of the network elements, the aspect being either of a physical and a functional characteristic of the network element, wherein each of the agents comprises a plurality of the device components, and wherein at least two of the device components within at least one of the agents are logically interconnected, each logical interconnection corresponding to either of a physical and a functional interconnection found within or between any of the network elements, a method of determining the root cause of an event in the distributed network management architecture, the method comprising the steps of:
detecting an event at at least one DC in said network;
for each DC at which an event is detected, said DC now referred to as a source DC;
  if said source DC does not have an acquaintance DC, determining the root cause of said event to be within said source DC's area of responsibility;
  if said source DC does have an acquaintance DC;
    finding a data path within said network from said source DC's underlying network element to said acquaintance DC's underlying network element;
    identifying those DCs whose area of responsibility lies along said data path;
    for each DC in said data path, now referred to as a subject DC;
      if an event is detected at said subject DC;
        if said subject DC has an acquaintance DC;
          if said subject DC does not have a valid operational state with respect to its acquaintance DC;
            if all other DCs along said data path at lower network layers than said subject DC have valid operational states with respect to their acquaintance DCs, determining the root cause of said event to be within the area of responsibility of said subject DC;
          if said subject DC has a valid operational state with respect to its acquaintance DC;
            if all other DCs along said data path at lower network layers than said subject DC have valid operational states with respect to their acquaintance DCs, determining the root cause of said event to be within the area of responsibility of said source DC; and
        if said subject DC does not have an acquaintance DC, determining the root cause of said event to be within the area of responsibility of said subject DC.
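The nested steps of claim 2 can be sketched end to end, including the path-finding step. This is an illustrative sketch only. The claim does not say how the data path is found; breadth-first search over a hypothetical element adjacency map `links` is an assumption made here. The `DC` fields, the numeric layer ordering, and the rule that a DC's area of responsibility lies on the path when its element is on the path are likewise assumptions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DC:
    """Hypothetical device-component record; all field names are illustrative."""
    name: str
    element: str                        # underlying network element
    layer: int                          # assumption: lower number = lower layer
    event: bool = False
    acquaintance: Optional[str] = None
    valid: bool = True                  # state with respect to the acquaintance


def find_path(links: Dict[str, List[str]], src: str, dst: str) -> List[str]:
    """Breadth-first search between elements (an assumed technique); [] if unreachable."""
    prev: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in links.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return []


def analyze(source: DC, dcs: Dict[str, DC], links: Dict[str, List[str]]) -> List[str]:
    """Names of DCs in whose area of responsibility the root cause is placed."""
    if source.acquaintance is None:
        return [source.name]
    peer = dcs[source.acquaintance]
    on_path = set(find_path(links, source.element, peer.element))
    subjects = [d for d in dcs.values() if d.element in on_path]
    causes = set()
    for subject in subjects:
        if not subject.event:
            continue
        if subject.acquaintance is None:
            causes.add(subject.name)
            continue
        lower_ok = all(
            o.valid for o in subjects
            if o is not subject and o.layer < subject.layer and o.acquaintance is not None
        )
        if lower_ok:
            # invalid subject: cause is the subject; valid subject: cause is the source
            causes.add(source.name if subject.valid else subject.name)
    return sorted(causes)


# Elements A - B - C; the link between the ports on B and C has failed.
links = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
dcs = {d.name: d for d in [
    DC("ip_a", "A", 3, event=True, acquaintance="ip_c", valid=False),
    DC("ip_c", "C", 3, event=True, acquaintance="ip_a", valid=False),
    DC("port_b", "B", 1, event=True, acquaintance="port_c", valid=False),
    DC("port_c", "C", 1, event=True, acquaintance="port_b", valid=False),
]}
print(analyze(dcs["ip_a"], dcs, links))
```

Starting from the IP-layer source `ip_a`, the walk attributes the event to the two port DCs, because they are invalid and nothing below them on the path is.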
4. A method of determining the root cause of an event in a computer network having a distributed network management architecture, the method comprising the steps of:
detecting an event at at least one device component (DC) in said network;
for each DC at which an event is detected, said DC now referred to as a source DC;
  if said source DC does not have an acquaintance DC, determining the root cause of said event to be within said source DC's area of responsibility;
  if said source DC does have an acquaintance DC;
    finding a data path within said network from said source DC's underlying network element to said acquaintance DC's underlying network element;
    identifying those DCs whose area of responsibility lies along said data path;
    for each DC in said data path, now referred to as a subject DC;
      if an event is detected at said subject DC;
        if said subject DC has an acquaintance DC;
          if said subject DC does not have a valid operational state with respect to its acquaintance DC;
            if all other DCs along said data path at lower network layers than said subject DC have valid operational states with respect to their acquaintance DCs, determining the root cause of said event to be within the area of responsibility of said subject DC;
          if said subject DC has a valid operational state with respect to its acquaintance DC;
            if all other DCs along said data path at lower network layers than said subject DC have valid operational states with respect to their acquaintance DCs, determining the root cause of said event to be within the area of responsibility of said source DC; and
        if said subject DC does not have an acquaintance DC, determining the root cause of said event to be within the area of responsibility of said subject DC.
6. In a computer network comprising a plurality of network elements and a network management architecture comprising a plurality of agents, each of the agents corresponding to a different one of the network elements, and a plurality of device components (DC), each of the device components modeling at least one aspect of one of the network elements, the aspect being either of a physical and a functional characteristic of the network element, wherein each of the agents comprises a plurality of the device components, and wherein at least two of the device components within at least one of the agents are logically interconnected, each logical interconnection corresponding to either of a physical and a functional interconnection found within or between any of the network elements, a method of identifying network elements that are affected by a root cause event in the distributed network management architecture, the method comprising the steps of:
identifying at least one DC in whose area of responsibility a root cause event occurred;
flagging all of said DCs as “not affected” by said root cause event;
flagging said DC in whose area of responsibility a root cause event occurred as a “propagation candidate”;
initiating a message specific to the root cause event;
for each DC flagged as a propagation candidate;
  flagging said DC flagged as a propagation candidate as an “affected candidate”;
  if the DC flagged as an affected candidate should ignore said message, flagging said DC flagged as an affected candidate as “not affected”;
  if the DC flagged as an affected candidate is required to propagate said message or a transformation thereof to at least one neighbor DC;
    propagating the message or a transformation thereof to said neighbor DCs; and
    flagging said neighbor DCs as “propagation candidates”,
wherein said DCs flagged as an affected candidate represent those network elements that are affected by said root cause event.
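The flagging procedure of claim 6 resembles a worklist traversal over neighboring DCs. The sketch below is one possible reading, not the patented implementation. The `neighbors` map, the `handle` callback (returning whether the DC is affected and which message, if any, it forwards), and the `find_affected` name are hypothetical. Two further assumptions are made: each DC is processed at most once so that cyclic topologies terminate, and a DC that ignores the message does not forward it.

```python
from collections import deque
from enum import Enum
from typing import Callable, Dict, List, Optional, Set, Tuple


class Flag(Enum):
    NOT_AFFECTED = "not affected"
    PROPAGATION_CANDIDATE = "propagation candidate"
    AFFECTED_CANDIDATE = "affected candidate"


# handle(dc, message) -> (affected, outgoing); outgoing is None when nothing is forwarded
Handler = Callable[[str, str], Tuple[bool, Optional[str]]]


def find_affected(
    neighbors: Dict[str, List[str]],
    root_dcs: List[str],
    message: str,
    handle: Handler,
) -> Set[str]:
    """Return the DCs left flagged as affected candidates."""
    flags = {dc: Flag.NOT_AFFECTED for dc in neighbors}
    queue = deque()
    for dc in root_dcs:
        flags[dc] = Flag.PROPAGATION_CANDIDATE
        queue.append((dc, message))
    seen = set(root_dcs)  # assumption: visit each DC once so cycles terminate

    while queue:
        dc, msg = queue.popleft()
        flags[dc] = Flag.AFFECTED_CANDIDATE
        affected, outgoing = handle(dc, msg)
        if not affected:
            # the DC ignores the message
            flags[dc] = Flag.NOT_AFFECTED
            continue
        if outgoing is None:
            # affected, but not required to propagate
            continue
        for nb in neighbors[dc]:
            if nb not in seen:
                seen.add(nb)
                flags[nb] = Flag.PROPAGATION_CANDIDATE
                queue.append((nb, outgoing))

    return {dc for dc, flag in flags.items() if flag is Flag.AFFECTED_CANDIDATE}


# Hypothetical topology: a port failure propagates up to IP and BGP; OSPF ignores it.
topology = {
    "port": ["eth"],
    "eth": ["port", "ip"],
    "ip": ["eth", "bgp", "ospf"],
    "bgp": ["ip"],
    "ospf": ["ip"],
}


def example_handler(dc: str, msg: str) -> Tuple[bool, Optional[str]]:
    if dc == "ospf":
        return (False, None)        # ignores the message
    if dc == "bgp":
        return (True, None)         # affected, does not propagate further
    if dc == "port":
        return (True, "link down")  # propagates a transformed message
    return (True, msg)              # propagates the message unchanged


print(sorted(find_affected(topology, ["port"], "port down", example_handler)))
```

The handler returns a pair because the claim distinguishes three outcomes for a candidate: ignoring the message, being affected without propagating, and being affected while propagating the message or a transformation of it.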
9. A method of identifying network elements that are affected by a root cause event in a computer network having a distributed network management architecture, the method comprising the steps of:
identifying at least one device component (DC) in whose area of responsibility a root cause event occurred;
flagging all of said DCs as “not affected” by said root cause event;
flagging said DC in whose area of responsibility a root cause event occurred as a “propagation candidate”;
initiating a message specific to the root cause event;
for each DC flagged as a propagation candidate;
  flagging said DC flagged as a propagation candidate as an “affected candidate”;
  if the DC flagged as an affected candidate should ignore said message, flagging said DC flagged as an affected candidate as “not affected”;
  if the DC flagged as an affected candidate is required to propagate said message or a transformation thereof to at least one neighbor DC;
    propagating the message or a transformation thereof to said neighbor DCs; and
    flagging said neighbor DCs as “propagation candidates”,
wherein said DCs flagged as an affected candidate represent those network elements that are affected by said root cause event.
Specification