Increasing resilience of a network service

US 8,869,035 B2
Filed: 06/29/2009
Issued: 10/21/2014
Est. Priority Date: 06/29/2009
Status: Active Grant

Abstract

A set of data is obtained, representing a graph of a computer network having a set of hardware nodes and a set of hardware links between the hardware nodes. The hardware links are represented as edges in the graph. A first subset (for example, a vertex cut set) of the set of hardware nodes is found, such that those of the hardware nodes in the first subset are able to withstand a maximum number of failures before the graph disconnects. The failures include node failures and/or edge failures. The hardware nodes in the first subset are ranked based on expected resiliency, to obtain a ranked list. Optionally, in case of a tie between two or more of the hardware nodes in the ranked list, the tie is broken using a sum of shortest path metric.
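The optional tie-break in the abstract can be illustrated with a minimal sketch. It assumes an unweighted graph (hop counts found by breadth-first search) and assumes that a smaller sum of shortest paths is preferred; the source states neither. The function names and the five-node example network are hypothetical.

```python
from collections import deque

def sum_of_shortest_paths(adj, source):
    """Sum of BFS hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(dist.values())

def break_ties(adj, tied_nodes):
    """Order tied nodes by ascending sum of shortest paths.

    Assumption: a smaller sum is better. The node name is a final,
    arbitrary key so that the order is deterministic.
    """
    return sorted(tied_nodes, key=lambda v: (sum_of_shortest_paths(adj, v), v))

# Hypothetical path network a - b - c - d - e
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
order = break_ties(adj, ["b", "c", "d"])
print(order)
```

In this example the central node `c` has the smallest sum of hop distances and is placed first.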

20 Claims

1. A method comprising:
- obtaining a set of data representing a graph of a computer network having a set of hardware nodes and a set of hardware links between the hardware nodes, the hardware links being represented as edges in the graph;
  
  finding a first subset of the set of hardware nodes, such that those of the hardware nodes in the first subset are able to withstand a maximum number of failures before the graph disconnects, the failures comprising at least one of node failures and edge failures; and
  
  ranking the hardware nodes in the first subset based on expected resiliency, to obtain a ranked list;
  
  wherein said expected resiliency is computed via $E[R_m(v)] = \sum_{f \in 2^E} P(f)\, N(v, f)$, wherein:
  
  E represents a set of edges among the first subset of hardware nodes f;
  
  P(f) represents a probability of all edges associated with the first subset f failing together;
  
  R_m(v) represents a resiliency measure of a service deployed at a given node v; and
  
  N(v, f) represents the number of nodes that can be reached from the given node v if all edges in the first subset f fail together; and
  
  wherein said ranking comprises:
  
  identifying edge and vertex independent paths from each hardware node in the first subset to all other hardware nodes;
  
  weighting each of the edge and vertex independent paths with the estimated failure probability of the vertex and the edges in each independent path; and
  
  ranking the hardware nodes in the first subset based on the weighted edge and vertex independent paths to represent expected resiliency of each hardware node by determining the number of edge and vertex independent paths derived from each hardware node and the estimated probability that each of said paths will fail.
- - 2. The method of claim 1, further comprising, in case of a tie between two or more of the hardware nodes in the ranked list, breaking the tie with a sum of shortest path metric.
  - 3. The method of claim 1, further comprising storing the ranked list in a tangible computer-readable recordable storage medium.
  - 4. The method of claim 1, further comprising displaying the ranked list to a human subject on a display device.
  - 5. The method of claim 1, further comprising physically locating a network service on a preferred one of the hardware nodes from the ranked list by loading hardware processor-executable program code, embodying the network service, onto a tangible computer-readable recordable storage medium associated with the preferred hardware node.
  - 6. The method of claim 1, wherein:
    - the step of finding the first subset comprises finding a vertex cut-set of the hardware nodes represented by the graph; and
      
      the step of ranking the hardware nodes in the first subset further comprising:
      
      using Menger's theorem to transform the graph and find edge and vertex independent paths from each hardware node in the first subset to all other hardware nodes; and
      
      employing a tournament method to rank the hardware nodes in the vertex cut set, based upon the weighting step.
  - 7. The method of claim 5, wherein the service comprises a network-monitoring service for monitoring the network.
  - 8. The method of claim 7, wherein the network-monitoring service functions by periodically polling all of the hardware nodes in the network, other than the preferred hardware node on which the monitoring service is located, to obtain status, and wherein the preferred hardware node is selected to maximize resiliency of the monitoring service to the node failures and the edge failures.
  - 9. The method of claim 8, wherein the preferred hardware node is selected based upon assuming that all of the hardware nodes and all of the edges representing the hardware links are equally likely to fail.
  - 10. The method of claim 1, wherein the ranked list is ranked based on expected resiliency to be obtained by locating a network service on each given hardware node in the ranked list.
  - 11. The method of claim 2, further comprising providing a system, wherein the system comprises distinct software modules, each of the distinct software modules being embodied on a tangible computer-readable recordable storage medium, and wherein the distinct software modules comprise a vertex cut finder module, an edge independent path finder module, a path weighting engine module, a network fault probability estimation engine module, a tournament method based resiliency ranking engine module, and a network distance based tie breaker engine module;
    - wherein:
      
      the finding of the first subset of the set of hardware nodes is carried out by the vertex cut finder module, implemented on at least one hardware processor;
      
      the ranking of the hardware nodes in the first subset based on the expected resiliency is carried out by the edge independent path finder module, the path weighting engine module, the network fault probability estimation engine module, and the tournament method based resiliency ranking engine module, implemented on the at least one hardware processor; and
      
      the breaking of the tie is carried out by the network distance based tie breaker engine module, implemented on the at least one hardware processor.
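The expected-resiliency sum recited in claim 1 can be evaluated by brute force on a small graph. The sketch below is illustrative only and rests on assumptions the claims do not state: edges fail independently with one common probability `p_fail`, P(f) is taken as the probability that exactly the edges in f fail, and N(v, f) excludes v itself. All function names are hypothetical, and the enumeration over all 2^|E| subsets is only practical for very small graphs.

```python
from itertools import combinations

def reachable_count(nodes, edges, failed, v):
    """N(v, f): nodes other than v still reachable from v once `failed` edges are removed."""
    adj = {n: set() for n in nodes}
    for edge in edges:
        if edge not in failed:
            a, b = edge
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) - 1

def expected_resiliency(nodes, edges, p_fail, v):
    """Sum of P(f) * N(v, f) over every subset f of the edge set."""
    total = 0.0
    for k in range(len(edges) + 1):
        for failed in combinations(edges, k):
            # Assumed model: independent failures, so exactly these k edges
            # fail with probability p^k * (1 - p)^(|E| - k).
            prob = p_fail ** k * (1 - p_fail) ** (len(edges) - k)
            total += prob * reachable_count(nodes, edges, set(failed), v)
    return total

# Hypothetical three-node path a - b - c
nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
score_a = expected_resiliency(nodes, edges, 0.5, "a")
score_b = expected_resiliency(nodes, edges, 0.5, "b")
print(score_a, score_b)
```

Under these assumptions the middle node `b` scores higher than the end node `a`, which matches the intuition that a service placed at `b` loses fewer nodes to a single link failure.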

12. A computer program product comprising a tangible non-transitory computer readable recordable storage medium including computer usable program code, the computer program product including:
- computer usable program code for obtaining a set of data representing a graph of a computer network having a set of hardware nodes and a set of hardware links between the hardware nodes, the hardware links being represented as edges in the graph;
  
  computer usable program code for finding a first subset of the set of hardware nodes, such that those of the hardware nodes in the first subset are able to withstand a maximum number of failures before the graph disconnects, the failures comprising at least one of node failures and edge failures; and
  
  computer usable program code for ranking the hardware nodes in the first subset based on expected resiliency, to obtain a ranked list;
  
  wherein said expected resiliency is computed via $E[R_m(v)] = \sum_{f \in 2^E} P(f)\, N(v, f)$, wherein:
  
  E represents a set of edges among the first subset of hardware nodes f;
  
  P(f) represents a probability of all edges associated with the first subset f failing together;
  
  R_m(v) represents a resiliency measure of a service deployed at a given node v; and
  
  N(v, f) represents the number of nodes that can be reached from the given node v if all edges in the first subset f fail together; and
  
  wherein said ranking comprises:
  
  identifying edge and vertex independent paths from each hardware node in the first subset to all other hardware nodes;
  
  weighting each of the edge and vertex independent paths with the estimated failure probability of the vertex and the edges in each independent path; and
  
  ranking the hardware nodes in the first subset based on the weighted edge and vertex independent paths to represent expected resiliency of each hardware node by determining the number of edge and vertex independent paths derived from each hardware node and the estimated probability that each of said paths will fail.
- - 13. The computer program product of claim 12, further comprising computer usable program code for, in case of a tie between two or more of the hardware nodes in the ranked list, breaking the tie with a sum of shortest path metric.
  - 14. The computer program product of claim 12, wherein:
    - the computer usable program code for finding the first subset comprises computer usable program code for finding a vertex cut-set of the hardware nodes represented by the graph; and
      
      the computer usable program code for ranking the hardware nodes in the first subset further comprising:
      
      computer usable program code for using Menger's theorem to transform the graph and find edge and vertex independent paths from each hardware node in the first subset to all other hardware nodes; and
      
      computer usable program code for employing a tournament method to rank the hardware nodes in the vertex cut set, based upon the weighting step.
  - 15. The computer program product of claim 13, further comprising distinct software modules, each of the distinct software modules being embodied on the tangible computer-readable recordable storage medium, the distinct software modules comprising a vertex cut finder module, an edge independent path finder module, a path weighting engine module, a network fault probability estimation engine module, a tournament method based resiliency ranking engine module, and a network distance based tie breaker engine module;
    - wherein:
      
      the vertex cut finder module comprises the computer usable program code for finding of the first subset of the set of hardware;
      
      the edge independent path finder module, the path weighting engine module, the network fault probability estimation engine module, and the tournament method based resiliency ranking engine module comprise the computer usable program code for the ranking of the hardware nodes in the first subset based on the expected resiliency; and
      
      the network distance based tie breaker engine module comprises the computer usable program code for the breaking of the tie.
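Claim 14 refers to using Menger's theorem to transform the graph and find independent paths. One common way to do this, shown here as a sketch and not necessarily the transformation the patent uses, is to split every node into an "in" half and an "out" half joined by a unit-capacity arc and then run a maximum-flow computation. The flow value equals the number of paths between two nodes that share no edge and no intermediate vertex. The function name and the example graph are hypothetical.

```python
from collections import defaultdict, deque

def count_vertex_independent_paths(nodes, edges, s, t):
    """Count s-t paths that share no intermediate vertex and no edge."""
    cap = defaultdict(int)    # residual capacity keyed by (from, to)
    nbrs = defaultdict(set)   # adjacency of the residual graph

    def add_arc(a, b, c):
        cap[(a, b)] += c
        nbrs[a].add(b)
        nbrs[b].add(a)        # reverse arc, initially zero capacity

    for v in nodes:
        # Intermediate nodes may carry one path; the endpoints are not limited.
        add_arc((v, "in"), (v, "out"), len(nodes) if v in (s, t) else 1)
    for a, b in edges:
        # Each undirected link becomes two unit-capacity arcs.
        add_arc((a, "out"), (b, "in"), 1)
        add_arc((b, "out"), (a, "in"), 1)

    source, sink = (s, "in"), (t, "out")
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for w in nbrs[u]:
                if w not in parent and cap[(u, w)] > 0:
                    parent[w] = u
                    queue.append(w)
        if sink not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        w = sink
        while parent[w] is not None:
            u = parent[w]
            cap[(u, w)] -= 1
            cap[(w, u)] += 1
            w = u
        flow += 1

# Hypothetical four-node ring a - b - c - d - a
ring_nodes = ["a", "b", "c", "d"]
ring_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
ring_paths = count_vertex_independent_paths(ring_nodes, ring_edges, "a", "c")
print(ring_paths)
```

On the ring, `a` reaches `c` through `b` and through `d`, so two independent paths are counted.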

16. An apparatus comprising:
- a memory; and
  
  at least one processor, coupled to the memory, and operative to:
  
  obtain a set of data representing a graph of a computer network having a set of hardware nodes and a set of hardware links between the hardware nodes, the hardware links being represented as edges in the graph;
  
  find a first subset of the set of hardware nodes, such that those of the hardware nodes in the first subset are able to withstand a maximum number of failures before the graph disconnects, the failures comprising at least one of node failures and edge failures; and
  
  rank the hardware nodes in the first subset based on expected resiliency, to obtain a ranked list;
  
  wherein said expected resiliency is computed via $E[R_m(v)] = \sum_{f \in 2^E} P(f)\, N(v, f)$, wherein:
  
  E represents a set of edges among the first subset of hardware nodes f;
  
  P(f) represents a probability of all edges associated with the first subset f failing together;
  
  R_m(v) represents a resiliency measure of a service deployed at a given node v; and
  
  N(v, f) represents the number of nodes that can be reached from the given node v if all edges in the first subset f fail together; and
  
  wherein said ranking comprises:
  
  identifying edge and vertex independent paths from each hardware node in the first subset to all other hardware nodes;
  
  weighting each of the edge and vertex independent paths with the estimated failure probability of the vertex and the edges in each independent path; and
  
  ranking the hardware nodes in the first subset based on the weighted edge and vertex independent paths to represent expected resiliency of each hardware node by determining the number of edge and vertex independent paths derived from each hardware node and the estimated probability that each of said paths will fail.
- - 17. The apparatus of claim 16, wherein the at least one processor is further operative, in case of a tie between two or more of the hardware nodes in the ranked list, to break the tie with a sum of shortest path metric.
  - 18. The apparatus of claim 16, wherein the at least one processor is operative to:
    - find the first subset by finding a vertex cut-set of the hardware nodes represented by the graph; and
      
      rank the hardware nodes in the first subset by:
      
      using Menger's theorem to transform the graph and find edge and vertex independent paths from each hardware node in the first subset to all other hardware nodes; and
      
      employing a tournament method to rank the hardware nodes in the vertex cut set, based upon the weighting step.
  - 19. The apparatus of claim 17, further comprising a tangible computer-readable recordable storage medium having distinct software modules embodied thereon, wherein the distinct software modules comprise a vertex cut finder module, an edge independent path finder module, a path weighting engine module, a network fault probability estimation engine module, a tournament method based resiliency ranking engine module, and a network distance based tie breaker engine module;
    - wherein:
      
      the finding of the first subset of the set of hardware nodes is carried out by the vertex cut finder module, implemented on the at least one processor;
      
      the ranking of the hardware nodes in the first subset based on the expected resiliency is carried out by the edge independent path finder module, the path weighting engine module, the network fault probability estimation engine module, and the tournament method based resiliency ranking engine module, implemented on the at least one processor; and
      
      the breaking of the tie is carried out by the network distance based tie breaker engine module, implemented on the at least one processor.
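The path weighting and ranking steps handled by the path weighting engine and resiliency ranking engine modules might be approximated as follows. This is a sketch under stated assumptions, not the claimed tournament method: failures are treated as independent, each path is weighted by the probability that all of its edges and all of its vertices after the starting node survive, a node's score is the sum of those weights (the expected number of surviving paths), and a plain sort stands in for the tournament ranking. The function names, the probabilities, and the example paths are hypothetical.

```python
def path_survival(path, p_node, p_edge):
    """Probability that every edge and every vertex after the first survives.

    Assumption: node and edge failures are independent.
    """
    prob = 1.0
    for a, b in zip(path, path[1:]):
        prob *= (1 - p_edge[frozenset((a, b))]) * (1 - p_node[b])
    return prob

def rank_by_weighted_paths(paths_by_node, p_node, p_edge):
    """Rank candidate nodes by the summed survival weight of their independent paths."""
    scores = {
        v: sum(path_survival(path, p_node, p_edge) for path in paths)
        for v, paths in paths_by_node.items()
    }
    # Higher score first; the node name makes the order deterministic.
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    return ranked, scores

# Hypothetical failure estimates: links fail half the time, nodes never fail.
p_node = {"x": 0.0, "y": 0.0, "m": 0.0, "t": 0.0}
p_edge = {
    frozenset(("x", "t")): 0.5,
    frozenset(("x", "m")): 0.5,
    frozenset(("m", "t")): 0.5,
    frozenset(("y", "m")): 0.5,
}
# Hypothetical independent paths from each candidate node to target t.
paths_by_node = {
    "x": [["x", "t"], ["x", "m", "t"]],
    "y": [["y", "m", "t"]],
}
ranked, scores = rank_by_weighted_paths(paths_by_node, p_node, p_edge)
print(ranked, scores)
```

Node `x` ranks first here because it has more independent paths and each of them is at least as likely to survive as the single path from `y`.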

20. An apparatus comprising:
- means for obtaining a set of data representing a graph of a computer network having a set of hardware nodes and a set of hardware links between the hardware nodes, the hardware links being represented as edges in the graph;
  
  means for finding a first subset of the set of hardware nodes, such that those of the hardware nodes in the first subset are able to withstand a maximum number of failures before the graph disconnects, the failures comprising at least one of node failures and edge failures; and
  
  means for ranking the hardware nodes in the first subset based on expected resiliency, to obtain a ranked list;
  
  wherein said expected resiliency is computed via $E[R_m(v)] = \sum_{f \in 2^E} P(f)\, N(v, f)$, wherein:
  
  E represents a set of edges among the first subset of hardware nodes f;
  
  P(f) represents a probability of all edges associated with the first subset f failing together;
  
  R_m(v) represents a resiliency measure of a service deployed at a given node v; and
  
  N(v, f) represents the number of nodes that can be reached from the given node v if all edges in the first subset f fail together; and
  
  wherein said ranking comprises:
  
  identifying edge and vertex independent paths from each hardware node in the first subset to all other hardware nodes;
  
  weighting each of the edge and vertex independent paths with the estimated failure probability of the vertex and the edges in each independent path; and
  
  ranking the hardware nodes in the first subset based on the weighted edge and vertex independent paths to represent expected resiliency of each hardware node by determining the number of edge and vertex independent paths derived from each hardware node and the estimated probability that each of said paths will fail.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Banerjee, Dipyaman; Srivatsa, Mudhakar; Madduri, Venkateswara R
Primary Examiner(s): Pesin, Boris
Assistant Examiner(s): GREENE, SABRINA LETICIA
Application Number: US12/493,806
Publication Number: US 20100332991A1
Time in Patent Office: 1,940 Days
Field of Search: 715/736, 715/734, 715/733
US Class Current: 715/736
CPC Class Codes:
H04L 41/12   Discovery or management of ...
H04L 41/22   comprising specially adapte...
H04L 43/10   Active monitoring, e.g. hea...
