Dynamic generation of deadlock-free routings
First Claim
Patent Images
1. A method in a network with network components including a plurality of nodes that send packets using predefined routings, the method performed by a local one of the plurality of nodes comprising:
- detecting when a failure of one of the network components occurs;
buffering packets that originate from the local node and that travel through the local node responsive to detecting the failure;
while the packets are being buffered, constructing initial routings for the plurality of nodes that avoid the failed network component and that avoid deadlock;
attempting to improve the initial routings such that improved initial routings are generated by performing the following steps for at least one pair of the plurality of nodes;
identifying an additional path between the pair of nodes with a shorter length than the routing for the pair of nodes in the initial routings, determining whether the additional path subjects the network to deadlock, and adding the additional path as a routing between the pair of nodes in the improved initial routings when it is determined that the additional path does not subject the network to deadlock, updating the plurality of the nodes with the improved initial routings; and
routing the buffered packets using the improved initial routings.
2 Assignments
0 Petitions
Accused Products
Abstract
In accordance with methods and systems consistent with the present invention, an improved failure recovery system is provided that, upon detecting a failure, generates new routings for the network which (1) avoid the failure and (2) avoid deadlock. In this manner, after a failure is detected, the network remains as operational as possible while still avoiding deadlock. Thus, by using the improved failure recovery system, a network failure has a much less severe impact on the network than in conventional systems.
69 Citations
26 Claims
-
1. A method in a network with network components including a plurality of nodes that send packets using predefined routings, the method performed by a local one of the plurality of nodes comprising:
-
detecting when a failure of one of the network components occurs;
buffering packets that originate from the local node and that travel through the local node responsive to detecting the failure;
while the packets are being buffered, constructing initial routings for the plurality of nodes that avoid the failed network component and that avoid deadlock;
attempting to improve the initial routings such that improved initial routings are generated by performing the following steps for at least one pair of the plurality of nodes;
identifying an additional path between the pair of nodes with a shorter length than the routing for the pair of nodes in the initial routings, determining whether the additional path subjects the network to deadlock, and adding the additional path as a routing between the pair of nodes in the improved initial routings when it is determined that the additional path does not subject the network to deadlock, updating the plurality of the nodes with the improved initial routings; and
routing the buffered packets using the improved initial routings. - View Dependent Claims (2, 3)
determining whether the initial routings contain a routing for the pair of nodes; and
identifying a path between the pair of nodes, determining whether the identified path subjects the network to deadlock, and adding the identified path as a routing between the pair of nodes in the improved initial routings when it is determined that the identified path does not subject the network to deadlock and when it is determined that the initial routings do not contain the routing for the pair of nodes.
-
-
3. The method of claim 1, wherein the network components include a plurality of links, and wherein the constructing step includes:
-
labeling the links with a numerical value; and
determining an initial routing for each pair of nodes such that when the initial routing traverses more than one link, the labels in the traversed link follow a predefined sequence.
-
-
4. A method in a network with network components including a plurality of nodes that send packets using predefined routings, the method performed by a local one of the plurality of nodes comprising:
-
detecting when a failure of one of the network components occurs;
buffering packets that originate from the local node and that travel through the local node responsive to detecting the failure;
while the packets are being buffered, constructing initial routings for the plurality of nodes that avoid the failed network component and that avoid deadlock;
attempting to improve the initial routings such that improved initial routings are generated by performing the following steps for at least one pair of the plurality of nodes;
identifying an additional path between the pair of nodes with a shorter length than the routing for the pair of nodes in the initial routings, determining whether the additional path subjects the network to deadlock by;
determining a cycle produced by the additional path when it is determined that the additional path subjects the network to deadlock, the cycle comprising paths that form the cycle;
replacing in the improved initial routings one of the paths that form the cycle other than the additional path with a new path; and
determining whether the improved initial routings with the new path subjects the network to deadlock;
adding the additional path as a routing between the pair of nodes in the improved initial routings when it is determined that the additional path does not subject the network to deadlock, updating the plurality of the nodes with the improved initial routings; and
routing the buffered packets using the improved initial routings.
-
-
5. A method in a distributed system containing nodes interconnected via links, each node containing a routing table with routings for routing traffic, the method comprising:
-
initiating operation of the distributed system such that the traffic is routed through the nodes using the routings contained in the routing tables; and
while the distributed system remains operational, generating, by one of the nodes, new routings for the routing tables that avoid deadlock; and
updating, by the one node, the routing tables to utilize the new routings. - View Dependent Claims (6)
generating the new routings responsive to detection of a network failure.
-
-
7. A method in a data processing system having a network with nodes and links interconnecting the nodes, comprising the steps of:
-
assigning numerical values to each of the links;
identifying, for each pair of the nodes, a path through the network that traverses at least one of the links;
determining whether the numerical values for each link of each path that traverses more than one link follows a predefined sequence; and
determining that routing traffic through the network using the paths avoids deadlock when it is determined that the numerical values for each link of each path that traverses more than one link follows the predefined sequence. - View Dependent Claims (8, 9)
determining whether the numerical values for each link of each path that traverses more than one link form the increasing numerical order.
-
-
9. The method of claim 8, wherein at least two of the nodes are partner nodes connected by a partner link, wherein the numerical values range up to a highest numerical value, and wherein the step of assigning a numerical value includes:
assigning the highest numerical value to the partner link.
-
10. A method in a data processing system with a network having nodes and routings for routing traffic between a plurality of the nodes, wherein the routings do not include a routing for traffic between a first of the nodes and a second of the nodes, comprising the steps of:
-
identifying a proposed route for routing traffic between the first node and the second node;
determining whether adding the proposed route to the routings subjects the network to deadlock;
adding the proposed route to the routings when it is determined that the addition of the proposed route to the routings does not subject the network to deadlock;
identifying a cycle created by the proposed route when it is determined that the addition of the proposed route to the routings subjects the network to deadlock, wherein the cycle comprises the proposed route and other routes that form the cycle; and
replacing one of the other routes with a new route to avoid deadlock. - View Dependent Claims (11)
determining whether replacing the one of the other routes with the new route subjects the network to deadlock; and
replacing the one other route with the new route when it is determined that the replacing of the one other route with the new route does not subject the network to deadlock.
-
-
12. A method in a data processing system with a network having nodes and links interconnecting the nodes, the data processing system having routings for routing traffic between a plurality of the nodes, wherein the routings do not include a routing for traffic between a first of the nodes and a second of the nodes, comprising the steps of:
-
adding a new routing to the routings for transferring traffic between the first node and the second node;
determining that the routings subject the network to deadlock; and
replacing the new routing with a second new routing to render the network deadlock free;
identifying a cycle created by the second new routing when it is determined that replacing the new routing with the second new routing subjects the network to deadlock, wherein the cycle comprises the second new routing and other routings that form the cycle; and
replacing one of the other routings with a third new routing to avoid deadlock. - View Dependent Claims (13)
replacing one of the cycle-forming routings to destroy the cycle.
-
-
14. A distributed system with a plurality of nodes that route traffic using routings in routing tables stored in each of the nodes, comprising:
-
a failure recovery system included in one of the plurality of nodes, comprising;
a memory containing a failure recovery program having code that detects when a network failure occurs, having code that generates new routings for the routing tables that avoid the network failure and that avoid deadlock when the network failure occurs, and having code that updates the routings in the routing tables with the new routings so that the network failure is avoided and deadlock is avoided; and
a processor for running the failure recovery program.
-
-
15. A computer-readable medium containing instructions for controlling a distributed system to perform a method, the distributed system containing nodes interconnected via links, each node containing a routing table with routings for routing traffic, the method comprising:
-
initiating operation of the distributed system such that the traffic is routed through the nodes using the routings contained in the routing tables; and
while the distributed system remains operational, generating, by one of the nodes, new routings for the routing tables that avoid deadlock; and
updating, by one of the nodes, the routing tables to utilize the new routings. - View Dependent Claims (16)
generating the new routings responsive to detection of a network failure.
-
-
17. A computer-readable medium containing instructions for controlling a data processing system to perform a method, the data processing system having a network with nodes and links interconnecting the nodes, the method comprising the steps of:
-
assigning numerical values to each of the links;
identifying, for each pair of the nodes, a path through the network that traverses at least one of the links;
determining whether the numerical values for each link of each path that traverses more than one link follows a predefined sequence; and
determining that routing traffic through the network using the paths avoids deadlock when it is determined that the numerical values for each link of each path that traverses more than one link follows the predefined sequence. - View Dependent Claims (18, 19)
determining whether the numerical values for each link of each path that traverses more than one link form the increasing numerical order.
-
-
19. The computer-readable medium of claim 18, wherein at least two of the nodes are partner nodes connected by a partner link, wherein the numerical values range up to a highest numerical value and wherein the step of assigning a numerical value includes:
assigning the highest numerical value to the partner link.
-
20. A computer-readable medium containing instructions for controlling a data processing system to perform a method, the data processing system having a network with nodes and routings for routing traffic between a plurality of the nodes, wherein the routings do not include a routing for traffic between a first of the nodes and a second of the nodes, the method comprising the steps of:
-
identifying a proposed route for routing traffic between the first node and the second node;
determining whether adding the proposed route to the routings subjects the network to deadlock;
adding the proposed route to the routings when it is determined that the addition of the proposed route to the routings does not subject the network to deadlock;
identifying a cycle created by the proposed route when it is determined that the addition of the proposed route to the routings subjects the network to deadlock, wherein the cycle comprises the proposed route and other routes that form the cycle; and
replacing one of the other routes with a new route to avoid deadlock. - View Dependent Claims (21)
determining whether replacing the one of the other routes with the new route subjects the network to deadlock; and
replacing the one other route with the new route when it is determined that the replacing of the one other route with the new route does not subject the network to deadlock.
-
-
22. A computer-readable medium containing instructions for controlling a data processing system to perform a method, the data processing system having a network with nodes and links interconnecting the nodes, the data processing system having routings for routing traffic between a plurality of the nodes, wherein the routings do not include a routing for traffic between a first of the nodes and a second of the nodes, the method comprising the steps of:
-
adding a new routing to the routings for transferring traffic between the first node and the second node;
determining that the routings subject the network to deadlock;
replacing the new routing with a second new routing to render the network deadlock free;
identifying a cycle created by the second new routing when it is determined that replacing the new routing with the second new routing subjects the network to deadlock, wherein the cycle comprises the second new routing and other routings that form the cycle; and
replacing one of the other routings with a third new routing to avoid deadlock. - View Dependent Claims (23)
replacing one of the cycle-forming routings to destroy the cycle.
-
-
24. A method in a network with network components including a plurality of nodes that send packets using predefined routings, the method performed by a local one of the plurality of nodes comprising:
-
detecting when a failure of one of the network components occurs;
buffering packets that originate from the local node and that travel through the local node responsive to detecting the failure;
while the packets are being buffered, constructing initial routings for the plurality of nodes that avoid the failed network component and that avoid deadlock;
attempting to improve the initial routings such that improved initial routings are generated by performing the following steps for at least one pair of the plurality of nodes;
identifying an additional path between the pair of nodes with a shorter length than the routing for the pair of nodes in the initial routings, determining whether the additional path subjects the network to deadlock including the steps of;
determining a cycle produced by the additional path when it is determined that the additional path subjects the network to deadlock, the cycle comprising paths that form the cycle, replacing in the improved initial routings one of the paths that form the cycle other than the additional path with a new path, and determining, following the replacing step, whether the improved initial routings subjects the network to deadlock;
adding the additional path as a routing between the pair of nodes in the improved initial routings when it is determined that the additional path does not subject the network to deadlock;
updating the plurality of the nodes with the improved initial routings; and
routings the buffered packets using the improved initial routings.
-
-
25. A method in a data processing system with a network having nodes and routings for routing traffic between a plurality of the nodes, wherein the routings do not include a routing for traffic between a first of the nodes and a second of the nodes, comprising the steps of:
-
identifying a proposed route for routing traffic between the first node and the second node;
determining whether adding the proposed route to the routings subjects the network to deadlock;
adding the proposed route to the routings when it is determined that the addition of the proposed route to the routings does not subject the network to deadlock;
identifying a cycle created by the proposed route when it is determined that the addition of the proposed route to the routings subjects the network to deadlock, wherein the cycle comprises the proposed route and other routes that form the cycle; and
replacing one of the other routes with a new route to avoid deadlock.
-
-
26. A computer-readable medium containing instructions for controlling a data processing system to perform a method, the data processing system having a network with nodes and routings for routing traffic between a plurality of the nodes, wherein the routings do not include a routing for traffic between a first of the nodes and a second of the nodes, the method comprising the steps of:
-
identifying a proposed route for routing traffic between the first node and the second node;
determining whether adding the proposed route to the routings subjects the network to deadlock;
adding the proposed route to the routings when it is determined that the addition of the proposed route to the routings does not subject the network to deadlock;
identifying a cycle created by the proposed route when it is determined that the addition of the proposed route to the routings subjects the network to deadlock, wherein the cycle comprises the proposed route and other routes that form the cycle; and
replacing one of the other routes with a new route to avoid deadlock.
-
Specification