Stateless redundancy in a network device
First Claim
1. A method to provide fault tolerance, comprising:
identifying a first control element as a primary control element;
identifying a second control element as a secondary control element;
synchronizing information between said primary and secondary control elements, wherein said synchronizing comprises receiving said information from a forwarding element at both said primary and secondary control elements, wherein said information comprises data packets and forwarding element state information;
detecting a failure condition in said primary control element; and
identifying said secondary element as said primary control element in case of said failure condition;
wherein said detecting comprises communicating status messages between said forwarding element and said primary control element on a periodic basis, determining whether each status message is received within a predetermined interval, and identifying said failure condition in accordance with said determination.
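The detection and failover steps recited in this claim can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the patented implementation: the names `ControlElement`, `FaultDetector`, and `fail_over`, the role labels, and the use of explicit timestamps in place of a real clock are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class ControlElement:
    name: str
    role: str  # "primary", "secondary", or "failed" (hypothetical labels)


class FaultDetector:
    """Tracks periodic status messages from the primary control element.

    A failure condition is identified when no status message has arrived
    within the predetermined interval.
    """

    def __init__(self, interval: float, start: float = 0.0) -> None:
        self.interval = interval   # predetermined interval, in seconds
        self.last_seen = start     # time the last status message arrived

    def on_status_message(self, now: float) -> None:
        # Record the arrival time of a status message.
        self.last_seen = now

    def has_failed(self, now: float) -> bool:
        # True when the gap since the last message exceeds the interval.
        return (now - self.last_seen) > self.interval


def fail_over(primary: ControlElement, secondary: ControlElement) -> ControlElement:
    """Identify the secondary control element as the new primary."""
    primary.role = "failed"
    secondary.role = "primary"
    return secondary


if __name__ == "__main__":
    ce_a = ControlElement("ce-a", "primary")
    ce_b = ControlElement("ce-b", "secondary")
    detector = FaultDetector(interval=1.0)

    detector.on_status_message(now=0.5)   # status message arrives on time
    if detector.has_failed(now=2.0):      # no further message by t=2.0
        active = fail_over(ce_a, ce_b)
        print(active.name, active.role)
```

Passing timestamps explicitly keeps the sketch deterministic; a real device would presumably read a monotonic clock and run the check on a timer.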
Abstract
A network device may be configured with a primary control element and a secondary control element. Both the primary and secondary control elements may receive information from one or more forwarding elements. Both the primary and secondary control elements may send control information to the forwarding element. The control information from the secondary control element, however, may be discarded either at the forwarding element or at a blocking agent between the secondary control element and the forwarding element. In this manner, the secondary control element may be synchronized with the primary control element in a stateless manner. If a failure condition is detected, control plane operations may be performed using the secondary control element. As a result, the network device may become fault tolerant, and experience reduced amounts of down time due to failure of one or more elements within the network device.
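The stateless synchronization described in the abstract can be sketched as follows. This is a simplified illustration under assumptions, not the patent's design: the class names `ControlElement`, `BlockingAgent`, and `ForwardingElement`, the key/value state model, and the tuple message format are all hypothetical.

```python
class ControlElement:
    """Builds its own state purely from what the forwarding element sends."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = {}

    def receive(self, key: str, value: str) -> None:
        self.state[key] = value

    def control_message(self, key: str) -> tuple:
        # Both control elements emit control information for the same input.
        return (self.name, key, self.state.get(key))


class BlockingAgent:
    """Discards control information from any element that is not active."""

    def __init__(self, active_name: str) -> None:
        self.active_name = active_name

    def filter(self, messages: list) -> list:
        return [m for m in messages if m[0] == self.active_name]


class ForwardingElement:
    def __init__(self, control_elements: list, agent: BlockingAgent) -> None:
        self.control_elements = control_elements
        self.agent = agent
        self.applied = []  # control information actually acted upon

    def publish(self, key: str, value: str) -> None:
        # The same information goes to every control element, so no
        # state is copied directly from primary to secondary.
        for ce in self.control_elements:
            ce.receive(key, value)

    def collect_control(self, key: str) -> None:
        messages = [ce.control_message(key) for ce in self.control_elements]
        self.applied.extend(self.agent.filter(messages))


if __name__ == "__main__":
    primary, secondary = ControlElement("ce-a"), ControlElement("ce-b")
    agent = BlockingAgent(active_name="ce-a")
    fe = ForwardingElement([primary, secondary], agent)

    fe.publish("port1", "up")
    fe.collect_control("port1")
    print(fe.applied)                        # only the primary's message
    print(primary.state == secondary.state)  # both hold the same state
```

Because both control elements see identical input, switching over in this sketch only requires changing `agent.active_name`; no state transfer between the control elements is needed.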
15 Claims
1. A method to provide fault tolerance, comprising:
identifying a first control element as a primary control element;
identifying a second control element as a secondary control element;
synchronizing information between said primary and secondary control elements, wherein said synchronizing comprises receiving said information from a forwarding element at both said primary and secondary control elements, wherein said information comprises data packets and forwarding element state information;
detecting a failure condition in said primary control element; and
identifying said secondary element as said primary control element in case of said failure condition;
wherein said detecting comprises communicating status messages between said forwarding element and said primary control element on a periodic basis, determining whether each status message is received within a predetermined interval, and identifying said failure condition in accordance with said determination.
Dependent claims: 2, 3, 4, 5
6. An apparatus, comprising:
a plurality of forwarding elements;
a primary control element;
a secondary control element;
a backplane connected to said forwarding elements, said primary control element and said secondary control element, to communicate information between said elements; and
a fault tolerant management (FTM) module to synchronize information between said primary control element and said secondary control element, and to identify said secondary control element as said primary control element if said primary control element fails;
wherein said synchronized information comprises data packets and forwarding element state information; and
wherein said FTM module further comprises a fault detection (FD) module to detect a failure condition for said primary control element, said detection comprising communicating status messages between said forwarding element and said primary control element on a periodic basis, determining whether each status message is received within a predetermined interval, and identifying said failure condition in accordance with said determination.
Dependent claims: 7, 8, 9
10. An article comprising:
a storage medium;
said storage medium including stored instructions that, when executed by a processor, result in providing fault tolerance by identifying a first control element as a primary control element, identifying a second control element as a secondary control element, synchronizing information between said primary and secondary control elements, wherein the stored instructions, when executed by a processor, further result in said synchronizing by receiving said information from a forwarding element at both said primary and secondary control elements, detecting a failure condition in said primary control element, and identifying said secondary element as said primary control element in case of said failure condition;
wherein said information comprises data packets and forwarding element state information; and
wherein the stored instructions, when executed by the processor, further result in said detecting by communicating status messages between said forwarding element and said primary control element on a periodic basis, determining whether each status message is received within a predetermined interval, and identifying said failure condition in accordance with said determination.
Dependent claims: 11, 12
13. A system, comprising:
a computing platform to provide fault tolerance;
said platform being further adapted to identify a first control element as a primary control element, identify a second control element as a secondary control element, synchronize information between said primary and secondary control elements, detect a failure condition in said primary control element, and identify said secondary element as said primary control element in case of said failure condition, wherein said platform is further adapted to perform said synchronizing by receiving said information from a forwarding element at both said primary and secondary control elements, said information comprising data packets and forwarding element state information;
wherein said platform is further adapted to detect said failure condition by communicating status messages between said forwarding element and said primary control element on a periodic basis, determining whether each status message is received within a predetermined interval, and identifying said failure condition in accordance with said determination.
Dependent claims: 14, 15
Specification