Recovering from failures without impact on data traffic in a shared bus architecture
First Claim
1. A method for error recovery in a network device, the method comprising:
- storing, in a memory associated with a host processor of the network device, a set of data structures used for facilitating communication between the host processor and a plurality of packet processors of the network device;
- forwarding, by a first packet processor from the plurality of packet processors, one or more packets received by the network device using forwarding information programmed by the host processor into a memory accessed by the first packet processor;
- detecting, by the host processor, an error condition indicative of a communication error between the host processor and a packet processor from the plurality of packet processors;
- in response to detection of the error condition:
  - identifying, by the host processor, from the plurality of packet processors, the first packet processor affected by the error condition; and
  - performing, by the host processor, a set of recovery actions for recovering from the error condition, the performing comprising:
    - disabling communication between the host processor and the first packet processor, and
    - restoring, by the host processor, to an initial state a data structure from the set of data structures used for communication between the host processor and the first packet processor; and
- while the set of actions is being performed, forwarding, by the first packet processor, at least one packet received by the network device using the forwarding information programmed into the memory accessed by the first packet processor prior to detecting the error condition.
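The steps of this claim can be illustrated with a minimal sketch. It is not the patented implementation: the class names (`MessageRing`, `PacketProcessor`, `HostProcessor`), the ring layout, and the split into `begin_recovery` and `finish_recovery` are all assumptions made for illustration. The claim itself only recites disabling communication and restoring a data structure to its initial state; re-enabling communication afterwards is an added assumption.

```python
from dataclasses import dataclass, field


class MessageRing:
    """Host-memory structure for host <-> packet-processor messages (hypothetical layout)."""

    def __init__(self, size=8):
        self.size = size
        self.reset()

    def reset(self):
        """Restore the ring to its initial, empty state."""
        self.head = 0
        self.tail = 0
        self.slots = [None] * self.size


@dataclass
class PacketProcessor:
    pp_id: int
    # Forwarding information programmed by the host into memory the processor reads.
    forwarding_table: dict = field(default_factory=dict)

    def forward(self, destination):
        # Forwarding consults only the local table, never the host.
        return self.forwarding_table.get(destination)


class HostProcessor:
    def __init__(self, num_processors):
        self.processors = [PacketProcessor(i) for i in range(num_processors)]
        self.rings = {pp.pp_id: MessageRing() for pp in self.processors}
        self.comm_enabled = {pp.pp_id: True for pp in self.processors}

    def program_route(self, pp_id, destination, port):
        if not self.comm_enabled[pp_id]:
            raise RuntimeError("communication with this packet processor is disabled")
        self.processors[pp_id].forwarding_table[destination] = port

    def begin_recovery(self, pp_id):
        # Recovery actions named in the claim: disable communication with the
        # affected processor, then restore its data structure to the initial state.
        self.comm_enabled[pp_id] = False
        self.rings[pp_id].reset()

    def finish_recovery(self, pp_id):
        # Assumption: communication is re-enabled once recovery completes.
        self.comm_enabled[pp_id] = True
```

In this sketch, calling `forward` on the affected processor between `begin_recovery` and `finish_recovery` still returns the port programmed before the error, because the forwarding table is never touched by the recovery actions.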
Abstract
Methods of detecting and recovering from communication failures within an operating network switching device that is switching packets in a communication network, and associated structures. The communication failures addressed involve communications between the packet processors and a host CPU over a shared communications bus, e.g., a PCI bus. The affected packet processor(s), which may be all or a subset of the packet processors of the network switch, may be recovered without affecting hardware packet forwarding through the affected packet processors. This maximizes the uptime of the network switching device. Other packet processor(s), if any, of the network switching device that are not affected by the communication failure may continue their normal operation, i.e., hardware forwarding that does not involve communications with the host CPU, as well as forwarding or other operations that do involve communications with the host CPU.
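The split between affected and unaffected packet processors can be sketched as follows. The abstract does not say how the host tells them apart, so the per-processor probe result used here is an assumption, as are the function names and the operation labels.

```python
def partition_processors(probe_ok):
    """Split processor ids into (affected, healthy).

    probe_ok maps a processor id to True if a hypothetical host-side
    communication probe over the shared bus succeeded for it.
    """
    affected = sorted(pp_id for pp_id, ok in probe_ok.items() if not ok)
    healthy = sorted(pp_id for pp_id, ok in probe_ok.items() if ok)
    return affected, healthy


def allowed_operations(pp_id, affected):
    """Return the operations a processor may perform while recovery is underway."""
    # Hardware forwarding continues on every processor, affected or not.
    operations = {"hardware_forwarding"}
    # Only unaffected processors keep exchanging messages with the host CPU.
    if pp_id not in affected:
        operations.add("host_communication")
    return operations
```

The affected set may be empty, a subset, or every processor in the device; in each case hardware forwarding stays available on all of them.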
18 Claims
1. A method for error recovery in a network device, the method comprising:
- storing, in a memory associated with a host processor of the network device, a set of data structures used for facilitating communication between the host processor and a plurality of packet processors of the network device;
- forwarding, by a first packet processor from the plurality of packet processors, one or more packets received by the network device using forwarding information programmed by the host processor into a memory accessed by the first packet processor;
- detecting, by the host processor, an error condition indicative of a communication error between the host processor and a packet processor from the plurality of packet processors;
- in response to detection of the error condition:
  - identifying, by the host processor, from the plurality of packet processors, the first packet processor affected by the error condition; and
  - performing, by the host processor, a set of recovery actions for recovering from the error condition, the performing comprising:
    - disabling communication between the host processor and the first packet processor, and
    - restoring, by the host processor, to an initial state a data structure from the set of data structures used for communication between the host processor and the first packet processor; and
- while the set of actions is being performed, forwarding, by the first packet processor, at least one packet received by the network device using the forwarding information programmed into the memory accessed by the first packet processor prior to detecting the error condition.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A network switching device comprising:
- a control processor with an associated memory; and
- a plurality of packet processors, including a first packet processor configured to forward one or more data packets received by the network switching device using forwarding information programmed by the control processor into a memory accessed by the first packet processor;
- said control processor configured to:
  - store, in the associated memory, a set of data structures used for facilitating communication between the control processor and the plurality of packet processors;
  - in response to detection of an error condition relating to communication between said control processor and one or more of said plurality of packet processors, identify the first packet processor affected by the error condition and perform a set of recovery actions for recovering from the error condition, the performing including:
    - disabling communication activity with the first packet processor, and
    - restoring to an initial state a data structure from the set of data structures used for facilitating communication between the control processor and the first packet processor;
- said first packet processor configured to:
  - while the set of actions is being performed by the control processor, forward at least one data packet received by the network switching device using the forwarding information programmed into the memory accessed by the first packet processor prior to detection of the error condition.

Dependent claims: 15, 16, 17, 18
Specification