Recovering from failures without impact on data traffic in a shared bus architecture
First Claim
1. A method in a network device, the method comprising:
storing, in a memory associated with a host processor of the network device, a set of data structures used for transferring data, on a shared bus, between the host processor and a plurality of packet processors of the network device, each packet processor in the plurality of packet processors configured to forward, from the network device, one or more packets received by the network device;
detecting, by the host processor, an error condition indicative of a communication error between the host processor and a first packet processor from the plurality of packet processors;
in response to detection of the error condition:
identifying, by the host processor, from the plurality of packet processors, the first packet processor affected by the error condition; and
performing, by the host processor, a set of recovery actions for recovering from the error condition, the set of recovery actions including disabling communication between the host processor and the first packet processor; and
while the set of recovery actions is being performed, communicating data on the shared bus between the host processor and at least one packet processor from the plurality of packet processors other than the first packet processor, and forwarding, by the first packet processor, at least one packet received by the network device using forwarding information programmed prior to the host processor detecting the error condition.
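The steps of the claim can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the class names `Host`, `PacketProcessor`, and `BusError`, the per-processor transmit queues, and the message strings are all hypothetical, and the shared bus is modeled as plain method calls. It shows the host identifying the affected packet processor from the error, disabling communication with that processor alone, and continuing to exchange data with the others while recovery is in progress.

```python
from dataclasses import dataclass, field
from enum import Enum


class LinkState(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"


class BusError(Exception):
    """Hypothetical error raised when a transfer on the shared bus fails."""

    def __init__(self, pp_id):
        super().__init__(f"bus error on packet processor {pp_id}")
        self.pp_id = pp_id


@dataclass
class PacketProcessor:
    """Hypothetical model of one packet processor attached to the shared bus."""
    pp_id: int
    link: LinkState = LinkState.ENABLED
    faulty: bool = False                      # simulated communication fault
    received: list = field(default_factory=list)


class Host:
    """Hypothetical host processor owning the transfer data structures."""

    def __init__(self, processors):
        self.processors = {pp.pp_id: pp for pp in processors}
        # Assumed form of the "set of data structures": one queue per processor.
        self.tx_queues = {pp.pp_id: [] for pp in processors}

    def send(self, pp_id, message):
        """Transfer one message; return False if communication is disabled."""
        pp = self.processors[pp_id]
        if pp.link is LinkState.DISABLED:
            return False
        if pp.faulty:
            raise BusError(pp_id)
        self.tx_queues[pp_id].append(message)
        pp.received.append(message)
        return True

    def begin_recovery(self, pp_id):
        """Disable communication with the affected processor only."""
        self.processors[pp_id].link = LinkState.DISABLED
        self.tx_queues[pp_id].clear()         # drop possibly inconsistent state

    def finish_recovery(self, pp_id):
        """Clear the simulated fault and re-enable communication."""
        pp = self.processors[pp_id]
        pp.faulty = False
        pp.link = LinkState.ENABLED


def run_isolation_demo():
    host = Host([PacketProcessor(0), PacketProcessor(1), PacketProcessor(2)])
    host.processors[1].faulty = True
    try:
        host.send(1, "update-route")
    except BusError as err:
        host.begin_recovery(err.pp_id)        # identify, then isolate
    # While recovery is in progress, the other processors stay reachable.
    during = [host.send(pp_id, "read-stats") for pp_id in (0, 1, 2)]
    host.finish_recovery(1)
    after = host.send(1, "read-stats")
    return during, after
```

In this model, `run_isolation_demo()` reports that processors 0 and 2 accept traffic during recovery while processor 1 is skipped, and that processor 1 becomes reachable again once recovery finishes.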
Abstract
Methods, and associated structures, for detecting and recovering from communication failures within an operating network switching device that is switching packets in a communication network. The communication failures addressed involve communications between the packet processors and a host CPU over a shared communications bus, e.g., a PCI bus. The affected packet processor(s), which may be all or a subset of the packet processors of the network switch, may be recovered without affecting hardware packet forwarding through the affected packet processors. This maximizes the uptime of the network switching device. Other packet processor(s), if any, of the network switching device that are not affected by the communication failure may continue their normal packet forwarding, i.e., hardware forwarding that does not involve communications with the host CPU, as well as forwarding or other operations that do involve communications with the host CPU.
22 Claims
13. A network device comprising:
a control processor with an associated memory; and
a plurality of packet processors, each packet processor in the plurality of packet processors configured to forward, from the network device, one or more packets received by the network device;
the control processor configured to:
store, in the associated memory, a set of data structures used for transferring data, on a shared bus, between the control processor and the plurality of packet processors;
in response to detection of an error condition relating to communication between the control processor and a first packet processor from the plurality of packet processors, identify, from the plurality of packet processors, the first packet processor affected by the error condition and perform a set of recovery actions for recovering from the error condition, the set of recovery actions including disabling communication with the first packet processor;
the first packet processor from the plurality of packet processors configured to:
while the set of recovery actions is being performed by the control processor, forward at least one packet received by the network device using forwarding information programmed prior to the host processor detecting the error condition; and
a second packet processor from the plurality of packet processors configured to:
while the set of recovery actions is being performed by the control processor, continue to communicate on the shared bus with the control processor.
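The division of labor in this claim, where forwarding continues from previously programmed state while the control channel is down, can be sketched as follows. This is an illustrative model only: the class `ForwardingEngine`, its `program` and `forward` methods, and the dictionary used as a forwarding table are assumptions made for the example and do not come from the specification.

```python
class ForwardingEngine:
    """Hypothetical packet processor with a locally held forwarding table."""

    def __init__(self, name):
        self.name = name
        self.table = {}               # destination -> egress port
        self.control_enabled = True   # state of the link to the control processor

    def program(self, destination, port):
        """Control-path write; it needs the shared bus, so it can be refused."""
        if not self.control_enabled:
            return False
        self.table[destination] = port
        return True

    def forward(self, destination):
        """Data-path lookup; it uses only local state, never the shared bus."""
        return self.table.get(destination)


def run_forwarding_demo():
    first = ForwardingEngine("pp-first")
    second = ForwardingEngine("pp-second")

    # Forwarding information programmed before any error is detected.
    first.program("10.0.0.1", 3)
    second.program("10.0.0.1", 5)

    # Recovery action: disable communication with the first processor only.
    first.control_enabled = False

    return {
        # The first processor still forwards using the old entry.
        "first_forwards": first.forward("10.0.0.1"),
        # New control writes to the first processor are refused meanwhile.
        "first_update": first.program("10.0.0.2", 4),
        # The second processor keeps communicating with the control processor.
        "second_update": second.program("10.0.0.2", 6),
        "second_forwards": second.forward("10.0.0.2"),
    }
```

The point of the sketch is that `forward` depends only on the table contents, so disabling the control path leaves existing forwarding behavior intact for the affected processor.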
Specification