Hybrid address mutex mechanism for memory accesses in a network processor
First Claim
1. A method of arbitrating access to a level one (L1) cache of a network processor implemented as an integrated circuit, the network processor having a plurality of processing modules and at least one shared memory, the processing modules coupled to at least one unidirectional ring bus, the method comprising:
defining, by the network processor, one or more virtual pipelines of the plurality of processing modules through the at least one unidirectional ring bus, each virtual pipeline defining a processing order of packet data received by the network processor through two or more of the plurality of processing modules, each virtual pipeline identified by a virtual pipeline identifier, wherein the defining accounts for non-sequential processing of the packet data;
sending, by a source one of the plurality of processing modules, a task over the at least one unidirectional ring bus to an adjacent processing module coupled to the ring bus, the task corresponding to the received data packet and having a corresponding virtual pipeline identifier, the task including shared parameters that point to a shared parameter table stored in a system memory, wherein the sending the task includes task enqueue, scheduling and dequeue operations based on the shared parameters;
iteratively:
determining, by the adjacent processing module, based on the virtual pipeline identifier, whether the processing module is a destination processing module of the virtual pipeline associated with the task and, if not, passing the task unchanged to a next adjacent processing module coupled to the ring bus, thereby passing the task from the source processing module to each corresponding destination processing module on the ring bus;
generating, by each destination processing module, one or more memory access requests, each memory access request comprising a requested address and an ID value corresponding to the requesting processing module, wherein each memory access request comprises one of a locked access request and one or more simple access requests;
determining, by an address mutually exclusive (mutex) arbiter of the network processor, whether one or more received memory access requests are simple access requests or locked access requests, wherein, for each locked access request, the address mutex arbiter performs the steps of:
determining, by the address mutex arbiter, whether two or more of the memory access requests are either conflicted or non-conflicted based on the requested address of each of the one or more memory access requests;
if one or more of the memory access requests are non-conflicted, determining, by the address mutex arbiter, for each non-conflicted memory access request, whether the requested address of each non-conflicted memory access request is locked out by one or more prior memory access requests based on a lock table of the address mutex arbiter;
if one or more of the non-conflicted memory access requests are locked-out by one or more prior memory access requests:
queuing, by the address mutex arbiter, one or more locked-out memory requests;
granting one or more non-conflicted memory access requests that are not locked-out;
updating the lock table corresponding to the requested addresses associated with the one or more granted memory access requests.
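The arbitration steps recited above can be modeled as a short sketch. This is an illustrative reading of the claim only, not the patented implementation: the class name `AddressMutexArbiter`, the `(requester_id, address, is_locked)` request encoding, and the set-based lock table are all assumptions made for the example.

```python
from collections import deque

class AddressMutexArbiter:
    """Illustrative model of the claimed address-mutex arbitration.

    Two locked requests to the same address in the same cycle are
    "conflicted"; an address already held in the lock table "locks out"
    later locked requests until it is released.
    """

    def __init__(self):
        self.lock_table = set()   # addresses currently locked by granted requests
        self.pending = deque()    # queued (conflicted or locked-out) requests

    def arbitrate(self, requests):
        """requests: iterable of (requester_id, address, is_locked) tuples.
        Returns the granted requests; the rest are queued for retry."""
        granted = []
        seen_addresses = set()
        for req_id, addr, is_locked in requests:
            if not is_locked:
                # Simple access requests bypass the lock check entirely.
                granted.append((req_id, addr))
                continue
            if addr in seen_addresses:
                # Conflicted: another locked request this cycle targets addr.
                self.pending.append((req_id, addr))
                continue
            seen_addresses.add(addr)
            if addr in self.lock_table:
                # Locked out by a prior granted request: queue it.
                self.pending.append((req_id, addr))
            else:
                # Grant, and update the lock table for the granted address.
                self.lock_table.add(addr)
                granted.append((req_id, addr))
        return granted

    def release(self, addr):
        """Release a lock so queued requests for addr can be retried."""
        self.lock_table.discard(addr)
```

In this sketch, two locked requests to address `0x10` arriving together result in one grant and one queued request, while a later locked request to `0x10` is queued until `release(0x10)` clears the lock-table entry.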
Abstract
Described embodiments provide arbitration for a cache of a network processor. Processing modules of the network processor generate memory access requests including a requested address and an ID value corresponding to the requesting processing module. Each request is either a locked request or a simple request. An arbiter determines whether the received requests are locked requests. For each locked request, the arbiter determines whether two or more of the requests are conflicted based on the requested address of each received memory request. If one or more of the requests are non-conflicted, the arbiter determines, for each non-conflicted request, whether the requested addresses are locked out by prior memory requests based on a lock table. If one or more of the non-conflicted memory requests are locked-out by prior memory requests, the arbiter queues the locked-out memory requests. The arbiter grants any non-conflicted memory access requests that are not locked-out.
71 Citations
20 Claims
1. (Recited in full above under "First Claim".) - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
13. A non-transitory machine-readable storage medium, having encoded thereon program code, wherein, when the program code is executed by a machine, the machine implements a method of arbitrating access to a level one (L1) cache of a network processor implemented as an integrated circuit, the network processor having a plurality of processing modules and at least one shared memory, the processing modules coupled to at least one unidirectional ring bus, the method comprising:
defining, by the network processor, one or more virtual pipelines of the plurality of processing modules, each virtual pipeline defining a processing order of packet data received by the network processor through two or more of the plurality of processing modules through the at least one unidirectional ring bus, each virtual pipeline identified by a virtual pipeline identifier, wherein the defining accounts for non-sequential processing of the packet data;
sending, by a source one of the plurality of processing modules, a task over the at least one unidirectional ring bus to an adjacent processing module coupled to the ring bus, the task corresponding to the received data packet and having a corresponding virtual pipeline identifier, the task including shared parameters that point to a shared parameter table stored in a system memory, wherein the sending the task includes task enqueue, scheduling and dequeue operations based on the shared parameters;
iteratively:
determining, by the adjacent processing module, based on the virtual pipeline identifier, whether the processing module is a destination processing module of the virtual pipeline associated with the task and, if not, passing the task unchanged to a next adjacent processing module coupled to the ring bus, thereby passing the task from the source processing module to each corresponding destination processing module on the ring bus;
generating, by each destination processing module, one or more memory access requests, each memory access request comprising a requested address and an ID value corresponding to the requesting processing module, wherein each memory access request comprises one of a locked access request and one or more simple access requests;
determining, by an address mutually exclusive (mutex) arbiter of the network processor, whether one or more received memory access requests are simple access requests or locked access requests, wherein, for each locked access request, the address mutex arbiter performs the steps of:
determining, by the address mutex arbiter, whether two or more of the memory access requests are either conflicted or non-conflicted based on the requested address of each of the one or more memory access requests;
if one or more of the memory access requests are non-conflicted, determining, by the address mutex arbiter, for each non-conflicted memory access request, whether the requested address of each non-conflicted memory access request is locked out by one or more prior memory access requests based on a lock table of the address mutex arbiter;
if one or more of the non-conflicted memory access requests are locked-out by one or more prior memory access requests:
queuing, by the address mutex arbiter, one or more locked-out memory requests;
granting one or more non-conflicted memory access requests that are not locked-out;
updating the lock table corresponding to the requested addresses associated with the one or more granted memory access requests.
- View Dependent Claims (14, 15, 16, 17, 18)
19. A network processor comprising:
a plurality of processing modules, the processing modules coupled to at least one unidirectional ring bus, at least one level one (L1) cache, and at least one shared memory, wherein a source one of the processing modules is configured to send a task over the at least one unidirectional ring bus to an adjacent processing module coupled to the ring bus, the task having a corresponding one or more destination processing modules, and each processing module is configured to, iteratively, check, by the adjacent processing module, whether the processing module is a destination processing module for the task and, if not, pass the task unchanged to a next adjacent processing module coupled to the ring bus, thereby passing the task from the source processing module to each corresponding destination processing module on the at least one unidirectional ring bus, thereby defining a virtual pipeline of the network processor, the virtual pipeline defining a processing order of a task through two or more of the processing modules through the at least one unidirectional ring bus, wherein the defining accounts for non-sequential processing of the packet data;
wherein the task includes shared parameters that point to a shared parameter table stored in a system memory, and task enqueue, scheduling and dequeue operations based on the shared parameters are executed while sending the task over the at least one unidirectional ring bus to an adjacent processing module coupled to the ring bus;
wherein each destination processing module is configured to generate one or more memory access requests, each memory access request comprising a requested address and an ID value corresponding to the requesting processing module, wherein each memory access request comprises one of a locked access request and one or more simple access requests; and
an address mutually exclusive (mutex) arbiter configured to arbitrate access to the L1 cache, wherein the address mutex arbiter is further configured to:
determine whether one or more received memory access requests are simple access requests or locked access requests, wherein, for each locked access request, the address mutex arbiter is configured to:
determine whether two or more of the memory access requests are either conflicted or non-conflicted based on the requested address of each of the one or more memory access requests;
if one or more of the memory access requests are non-conflicted:
determine, for each non-conflicted memory access request, whether the requested address of each non-conflicted memory access request is locked out by one or more prior memory access requests based on a lock table of the address mutex arbiter;
if one or more of the non-conflicted memory access requests are locked-out by one or more prior memory access requests:
queue the one or more locked-out memory requests;
grant one or more non-conflicted memory access requests that are not locked-out; and
update the lock table corresponding to the requested addresses associated with the one or more granted memory access requests,
wherein the network processor is implemented as an integrated circuit.
- View Dependent Claims (20)
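The hop-by-hop ring-bus delivery recited in claim 19 can be sketched as follows. This is a simplified model for illustration only: the module-ID list, the `destinations` set standing in for the virtual pipeline's destination modules, and the single-trip traversal are assumptions, not details taken from the patent.

```python
def pass_task_on_ring(modules, source_index, destinations, task):
    """Illustrative model of unidirectional ring-bus task delivery.

    `modules` is an ordered list of module IDs forming the ring;
    `destinations` is the set of module IDs targeted by the task's
    virtual pipeline. The task travels hop by hop in one direction:
    each module checks whether it is a destination; destinations
    consume a copy, and the task is otherwise passed on unchanged.
    """
    delivered = []
    n = len(modules)
    idx = (source_index + 1) % n       # first hop: the adjacent module
    for _ in range(n - 1):             # at most one full trip around the ring
        module = modules[idx]
        if module in destinations:
            delivered.append((module, task))
        idx = (idx + 1) % n            # pass unchanged to the next adjacent module
    return delivered
```

Because the ring is unidirectional, delivery order follows ring position rather than the order destinations are listed, which is one way a virtual pipeline can impose a processing order across modules.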
Specification