System and method for limited fanout daisy chaining of cache invalidation requests in a shared-memory multiprocessor system
Abstract
A protocol engine is provided for use in each node of a computer system having a plurality of nodes. Each node includes a memory cache and an interface to a local memory subsystem that stores memory lines of information and a directory. The directory includes an entry associated with a memory line of information stored in the local memory subsystem. The directory entry includes an identification field for identifying sharer nodes that potentially cache the memory line of information. The identification field has a plurality of bits at associated positions within the field, and each respective bit is associated with one or more nodes. The protocol engine sets each bit in the identification field for which at least one of the associated nodes caches the memory line. In response to a request for exclusive ownership of a memory line, the protocol engine sends an initial invalidation request to no more than a first predefined number of the nodes associated with set bits in the identification field of the directory entry associated with the memory line.
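The identification field described above can be illustrated with a minimal sketch. The field width, the node count, and the modulo mapping from node to bit position are assumptions made for illustration only; the text states only that the nodes associated with a bit are determined by the position of that bit.

```python
from typing import List

FIELD_BITS = 8   # assumed width of the identification field
NUM_NODES = 16   # assumed number of nodes in the system


def bit_for_node(node: int) -> int:
    """Position of the bit that covers `node` (assumed modulo mapping)."""
    return node % FIELD_BITS


def nodes_for_bit(bit: int) -> List[int]:
    """All nodes associated with the bit at position `bit`."""
    return [n for n in range(NUM_NODES) if bit_for_node(n) == bit]


def record_sharer(field: int, node: int) -> int:
    """Set the bit covering `node` once that node caches the memory line."""
    return field | (1 << bit_for_node(node))


def potential_sharers(field: int) -> List[int]:
    """Nodes that potentially cache the line: every node behind a set bit."""
    return [n for n in range(NUM_NODES) if field & (1 << bit_for_node(n))]
```

Because one bit may cover several nodes, the field identifies nodes that potentially cache the line: under this assumed mapping, recording node 3 as a sharer also marks node 11 as a candidate for invalidation.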
21 Claims
1. A multiprocessor computer system, comprising:
a plurality of nodes, each node including:
an interface to a local memory subsystem, the local memory subsystem storing a multiplicity of memory lines of information and a directory;
a memory cache for caching a multiplicity of memory lines of information, including memory lines of information stored in a remote memory subsystem that is local to another node;
the directory including an entry associated with a memory line of information stored in the local memory subsystem, the entry including an identification field for identifying a subset of nodes from the plurality of nodes caching the memory line of information;
the identification field configured to comprise a plurality of bits at associated positions within the identification field;
a protocol engine implementing a cache coherence protocol, said protocol engine configured to associate with each respective bit of the identification field one or more nodes of the plurality of nodes, including a respective first node, wherein the one or more nodes associated with each respective bit are determined by reference to the position of the respective bit within the identification field;
set each bit in the identification field of the directory entry associated with the memory line for which the memory line is cached in at least one of the associated nodes;
send an initial invalidation request to no more than a first predefined number of the nodes associated with set bits in the identification field of the directory entry associated with the memory line.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
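The limited-fanout step of claim 1 might be sketched as follows. The value of `MAX_FANOUT`, the function name, and the round-robin split of sharers into chains are assumptions of this sketch, not details taken from the claim; the claim requires only that the initial invalidation requests go to no more than a first predefined number of nodes.

```python
from typing import List

MAX_FANOUT = 4  # assumed value of the "first predefined number"


def plan_initial_invalidations(sharers: List[int],
                               max_fanout: int = MAX_FANOUT) -> List[List[int]]:
    """Split `sharers` into at most `max_fanout` non-empty chains.

    The first node of each chain receives an initial invalidation request.
    The remaining nodes of a chain are reached by node-to-node forwarding.
    The round-robin split is an assumption of this sketch.
    """
    chains = [sharers[i::max_fanout] for i in range(max_fanout)]
    return [chain for chain in chains if chain]
```

With six potential sharers and a fanout limit of four, the home node sends four initial requests instead of six, and two of the chains carry one further node each.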
19. A protocol engine implementing a cache coherence protocol, for use in a multiprocessor computer system, the protocol engine located at a particular node of a plurality of nodes in the multiprocessor computer system, the protocol engine comprising:
input logic for receiving a first invalidation request, the invalidation request identifying a memory line of information and including a pattern of bits for identifying a subset of the plurality of nodes that potentially store cached copies of the identified memory line; and
processing circuitry, responsive to receipt of the first invalidation request, for sending a second invalidation request corresponding to the first invalidation request to a next node if the plurality of bits in fact identify the next node;
sending an invalidation acknowledgment to a requesting node identified in the first invalidation message if the plurality of bits fail to identify a next node; and
invalidating a cached copy of the identified memory line, if any, in the particular node of the plurality of nodes in the multiprocessor computer system.
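The per-node behavior recited in claim 19 can be sketched as a small handler. The request layout, the one-bit-per-node encoding of the pattern, and the rule that the next node is the lowest-numbered node still identified are all assumptions made for illustration; the claim does not specify how the next node is derived from the pattern of bits.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class InvalidationRequest:
    line: int        # address of the memory line to invalidate
    pattern: int     # assumed encoding: bit n set means node n is still to be visited
    requester: int   # node that receives the invalidation acknowledgment


def next_node(pattern: int) -> Optional[int]:
    """Lowest-numbered node still identified by the pattern, or None."""
    if pattern == 0:
        return None
    return (pattern & -pattern).bit_length() - 1


def handle_invalidation(
    node: int, cache: Dict[int, bytes], req: InvalidationRequest
) -> Tuple[str, int, Optional[InvalidationRequest]]:
    """Process a first invalidation request at `node`.

    Returns ("forward", next_node, second_request) when the pattern still
    identifies another node, otherwise ("ack", requester, None).
    """
    cache.pop(req.line, None)                 # invalidate the local copy, if any
    remaining = req.pattern & ~(1 << node)    # this node has now been visited
    nxt = next_node(remaining)
    if nxt is not None:
        second = InvalidationRequest(req.line, remaining, req.requester)
        return ("forward", nxt, second)
    return ("ack", req.requester, None)
```

In this sketch only the last node of a chain sends an acknowledgment, so the requesting node receives one acknowledgment per chain rather than one per sharer.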
20. A protocol engine implementing a cache coherence protocol, for use in a multiprocessor computer system, the protocol engine located at a particular node of a plurality of nodes in the multiprocessor computer system, the protocol engine comprising:
input logic for receiving a first invalidation request, the invalidation request identifying a memory line of information and including a pattern of bits for identifying a subset of the plurality of nodes that potentially store cached copies of the identified memory line; and
processing circuitry, responsive to receipt of the first invalidation request, for determining a next node identified by the pattern of bits in the invalidation request and for sending to the next node, if any, a second invalidation request corresponding to the first invalidation request, and for invalidating a cached copy of the identified memory line, if any, in the particular node of the multiprocessor computer system.
Dependent claims: 21
Specification