System and method for limited fanout daisy chaining of cache invalidation requests in a shared-memory multiprocessor system
Abstract
A protocol engine is for use in each node of a computer system having a plurality of nodes. Each node includes an interface to a local memory subsystem that stores memory lines of information, a directory, and a memory cache. The directory includes an entry associated with a memory line of information stored in the local memory subsystem. The directory entry includes an identification field for identifying sharer nodes that potentially cache the memory line of information. The identification field has a plurality of bits at associated positions within the identification field. Each respective bit of the identification field is associated with one or more nodes. The protocol engine furthermore sets each bit in the identification field for which the memory line is cached in at least one of the associated nodes. In response to a request for exclusive ownership of a memory line, the protocol engine sends an initial invalidation request to no more than a first predefined number of the nodes associated with set bits in the identification field of the directory entry associated with the memory line.
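The directory bookkeeping described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the node count, field width, fanout limit, the contiguous bit-to-node mapping, and all function names are assumptions chosen for illustration. The sketch shows bits being set by position for sharer nodes and the initial invalidation fanout being capped at a predefined number by picking at most one recipient per group of bits.

```python
# Illustrative sizes only; none of these values come from the patent text.
NUM_NODES = 16                            # assumed number of nodes in the system
FIELD_BITS = 8                            # assumed width of the identification field
NODES_PER_BIT = NUM_NODES // FIELD_BITS   # each bit stands for 2 nodes here
MAX_FANOUT = 4                            # the "first predefined number" (assumed value)


def bit_for_node(node: int) -> int:
    """Position of the identification-field bit associated with a node."""
    return node // NODES_PER_BIT


def nodes_for_bit(bit: int) -> list[int]:
    """All nodes associated with a bit, determined only by the bit's position."""
    return list(range(bit * NODES_PER_BIT, (bit + 1) * NODES_PER_BIT))


def record_sharer(field: int, node: int) -> int:
    """Set the bit covering a node that caches the memory line."""
    return field | (1 << bit_for_node(node))


def initial_recipients(field: int) -> list[int]:
    """Pick at most one recipient per group of bits, so at most MAX_FANOUT total."""
    group_width = FIELD_BITS // MAX_FANOUT
    recipients = []
    for group in range(MAX_FANOUT):
        for bit in range(group * group_width, (group + 1) * group_width):
            if field & (1 << bit):
                # First node associated with the first set bit in this group.
                recipients.append(nodes_for_bit(bit)[0])
                break
    return recipients


field = 0
for sharer in (3, 5, 12):
    field = record_sharer(field, sharer)
print(bin(field), initial_recipients(field))
```

Because each bit covers two nodes in this sketch, the field only records that a node is a potential sharer: node 3 caching the line sets bit 1, and the initial request for that group goes to node 2, the first node associated with that bit.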
110 Citations
3 Claims
1. A multiprocessor computer system, comprising:
a plurality of nodes, each node including:
an interface to a local memory subsystem, the local memory subsystem storing a multiplicity of memory lines of information and a directory;
a memory cache for caching a multiplicity of memory lines of information, including memory lines of information stored in a remote memory subsystem that is local to another node;
the directory including an entry associated with a memory line of information stored in the local memory subsystem, the entry including an identification field for identifying a subset of nodes from the plurality of nodes caching the memory line of information;
the identification field configured to comprise a plurality of bits at associated positions within the identification field;
a protocol engine implementing a cache coherence protocol, said protocol engine configured to associate with each respective bit of the identification field one or more nodes of the plurality of nodes, including a respective first node, wherein the one or more nodes associated with each respective bit are determined by the reference to the position of the respective bit within the identification field;
set each bit in the identification field of the directory entry associated with the memory line for which the memory line is cached in at least one of the associated nodes;
send an initial invalidation request to no more than a first predefined number of the nodes associated with set bits in the identification field of the directory entry associated with the memory line;
the identification field is subdivided to form a number of groups of bits equal to the first predefined number;
the protocol engine is configured to send at most one invalidation request for each group of bits, wherein the at most one invalidation request for each group of bits is sent to a first node, if any, associated with a set bit in the group of bits; and
the protocol engine also is configured to include in the initial invalidation request sent to the first node associated with one of the groups of bits in the identification field a pattern of bits based on the one group of bits in the identification field, such that a recipient node of the initial invalidation request can derive from the pattern of bits a next recipient node, if any, to which to send a second invalidation request corresponding to the initial invalidation request.
3. A multiprocessor computer system, comprising:
a plurality of nodes, each node including:
an interface to a local memory subsystem, the local memory subsystem storing a multiplicity of memory lines of information and a directory;
a memory cache for caching a multiplicity of memory lines of information, including memory lines of information stored in a remote memory subsystem that is local to another node;
the directory including an entry associated with a memory line of information stored in the local memory subsystem, the entry including an identification field for identifying a subset of nodes from the plurality of nodes caching the memory line of information;
the identification field configured to comprise a plurality of bits at associated positions within the identification field;
a protocol engine implementing a cache coherence protocol, said protocol engine configured to associate with each respective bit of the identification field one or more nodes of the plurality of nodes, including a respective first node, wherein the one or more nodes associated with each respective bit are determined by the reference to the position of the respective bit within the identification field;
set each bit in the identification field of the directory entry associated with the memory line for which the memory line is cached in at least one of the associated nodes;
send an initial invalidation request to no more than a first predefined number of the nodes associated with set bits in the identification field of the directory entry associated with the memory line;
the identification field is subdivided to form a number of groups of bits equal to the first predefined number;
the protocol engine is configured to send at most one invalidation request for each group of bits, wherein the at most one invalidation request for each group of bits is sent to a first node, if any, associated with a set bit in the group of bits; and
the protocol engine also is configured to send a respective version of the initial invalidation request to the first node, if any, associated with each group of bits of the identification field, and to include in each respective version of the initial invalidation request a pattern of bits based on the respective group of bits in the identification field, such that each first node can derive a next recipient node, if any, from the pattern of bits in the respective version of the initial invalidation request received by the first node, wherein the next recipient node is to be sent a second invalidation request corresponding to the initial invalidation request.
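The daisy chaining recited in claims 1 and 3 can be simulated with a short Python sketch. It is a sketch under stated assumptions, not the claimed implementation: it assumes one node per bit, a 16-bit identification field split into 4 groups, and a pattern encoding in which the sender clears the recipient's own bit before sending. The names `InvalidationRequest`, `initial_requests`, `handle_request`, and `simulate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InvalidationRequest:
    line_addr: int    # memory line being invalidated
    group_base: int   # bit position (and, by assumption, node id) of the group's lowest bit
    pattern: int      # bits of the group whose nodes have not yet been sent a request


def lowest_set_bit(pattern: int) -> int:
    """Index of the least significant set bit; pattern must be nonzero."""
    return (pattern & -pattern).bit_length() - 1


def initial_requests(line_addr, field, field_bits, num_groups):
    """Home node: build at most one (recipient, request) pair per group of bits."""
    width = field_bits // num_groups
    mask = (1 << width) - 1
    requests = []
    for group in range(num_groups):
        base = group * width
        pattern = (field >> base) & mask
        if pattern == 0:
            continue  # no sharers in this group, so no request is sent
        first = lowest_set_bit(pattern)
        remaining = pattern & ~(1 << first)  # assumed encoding: drop the recipient's bit
        requests.append((base + first, InvalidationRequest(line_addr, base, remaining)))
    return requests


def handle_request(req):
    """Recipient node: derive the next recipient, if any, from the pattern of bits."""
    # Invalidating the local cached copy would happen here; omitted in this sketch.
    if req.pattern == 0:
        return None  # end of the chain for this group
    nxt = lowest_set_bit(req.pattern)
    forwarded = InvalidationRequest(req.line_addr, req.group_base, req.pattern & ~(1 << nxt))
    return req.group_base + nxt, forwarded


def simulate(line_addr, field, field_bits=16, num_groups=4):
    """Return, for each group with sharers, the ordered list of nodes visited."""
    chains = []
    for node, req in initial_requests(line_addr, field, field_bits, num_groups):
        chain = [node]
        hop = handle_request(req)
        while hop is not None:
            node, req = hop
            chain.append(node)
            hop = handle_request(req)
        chains.append(chain)
    return chains


# Sharers at nodes 0, 1, 3, 9, 14 and 15.
sharers = sum(1 << n for n in (0, 1, 3, 9, 14, 15))
print(simulate(0x1000, sharers))
```

In this simulation the home node sends three initial requests (to nodes 0, 9, and 14), and each remaining sharer is reached by a forwarded request within its own group, so the home node never sends more than one request per group.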
Specification