Dual clusters of fully connected integrated circuit multiprocessors with shared high-level cache
First Claim
1. A system comprising:
a drawer comprising a plurality of clusters, each of the plurality of clusters comprising a plurality of processors, each of the plurality of processors comprising a plurality of processing cores, each processing core of the plurality of processing cores having a private Level 1 cache and a private Level 2 cache per processing core, the plurality of processing cores within each of the plurality of clusters sharing a shared Level 3 cache; and
a single shared cache integrated circuit comprising a shared Level 4 cache memory and a directory to store a plurality of directory state bits, the single shared cache integrated circuit to manage the shared Level 4 cache memory among the plurality of clusters, wherein the single shared cache integrated circuit is configured to store computer readable instructions and execute the computer readable instructions for performing a method, the method comprising;
receiving, by the single shared cache integrated circuit, an operation of one of a plurality of operation types from one of the plurality of processors, and
processing, by the single shared cache integrated circuit, the operation based at least in part on the operation type of the operation according to a set of rules for processing the operation type, wherein the directory state bits are used to process the operation according to the set of rules for processing the operation type, wherein the operation type is an input/output store type operation, and wherein the set of rules for processing the input/output store type operation comprise;
responsive to receiving the input/output store type operation by a requester processor of the plurality of processors, managing, by a target memory chip in a first cluster of the plurality of clusters, an input/output store sequence to memory responsive to a cache line not existing in a global intervention master state of the directory state bits;
selectively initiating, by the single shared cache integrated circuit, a first command broadcast to the first cluster, the selectively initiating being based on a determination that a target cache line hits any of the Level 3 caches in the first cluster, the first command broadcast being a first invalidate command sent to the first cluster;
selectively initiating, by the single shared cache integrated circuit, a second command broadcast to a second cluster of the plurality of clusters, the selectively initiating being based on a determination that a target cache line hits any of the Level 3 caches in the second cluster, the second command broadcast being a second invalidate command sent to the second cluster; and
updating the directory.
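The input/output store rules recited above can be sketched roughly as follows. This is an illustrative model only, not the claimed hardware: the names `DirectoryEntry`, `l3_hit_clusters`, and `process_io_store` are hypothetical, the two clusters are numbered 0 and 1 for convenience, and the claim does not say what happens when the line is in the global intervention master state, so the sketch simply skips the memory step in that case.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DirectoryEntry:
    """Hypothetical per-line directory state bits (names are assumptions)."""
    global_intervention_master: bool = False
    # Clusters whose Level 3 caches currently hold the line.
    l3_hit_clusters: Set[int] = field(default_factory=set)


def process_io_store(directory: Dict[int, DirectoryEntry], line: int) -> List[str]:
    """Return the ordered actions for one I/O store to `line`."""
    entry = directory.setdefault(line, DirectoryEntry())
    actions: List[str] = []

    # Rule 1: the target memory chip in the first cluster manages the store
    # sequence when the line is not in the global intervention master state.
    if not entry.global_intervention_master:
        actions.append("cluster 0 memory chip: manage I/O store sequence to memory")

    # Rules 2 and 3: invalidate broadcasts are selective, sent only to a
    # cluster whose Level 3 caches hit on the target line.
    for cluster in (0, 1):
        if cluster in entry.l3_hit_clusters:
            actions.append(f"L4: broadcast invalidate to cluster {cluster}")

    # Rule 4: update the directory (here: no Level 3 copies remain).
    entry.l3_hit_clusters.clear()
    actions.append("L4: update directory")
    return actions
```

In this model a line cached only in the second cluster triggers a single invalidate broadcast, which is the point of making the broadcasts selective: clusters that do not hold the line see no traffic.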
Abstract
Embodiments of the present invention are directed to managing a shared high-level cache for dual clusters of fully connected integrated circuit multiprocessors. An example of a computer-implemented method includes: providing a drawer comprising a plurality of clusters, each of the plurality of clusters comprising a plurality of processors; providing a shared cache integrated circuit to manage a shared cache memory among the plurality of clusters; receiving, by the shared cache integrated circuit, an operation of one of a plurality of operation types from one of the plurality of processors; and processing, by the shared cache integrated circuit, the operation based at least in part on the operation type of the operation according to a set of rules for processing the operation type.
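The processing step in the abstract, where each operation is handled according to a rule set selected by its operation type, can be pictured as a dispatch table. The sketch below is a simplified software analogy under assumed names (`OpType`, `RULES`, `process`); the handlers are placeholders and do not reproduce the actual rule sets.

```python
from enum import Enum, auto
from typing import Callable, Dict


class OpType(Enum):
    """Two of the operation types named in the claims (names are assumptions)."""
    FETCH = auto()
    IO_STORE = auto()


def handle_fetch(op: dict) -> str:
    # Placeholder for the fetch rule set.
    return f"fetch rules applied to line {op['line']:#x}"


def handle_io_store(op: dict) -> str:
    # Placeholder for the input/output store rule set.
    return f"I/O store rules applied to line {op['line']:#x}"


# One rule set per operation type.
RULES: Dict[OpType, Callable[[dict], str]] = {
    OpType.FETCH: handle_fetch,
    OpType.IO_STORE: handle_io_store,
}


def process(op: dict) -> str:
    """Select the rule set from the operation type and apply it."""
    return RULES[op["type"]](op)
```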
28 Citations
8 Claims
1. A system comprising:
a drawer comprising a plurality of clusters, each of the plurality of clusters comprising a plurality of processors, each of the plurality of processors comprising a plurality of processing cores, each processing core of the plurality of processing cores having a private Level 1 cache and a private Level 2 cache per processing core, the plurality of processing cores within each of the plurality of clusters sharing a shared Level 3 cache; and
a single shared cache integrated circuit comprising a shared Level 4 cache memory and a directory to store a plurality of directory state bits, the single shared cache integrated circuit to manage the shared Level 4 cache memory among the plurality of clusters, wherein the single shared cache integrated circuit is configured to store computer readable instructions and execute the computer readable instructions for performing a method, the method comprising;
receiving, by the single shared cache integrated circuit, an operation of one of a plurality of operation types from one of the plurality of processors, and
processing, by the single shared cache integrated circuit, the operation based at least in part on the operation type of the operation according to a set of rules for processing the operation type, wherein the directory state bits are used to process the operation according to the set of rules for processing the operation type, wherein the operation type is an input/output store type operation, and wherein the set of rules for processing the input/output store type operation comprise;
responsive to receiving the input/output store type operation by a requester processor of the plurality of processors, managing, by a target memory chip in a first cluster of the plurality of clusters, an input/output store sequence to memory responsive to a cache line not existing in a global intervention master state of the directory state bits;
selectively initiating, by the single shared cache integrated circuit, a first command broadcast to the first cluster, the selectively initiating being based on a determination that a target cache line hits any of the Level 3 caches in the first cluster, the first command broadcast being a first invalidate command sent to the first cluster;
selectively initiating, by the single shared cache integrated circuit, a second command broadcast to a second cluster of the plurality of clusters, the selectively initiating being based on a determination that a target cache line hits any of the Level 3 caches in the second cluster, the second command broadcast being a second invalidate command sent to the second cluster; and
updating the directory.
Dependent claims: 2, 3, 4
5. A system comprising:
a drawer comprising a plurality of clusters, each of the plurality of clusters comprising a plurality of processors, each of the plurality of processors comprising a plurality of processing cores, each processing core of the plurality of processing cores having a private Level 1 cache and a private Level 2 cache per processing core, the plurality of processing cores within each of the plurality of clusters sharing a shared Level 3 cache; and
a single shared cache integrated circuit comprising a shared Level 4 cache memory and a directory to store a plurality of directory state bits, the single shared cache integrated circuit to manage the shared Level 4 cache memory among the plurality of clusters, wherein the single shared cache integrated circuit is configured to store computer readable instructions and execute the computer readable instructions for performing a method, the method comprising;
receiving, by the single shared cache integrated circuit, an operation of one of a plurality of operation types from one of the plurality of processors, and
processing, by the single shared cache integrated circuit, the operation based at least in part on the operation type of the operation according to a set of rules for processing the operation type, wherein the directory state bits are used to process the operation according to the set of rules for processing the operation type, wherein the operation type is a fetch type operation, and wherein the set of rules for processing the fetch type operation comprise;
responsive to receiving the fetch type operation by a requester processor of the plurality of processors, providing, by a first cluster of the plurality of clusters, data;
invalidating, by non-requester processors of the plurality of processors, their respective copies of the data responsive to the fetch type operation being a fetch exclusive command;
selectively initiating, by the single shared cache integrated circuit, a command broadcast to a second cluster of the plurality of clusters; and
updating the Level 4 cache memory and the directory.
Dependent claims: 6, 7, 8
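The fetch rules of claim 5 can be sketched in the same spirit. Everything named here (`LineState`, `holders`, `process_fetch`, clusters numbered 0 and 1) is hypothetical. Claim 5 does not state the condition under which the broadcast to the second cluster is initiated; the sketch assumes it is sent only for a fetch exclusive when a non-requester processor in the second cluster holds a copy. Actions are listed in the order the claim recites them, which need not match the order in real hardware.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Processor = Tuple[int, int]  # (cluster index, processor index), an assumed encoding


@dataclass
class LineState:
    """Hypothetical Level 4 record for one cache line."""
    data: bytes
    holders: Set[Processor] = field(default_factory=set)


def process_fetch(l4: Dict[int, LineState], line: int,
                  requester: Processor, exclusive: bool) -> List[str]:
    """Return the ordered actions for one fetch of `line` by `requester`."""
    state = l4[line]
    # The first cluster (cluster 0 here) provides the data.
    actions = [f"cluster 0 provides data for line {line:#x}"]

    if exclusive:
        # Non-requester processors invalidate their copies on fetch exclusive.
        others = sorted(h for h in state.holders if h != requester)
        for cluster, proc in others:
            actions.append(f"processor {proc} in cluster {cluster} invalidates its copy")
        # Assumed selection rule: broadcast only if cluster 1 held a copy.
        if any(cluster == 1 for cluster, _ in others):
            actions.append("L4: command broadcast to cluster 1")
        state.holders = {requester}
    else:
        state.holders.add(requester)

    actions.append("L4: update cache memory and directory")
    return actions
```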
Specification