System and method for event monitoring in cache coherence protocols without explicit invalidations
Abstract
Synchronization events associated with cache coherence are monitored without using invalidations. A callback-read is issued to a memory address associated with the synchronization event, which callback-read either reads the last value written in the memory address or blocks until a next write takes place in the memory address and reads a newly written value.
14 Claims
1. A computer system comprising:
multiple processor cores;
at least one local cache memory associated with and operatively coupled to each core for storing one or more cache lines accessible only by the associated core;
a shared memory, the shared memory being operatively coupled to the local cache memories and accessible by the cores, the shared memory being capable of storing a plurality of cache lines; and
a callback directory containing a set of callback (CB) bits associated with a memory address, wherein each CB bit in the set corresponds to a core;
wherein a core issuing a callback-read to the memory address either reads the last value written in the memory address, or is blocked from reading from the memory address until the next write takes place in the memory address and then reads a new value of said next write, such that the callback-read enables event monitoring for coherence of the at least one local cache and the shared memory without using explicit invalidations,
when a CB bit corresponding to the core that issued the callback-read is set, the callback-read is completed by the core reading the last value in the memory address;
when the CB bit corresponding to the core that issued the callback-read is unset, the callback-read triggers setting of the CB bit and the callback-read is completed when the new value is written in the memory address and the new value is forwarded to the core that issued the callback-read; and
when the new value is written in the memory address, the new value is forwarded to all of the cores that have their corresponding CB bit set for the memory address, CB bits previously set for the memory address are cleared, and CB bits previously unset for the memory address are set.
Dependent claims: 2, 3, 4, 5, 12, 13
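The three "when" clauses of claim 1 can be read as a small state machine over the CB bits. The Python sketch below is one hypothetical model of that reading, not the patented implementation. The class name `CallbackEntry`, returning `None` to stand for a blocked read, the all-unset initial state, and clearing a set bit once its read completes are all assumptions that the claim text does not state.

```python
from typing import Dict, List, Optional


class CallbackEntry:
    """Hypothetical callback-directory entry for one memory address."""

    def __init__(self, num_cores: int, value: int = 0) -> None:
        self.value = value
        # Assumption: every CB bit starts unset.
        self.cb: List[bool] = [False] * num_cores

    def callback_read(self, core: int) -> Optional[int]:
        """Return the last written value, or None if the core must wait."""
        if self.cb[core]:
            # CB bit set: the read completes with the last value written.
            # Assumption: the bit is cleared afterwards (not in the claim).
            self.cb[core] = False
            return self.value
        # CB bit unset: set it; the read completes at the next write.
        self.cb[core] = True
        return None

    def write(self, new_value: int) -> Dict[int, int]:
        """Store new_value and return {core: value} for forwarded copies."""
        self.value = new_value
        # Forward to every core whose CB bit is currently set.
        forwarded = {c: new_value for c, bit in enumerate(self.cb) if bit}
        # Previously set bits are cleared, previously unset bits are set.
        self.cb = [not bit for bit in self.cb]
        return forwarded


entry = CallbackEntry(num_cores=2)
blocked = entry.callback_read(0)   # None: core 0 waits for the next write
forwarded = entry.write(7)         # {0: 7}: the write wakes core 0
ready = entry.callback_read(1)     # 7: core 1 had not read this value yet
```

The sketch keeps a single bit per core, exactly as the clauses are worded, and does not try to tell a blocked core apart from one that simply has not read the latest value yet.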
6. A computer system comprising:
multiple processor cores;
at least one local cache memory associated with and operatively coupled to each core for storing one or more cache lines accessible only by the associated core;
a shared memory, the shared memory being operatively coupled to the local cache memories and accessible by the cores, the shared memory being capable of storing a plurality of cache lines; and
a callback directory containing a set of callback (CB) bits and full/empty (F/E) bits associated with a memory address, wherein each CB bit and each F/E bit of the set corresponds to a different one of the cores;
wherein a core issuing a callback-read to the memory address either reads the last value written in the memory address, or is blocked from reading from the memory address until the next write takes place in the memory address and then reads a new value of said next write, such that the callback-read enables event monitoring for coherence of the at least one local cache and the shared memory without using explicit invalidations
when an F/E bit corresponding to the core that issued the callback-read is set, the callback-read from the core to the memory address with the set of CB and F/E bits is completed by the core reading the last value in the memory address;
when the F/E bit corresponding to the core that issued the callback-read is unset, the callback-read triggers setting of a CB bit which corresponds to the memory address, and the call-back read is completed when the new value is written in the memory address and the new value is forwarded to the core that issued the callback-read; and
when the new value is written in the memory address, the new value is forwarded to all of the cores that have a corresponding CB bit set for the memory address, corresponding CB bits previously set are cleared, and F/E bits that correspond to the CB bits previously unset are set.
Dependent claims: 7, 8, 9, 10, 11, 14
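Claim 6 adds a full/empty (F/E) bit per core alongside each CB bit. A minimal Python sketch of one possible reading follows. It is illustrative only: the class name `CallbackEntryFE`, the use of `None` for a blocked read, the all-unset initial state, and clearing the F/E bit when a read consumes the value are assumptions, not statements from the claim.

```python
from typing import Dict, List, Optional


class CallbackEntryFE:
    """Hypothetical directory entry with per-core CB and F/E bits."""

    def __init__(self, num_cores: int, value: int = 0) -> None:
        self.value = value
        # Assumption: all CB and F/E bits start unset.
        self.cb: List[bool] = [False] * num_cores
        self.fe: List[bool] = [False] * num_cores

    def callback_read(self, core: int) -> Optional[int]:
        """Return the last written value, or None if the core must wait."""
        if self.fe[core]:
            # F/E bit set: the read completes with the last value written.
            # Assumption: reading marks the value as consumed for this core.
            self.fe[core] = False
            return self.value
        # F/E bit unset: set the core's CB bit and wait for the next write.
        self.cb[core] = True
        return None

    def write(self, new_value: int) -> Dict[int, int]:
        """Store new_value and return {core: value} for forwarded copies."""
        self.value = new_value
        forwarded: Dict[int, int] = {}
        for core, waiting in enumerate(self.cb):
            if waiting:
                # CB bit set: forward the value and clear the CB bit.
                forwarded[core] = new_value
                self.cb[core] = False
            else:
                # CB bit unset: set the F/E bit so a later read completes.
                self.fe[core] = True
        return forwarded


entry_fe = CallbackEntryFE(num_cores=3)
first = entry_fe.callback_read(0)   # None: core 0 waits
woken = entry_fe.write(5)           # {0: 5}: only core 0 was waiting
later = entry_fe.callback_read(1)   # 5: F/E bit for core 1 was set by the write
```

In this reading the CB bit only records that a core is waiting, while the F/E bit only records that a core has an unread value, so a second write before a core reads leaves that core's F/E bit set.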
Specification