Lateral cache-to-cache cast-in
First Claim
1. A method of data processing in a data processing system including a plurality of processing units including a first processing unit and a second processing unit coupled by an interconnect fabric, wherein the first processing unit has a first processor core and associated first upper and first lower level caches and the second processing unit has a second processor core and associated second upper and lower level caches, said method comprising:
- installing each of a plurality of cache lines in a congruence class of the first lower level cache and recording, in the first lower level cache, an access chronology for the plurality of cache lines and a respective membership of each of the plurality of cache lines in one of a plurality of classes including at least a first class and a second class, wherein recording a respective membership includes recording membership in the first class for one or more cache lines of the congruence class installed in response to a first access by the first processing unit that missed in the first lower level cache and recording a respective membership in the second class for one or more cache lines of the congruence class for which a subsequent second access by the first processing unit resulted in a hit in the first lower level cache;
- thereafter, in response to a data request, selecting a victim cache line to be castout from the congruence class in the first lower level cache, wherein the selecting includes selecting the victim cache line from among one or more cache lines in the second class while excluding one or more cache lines in the first class from the selection;
- the first processing unit issuing a lateral castout (LCO) command on the interconnect fabric, wherein the LCO command identifies the victim cache line to be castout from the first lower level cache and indicates that a lower level cache is an intended destination of the victim cache line; and
- in response to a coherence response to the LCO command indicating success of the LCO command, removing the victim cache line from the first lower level cache and holding the victim cache line in the second lower level cache.
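The class-membership bookkeeping and victim selection recited in the claim can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `LineClass`, `CongruenceClass`, `access`, and `select_lco_victim` are hypothetical, the access chronology is modeled as a simple least-to-most-recently-used list, and picking the least recently used second-class line is an assumption (the claim only requires that the victim come from the second class and that first-class lines be excluded).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class LineClass(Enum):
    FIRST = 1   # installed in response to an access that missed
    SECOND = 2  # a subsequent access hit this line


@dataclass
class CongruenceClass:
    """Hypothetical model of one congruence class and its directory state."""
    ways: int
    # Tags ordered from least to most recently used (the access chronology).
    chronology: List[int] = field(default_factory=list)
    membership: Dict[int, LineClass] = field(default_factory=dict)

    def access(self, tag: int) -> bool:
        """Record an access. Returns True on a hit, False on a miss."""
        if tag in self.membership:
            # Hit: move to most recently used and record second-class membership.
            self.chronology.remove(tag)
            self.chronology.append(tag)
            self.membership[tag] = LineClass.SECOND
            return True
        if len(self.chronology) >= self.ways:
            raise RuntimeError("congruence class full; cast out a victim first")
        # Miss: install the line and record first-class membership.
        self.chronology.append(tag)
        self.membership[tag] = LineClass.FIRST
        return False

    def select_lco_victim(self) -> Optional[int]:
        """Pick the oldest second-class line; first-class lines are excluded."""
        for tag in self.chronology:
            if self.membership[tag] is LineClass.SECOND:
                return tag
        return None  # no eligible line in the second class

    def remove(self, tag: int) -> None:
        """Drop a line, e.g. after a successful castout."""
        self.chronology.remove(tag)
        del self.membership[tag]
```

For example, after misses on tags 0xA, 0xB, 0xC followed by hits on 0xB and then 0xA, the sketch selects 0xB: 0xC is older but still belongs to the first class, so it is excluded from the selection.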
Abstract
A data processing system includes a first processing unit and a second processing unit coupled by an interconnect fabric. The first processing unit has a first processor core and associated first upper and first lower level caches, and the second processing unit has a second processor core and associated second upper and lower level caches. In response to a data request, a victim cache line is selected for castout from the first lower level cache. The first processing unit issues on the interconnect fabric a lateral castout (LCO) command that identifies the victim cache line to be castout from the first lower level cache and indicates that a lower level cache is an intended destination. In response to a coherence response indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache.
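The castout handshake summarized in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented design: `LCOCommand`, `LowerLevelCache`, `Fabric`, and `lateral_castout` are hypothetical names, the coherence response is reduced to success or retry, the destination declines only when it is full, and the line's data travels with the command rather than in a separate transfer.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Response(Enum):
    SUCCESS = "success"
    RETRY = "retry"


@dataclass(frozen=True)
class LCOCommand:
    source_id: int  # processing unit issuing the castout
    target_id: int  # lower level cache named as the intended destination
    tag: int        # identifies the victim cache line


class LowerLevelCache:
    """Hypothetical lower level cache holding tag -> data."""

    def __init__(self, unit_id: int, capacity: int) -> None:
        self.unit_id = unit_id
        self.capacity = capacity
        self.lines: Dict[int, bytes] = {}

    def snoop_lco(self, cmd: LCOCommand, data: bytes) -> Response:
        # Assumption: the destination declines only when it has no room.
        if len(self.lines) >= self.capacity:
            return Response.RETRY
        self.lines[cmd.tag] = data
        return Response.SUCCESS


class Fabric:
    """Toy interconnect that routes an LCO command to its destination."""

    def __init__(self) -> None:
        self.caches: Dict[int, LowerLevelCache] = {}

    def attach(self, cache: LowerLevelCache) -> None:
        self.caches[cache.unit_id] = cache

    def issue(self, cmd: LCOCommand, data: bytes) -> Response:
        target = self.caches.get(cmd.target_id)
        if target is None:
            return Response.RETRY
        return target.snoop_lco(cmd, data)


def lateral_castout(fabric: Fabric, source: LowerLevelCache,
                    target_id: int, tag: int) -> bool:
    """Issue an LCO; remove the victim from the source only on success."""
    cmd = LCOCommand(source.unit_id, target_id, tag)
    response = fabric.issue(cmd, source.lines[tag])
    if response is Response.SUCCESS:
        del source.lines[tag]  # the line is now held by the destination
        return True
    return False  # the victim stays in the source cache
```

In this sketch the source cache gives up the line only after the response reports success, so a declined castout leaves exactly one copy in place. What the source does after an unsuccessful LCO is outside what the abstract states and is not modeled here.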
23 Claims
1. A method of data processing (independent claim; full text appears under First Claim above). Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A data processing system, comprising:
- an interconnect fabric; and
- a plurality of processing units coupled to the interconnect fabric, the plurality of processing units including a first processing unit and a second processing unit, wherein the first processing unit has a first processor core and associated first upper and first lower level caches, and wherein the second processing unit has a second processor core and associated second upper and lower level caches;
- wherein the first lower level cache includes:
- the data array including a congruence class, wherein the congruence class includes a plurality of cache lines;
- a directory in the first lower level cache that holds an access chronology for the plurality of cache lines in the congruence class and that additionally holds a respective membership of each of the plurality of cache lines in one of a plurality of classes, wherein the plurality of classes includes at least a first class and a second class;
- a cache controller that records a respective membership for each of the plurality of cache lines in one of the plurality of classes, wherein the cache controller records membership in the first class for one or more cache lines of the congruence class installed in the congruence class in response to a first access by the first processing unit that missed in the first lower level cache and recording a respective membership in the second class for one or more cache lines of the congruence class for which a subsequent second access by the first processing unit resulted in a hit in the first lower level cache;
- wherein the first processing unit, in response to a data request, selects a victim cache line to be castout from the congruence class from among one or more cache lines in the second class while excluding one or more cache lines in the first class from the selection and issues a lateral castout (LCO) command on the interconnect fabric, the LCO command identifying the victim cache line to be castout from the first lower level cache and indicating that a lower level cache is an intended destination of the victim cache line; and
- wherein responsive to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from the first lower level cache and the second lower level cache holds the victim cache line.

Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. A processing unit for a data processing system including a plurality of processing units coupled by an interconnect fabric, the processing unit comprising:
- the processing unit has a first processor core and associated first upper and first lower level caches;
- wherein the first lower level cache includes:
- the data array including a congruence class, wherein the congruence class includes a plurality of cache lines;
- a directory in the first lower level cache that records an access chronology for the plurality of cache lines in the congruence class and that records a respective membership of each of the plurality of cache lines in one of a plurality of classes, wherein the plurality of classes includes at least a first class and a second class;
- a cache controller that records a respective membership for each of the plurality of cache lines in one of the plurality of classes, wherein the cache controller records membership in the first class for one or more cache lines of the congruence class installed in the congruence class in response to a first access by the processing unit that missed in the first lower level cache and recording a respective membership in the second class for one or more cache lines of the congruence class for which a subsequent second access by the processing unit resulted in a hit in the first lower level cache;
- wherein the processing unit, in response to a data request, selects a victim cache line to be castout from the congruence class from among one or more cache lines in the second class while excluding one or more cache lines in the first class from the selection and issues a lateral castout (LCO) command on the interconnect fabric, the LCO command identifying the victim cache line to be castout from the first lower level cache and indicating that a lower level cache is an intended destination of the victim cache line; and
- wherein responsive to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from the first lower level cache for storage in a second lower level cache.

Dependent claims: 18, 19, 20, 21, 22, 23.
Specification