Empirically Based Dynamic Control of Transmission of Victim Cache Lateral Castouts
Abstract
In response to a data request, a victim cache line is selected for castout from a lower level cache, and a target lower level cache of one of the plurality of processing units is selected. A determination is made whether the selected target lower level cache has provided more than a threshold number of retry responses to lateral castout (LCO) commands of the first lower level cache, and if so, a different target lower level cache is selected. The first processing unit thereafter issues a LCO command on the interconnect fabric. The LCO command identifies the victim cache line to be castout and indicates that the target lower level cache is an intended destination of the victim cache line. In response to a successful coherence response to the LCO command, the victim cache line is removed from the first lower level cache and held in the second lower level cache.
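The target-selection step described above can be sketched in software as follows. This is a minimal illustrative model, not the patented hardware implementation: the class `LcoTargetSelector`, the cache names, the `issue_lco` callback, and the rule for picking the "different" target (the candidate with the fewest recorded retries) are all assumptions introduced for the example. The source only states that a different target is selected once the retry count exceeds a threshold.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class LcoTargetSelector:
    """Hypothetical model of per-target retry tracking for lateral castouts."""
    candidates: List[str]                 # names of candidate target caches (assumed)
    retry_threshold: int                  # retries tolerated before switching target
    retry_counts: Dict[str, int] = field(default_factory=dict)

    def record_retry(self, target: str) -> None:
        """Count one retry response received from `target`."""
        self.retry_counts[target] = self.retry_counts.get(target, 0) + 1

    def select_target(self, preferred: str) -> str:
        """Keep `preferred` unless it has given MORE than the threshold of retries."""
        if self.retry_counts.get(preferred, 0) <= self.retry_threshold:
            return preferred
        others = [c for c in self.candidates if c != preferred]
        if not others:
            return preferred              # nothing else to choose from
        # Assumption: pick the alternative with the fewest recorded retries.
        return min(others, key=lambda c: self.retry_counts.get(c, 0))


def lateral_castout(selector: LcoTargetSelector, preferred: str,
                    issue_lco: Callable[[str], str]) -> Optional[str]:
    """Issue one LCO; return the target that accepted the line, or None on retry.

    `issue_lco` is a hypothetical callback standing in for the interconnect
    fabric; it returns "success" or "retry".
    """
    target = selector.select_target(preferred)
    if issue_lco(target) == "retry":
        selector.record_retry(target)
        return None                       # victim line stays in the source cache
    return target                         # victim line now held by `target`
```

In this sketch the comparison is strict, so a target that has returned exactly `retry_threshold` retries is still used, matching the "more than a threshold number" wording.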
44 Claims
1. A method of data processing in a data processing system including a plurality of processing units including a first processing unit and a second processing unit coupled by an interconnect fabric, wherein the first processing unit has a first processor core and associated first upper and first lower level caches and the second processing unit has a second processor core and associated second upper and lower level caches, said method comprising:
in response to a data request, selecting a victim cache line to be castout from the first lower level cache;
selecting a target lower level cache of one of the plurality of processing units;
determining whether the selected target lower level cache has provided more than a threshold number of retry responses to lateral castout (LCO) commands of the first lower level cache, and if so, selecting a different target lower level cache;
the first processing unit thereafter issuing a LCO command on the interconnect fabric, wherein the LCO command identifies the victim cache line to be castout from the first lower level cache and indicates that the target lower level cache is an intended destination of the victim cache line; and
in response to a coherence response to the LCO command indicating success of the LCO command, removing the victim cache line from the first lower level cache and holding the victim cache line in the second lower level cache.
Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 14, 15, 16

9. A data processing system, comprising:
an interconnect fabric;
a plurality of processing units coupled to the interconnect fabric, the plurality of processing units including a first processing unit and a second processing unit, wherein the first processing unit has a first processor core and associated first upper and first lower level caches, and wherein the second processing unit has a second processor core and associated second upper and lower level caches; and
a home system memory coupled to the interconnect fabric, wherein the home system memory is assigned a plurality of addresses including an address;
wherein the first processing unit, in response to a data request, selects a victim cache line associated with the address to be castout from the first lower level cache, selects a target lower level cache of one of the plurality of processing units, determines whether the selected target lower level cache has provided more than a threshold number of retry responses to lateral castout (LCO) commands of the first lower level cache, and if so, selects a different target lower level cache, and thereafter issues a LCO command on the interconnect fabric, the LCO command identifying the victim cache line to be castout from the first lower level cache and indicating that the target lower level cache is an intended destination of the victim cache line; and
wherein responsive to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from the first lower level cache and the second lower level cache holds the victim cache line.
Dependent Claims: 10, 11, 12, 13

17. A processing unit for a data processing system including a home system memory and a plurality of processing units coupled by an interconnect fabric, wherein the home system memory is assigned a plurality of addresses including an address, the processing unit comprising:
the processing unit has a first processor core and associated first upper and first lower level caches;
wherein the processing unit, in response to a data request, selects a victim cache line associated with the address to be castout from the first lower level cache, selects a target lower level cache of one of the plurality of processing units, determines whether the selected target lower level cache has provided more than a threshold number of retry responses to lateral castout (LCO) commands of the first lower level cache, and if so, selects a different target lower level cache, and thereafter issues a lateral castout (LCO) command on the interconnect fabric, the LCO command identifying the victim cache line to be castout from the first lower level cache and indicating that the target lower level cache is an intended destination of the victim cache line; and
wherein responsive to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from the first lower level cache for storage in a second lower level cache.
Dependent Claims: 18, 19, 20, 21, 22, 23

24. A method of data processing in a data processing system including a plurality of processing units and a system memory coupled by an interconnect fabric, wherein each of the plurality of processing units includes a processor core and associated upper and lower level caches, said method comprising:
in response to a data request of the processor core of a first processing unit among the plurality of processing units, selecting a victim cache line to be removed from the lower level cache of the first processing unit;
selecting among a lateral castout (LCO) from the lower level cache of the first processing unit to another lower level cache and a castout (CO) from the lower level cache of the first processing unit to the system memory, wherein the selecting comprises selecting based upon empirical success of the upper level cache of the first processing unit in obtaining requested data by intervention from cache memories of other of the plurality of processing units rather than the system memory;
in response to selecting an LCO, performing an LCO of the victim cache line to a lower level cache of one of the plurality of processing units by issuing a LCO command on the interconnect fabric; and
in response to selecting a CO, performing a CO of the victim cache line to the system memory.
Dependent Claims: 25, 26, 27, 28, 29, 30
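
The choice between an LCO and a CO recited in claim 24 might be modeled roughly as below. This is only a sketch under stated assumptions: the class `CastoutPolicy`, the ratio metric, the 0.5 default threshold, the direction of the decision (frequent intervention favors LCO), and the fallback to CO when no history exists are all hypothetical. The claim itself says only that the selection is based on the empirical success of obtaining requested data by intervention from other caches rather than from system memory.

```python
from dataclasses import dataclass


@dataclass
class CastoutPolicy:
    """Hypothetical tracker of where requested data was sourced from."""
    min_intervention_ratio: float = 0.5   # assumed tuning parameter
    from_caches: int = 0                  # fills satisfied by cache-to-cache intervention
    from_memory: int = 0                  # fills satisfied by system memory

    def record_fill(self, by_intervention: bool) -> None:
        """Record the source of one completed data request."""
        if by_intervention:
            self.from_caches += 1
        else:
            self.from_memory += 1

    def choose(self) -> str:
        """Return "LCO" or "CO" for the next victim cache line."""
        total = self.from_caches + self.from_memory
        if total == 0:
            return "CO"                   # assumption: no evidence yet, write to memory
        ratio = self.from_caches / total
        # Assumption: lateral castouts are favored when intervention has
        # frequently supplied data, and memory castouts otherwise.
        return "LCO" if ratio >= self.min_intervention_ratio else "CO"


policy = CastoutPolicy()
for hit in (True, True, True, False):     # 3 of 4 fills came from other caches
    policy.record_fill(hit)
decision = policy.choose()                # ratio 0.75 meets the 0.5 threshold
```

A hardware realization would more plausibly use saturating counters over a sliding window; the unbounded counters here simply keep the example short.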
31. A data processing system, comprising:
a plurality of processing units and a system memory coupled by an interconnect fabric, wherein each of the plurality of processing units includes a processor core and associated upper and lower level caches, said plurality of processing units including a first processing unit;
wherein the lower level cache of the first processing unit, responsive to a data request of the processor core of a first processing unit, selects a victim cache line to be removed from the lower level cache of the first processing unit and selects among a lateral castout (LCO) to another lower level cache and a castout (CO) to the system memory, wherein the lower level cache of the first processing unit selects either the LCO or CO based upon empirical success of the upper level cache of the first processing unit in obtaining requested data by intervention from cache memories of other of the plurality of processing units rather than the system memory; and
wherein the lower level cache of the first processing unit, responsive to selecting an LCO, performs an LCO of the victim cache line to a lower level cache of one of the plurality of processing units by issuing a LCO command on the interconnect fabric and, responsive to selecting a CO, performs a CO of the victim cache line to the system memory.
Dependent Claims: 32, 33, 34, 35, 36, 37

38. A processing unit for a data processing system having a plurality of processing units and a system memory coupled by an interconnect fabric, wherein each of the plurality of processing units includes a processor core and associated upper and lower level caches, said processing unit comprising:
a processor core;
an upper level cache coupled to the processor core; and
a lower level cache coupled to the upper level cache;
wherein the lower level cache, responsive to a data request of the processor core, selects a victim cache line to be removed from the lower level cache and selects among a lateral castout (LCO) to another lower level cache and a castout (CO) to the system memory, wherein the lower level cache selects either the LCO or CO based upon empirical success of the upper level cache in obtaining requested data by intervention from cache memories of other of the plurality of processing units rather than the system memory; and
wherein the lower level cache, responsive to selecting an LCO, performs an LCO of the victim cache line to a lower level cache of one of the plurality of processing units by issuing a LCO command on the interconnect fabric and, responsive to selecting a CO, performs a CO of the victim cache line to the system memory.
Dependent Claims: 39, 40, 41, 42, 43, 44

Specification