Scheduling Workloads Based On Cache Asymmetry
Abstract
In one embodiment, a processor includes a first cache and a second cache, a first core associated with the first cache and a second core associated with the second cache. The caches are of asymmetric sizes, and a scheduler can intelligently schedule threads to the cores based at least in part on awareness of this asymmetry and resulting cache performance information obtained during a training phase of at least one of the threads.
27 Claims
1. An apparatus comprising:
a processor including a first cache and a second cache, a first core associated with the first cache and a second core associated with the second cache, the first and second caches asymmetric caches of a common cache level,
wherein a scheduler is to schedule a plurality of threads to the first and second cores based at least in part on cache performance information obtained during a training phase of the plurality of threads on the asymmetric caches.
(Dependent claims: 2-9)
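As an illustration of the scheduling policy recited in claim 1, the sketch below ranks threads by how much they benefit from the larger of two asymmetric caches, using per-thread miss rates gathered during a training phase. All names (`Thread`, `schedule`, the misses-per-kilo-instruction fields) are illustrative assumptions, not from the patent.

```python
# Hedged sketch: asymmetry-aware thread-to-core scheduling.
from dataclasses import dataclass

@dataclass
class Thread:
    tid: int
    # Misses per kilo-instruction observed (or predicted) during the
    # training phase on each of the two asymmetric caches.
    mpki_large: float
    mpki_small: float

def schedule(threads):
    """Map each thread to the large- or small-cache core so that threads
    gaining the most from the larger cache are placed there first."""
    # Benefit = miss reduction when moving to the larger cache.
    ranked = sorted(threads, key=lambda t: t.mpki_small - t.mpki_large,
                    reverse=True)
    mapping = {}
    for i, t in enumerate(ranked):
        # Top half of beneficiaries go to the large-cache core,
        # the rest to the small-cache core.
        mapping[t.tid] = "core_large" if i < len(ranked) // 2 else "core_small"
    return mapping
```

With two threads, the thread whose miss rate drops most on the larger cache is placed on the core behind that cache.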
10. A method comprising:
initiating a thread on a first core of a processor;
measuring cache performance of the thread on a first cache used by the thread and predicting cache performance of the thread on a second cache unused by the thread, the first and second caches of asymmetric size, and storing cache performance information regarding the cache performance in a storage; and
determining a thread-to-core mapping based on the cache performance information.
(Dependent claims: 11-15)
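The method of claim 10 (measure on the cache in use, predict for the unused cache, store the results, then map) can be sketched as follows. The stack-distance predictor is an assumption for illustration; it is one well-known way (Mattson's LRU stack algorithm) to estimate misses for a cache size a thread has not run on, and is not claimed to be the patent's mechanism.

```python
# Hedged sketch: train, predict for the unused asymmetric cache, store, map.

def predict_misses(stack_distances, cache_lines):
    """Estimate misses for a fully-associative LRU cache of `cache_lines`
    lines from a reuse (stack) distance histogram: a reference misses iff
    its reuse distance is at least the cache size."""
    return sum(count for dist, count in stack_distances.items()
               if dist >= cache_lines)

def train_and_map(stack_distances, small_lines, large_lines):
    measured = predict_misses(stack_distances, small_lines)   # cache in use
    predicted = predict_misses(stack_distances, large_lines)  # unused cache
    storage = {"small": measured, "large": predicted}  # cache performance info
    # Map the thread to the large-cache core only if the larger cache
    # is predicted to remove misses.
    core = "core_large" if predicted < measured else "core_small"
    return storage, core
```

The stored record plays the role of the claimed "cache performance information in a storage"; the final comparison is the thread-to-core mapping decision.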
16. A system comprising:
a processor including first, second, third and fourth cores, the first and second cores coupled to a first shared cache, the third and fourth cores coupled to a second shared cache, the first and second shared caches asymmetric caches of a common cache level,
a first predictor to predict cache performance of a first thread on the second shared cache during a training phase of the first thread on the first core and to determine cache performance of the first thread on the first shared cache during the training phase of the first thread, and
a second predictor to predict cache performance of a second thread on the first shared cache during a training phase of the second thread on the third core and to determine cache performance of the second thread on the second shared cache during the training phase of the second thread; and
a dynamic random access memory (DRAM) coupled to the processor.
(Dependent claims: 17-22)
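In the system of claim 16, each predictor yields two numbers per thread: measured performance on its own shared cache and predicted performance on the other. A scheduler can combine the four values to decide whether exchanging the two threads between the asymmetric shared caches would help. The decision rule below is an illustrative assumption, not text from the patent.

```python
# Hedged sketch: combining the two predictors' outputs into a swap decision.

def should_swap(t1_on_c1, t1_on_c2_pred, t2_on_c2, t2_on_c1_pred):
    """Return True if exchanging the two threads between the asymmetric
    shared caches is predicted to lower their combined miss count.

    t1_on_c1      -- measured misses of thread 1 on its current (first) cache
    t1_on_c2_pred -- predicted misses of thread 1 on the second cache
    t2_on_c2      -- measured misses of thread 2 on its current (second) cache
    t2_on_c1_pred -- predicted misses of thread 2 on the first cache
    """
    current = t1_on_c1 + t2_on_c2
    swapped = t1_on_c2_pred + t2_on_c1_pred
    return swapped < current
```

Because each predictor runs during the training phase on a different core pair, both threads can be evaluated concurrently before any migration occurs.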
23. An apparatus comprising:
first, second, third and fourth cores;
a first shared cache coupled to the first and second cores;
a second shared cache coupled to the third and fourth cores, the first and second shared caches asymmetric caches of a common cache level;
a first predictor to predict cache performance of a first thread on the second shared cache during a training phase of the first thread on the first core; and
a second predictor to predict cache performance of a second thread on the first shared cache during a training phase of the second thread on the third core.
(Dependent claims: 24-27)
Specification