Method and apparatus for performing prefetching at the critical section level
First Claim
1. A method for compiling source code into executable code that performs prefetching for memory operations within critical sections of code that are subject to mutual exclusion, comprising:
- compiling a source code module containing programming language instructions into an executable code module containing instructions suitable for execution by a processor;
identifying a critical section within the executable code module by identifying a region of code between a mutual exclusion lock operation and a mutual exclusion unlock operation wherein identifying the critical section of code involves;
using a first macro to perform the mutual exclusion lock operation, wherein the first macro additionally activates prefetching; and
, using a second macro to perform the mutual exclusion unlock operation, wherein the second macro additionally deactivates prefetching, and wherein the second macro does not deactivate prefetching if the mutual exclusion unlock operation is nested within another critical section bounded by an additional mutual exclusion lock operation and an additional mutual exclusion unlock operation; and
scheduling explicit prefetch instructions into the executable code module in advance of associated memory operations located within the critical section, so that prefetch operations are performed for memory operations within the critical section.
2 Assignments
0 Petitions
Accused Products
Abstract
One embodiment of the present invention provides a system for compiling source code into executable code that performs prefetching for memory operations within critical sections of code that are subject to mutual exclusion. The system operates by compiling a source code module containing programming language instructions into an executable code module containing instructions suitable for execution by a processor. Next, the system identifies a critical section within the executable code module by identifying a region of code between a mutual exclusion lock operation and a mutual exclusion unlock operation. The system schedules explicit prefetch instructions into the critical section in advance of associated memory operations. In one embodiment, the system identifies the critical section of code by using a first macro to perform the mutual exclusion lock operation, wherein the first macro additionally activates prefetching. The system also uses a second macro to perform the mutual exclusion unlock operation, wherein the second macro additionally deactivates prefetching.
32 Citations
17 Claims
-
1. A method for compiling source code into executable code that performs prefetching for memory operations within critical sections of code that are subject to mutual exclusion, comprising:
-
compiling a source code module containing programming language instructions into an executable code module containing instructions suitable for execution by a processor;
identifying a critical section within the executable code module by identifying a region of code between a mutual exclusion lock operation and a mutual exclusion unlock operation wherein identifying the critical section of code involves;
using a first macro to perform the mutual exclusion lock operation, wherein the first macro additionally activates prefetching; and
,using a second macro to perform the mutual exclusion unlock operation, wherein the second macro additionally deactivates prefetching, and wherein the second macro does not deactivate prefetching if the mutual exclusion unlock operation is nested within another critical section bounded by an additional mutual exclusion lock operation and an additional mutual exclusion unlock operation; and
scheduling explicit prefetch instructions into the executable code module in advance of associated memory operations located within the critical section, so that prefetch operations are performed for memory operations within the critical section. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
checking for an unmatched second macro that deactivates prefetching and is not preceded by a matching first macro that activates prefetching; and
if an unmatched second macro is encountered, signaling an error condition.
-
-
3. The method of claim 1, wherein the mutual exclusion lock operation is implemented using one of, a spin lock, a semaphore, a read-writer lock, a turnstile, a mutex lock and an adaptive mutex lock.
-
4. The method of claim 1, further comprising:
-
identifying functions containing memory operations that tend to generate a large number of cache misses by, running the executable code module on the processor in a training mode on a representative workload, keeping statistics on cache miss rates for memory operations within functions within the executable code module, and identifying a set of functions that generate the large number of cache misses; and
scheduling explicit prefetch instructions into the executable code module in advance of associated memory operations within the identified set of functions, so that prefetch operations are performed for memory operations within the set of functions that generate the large number of cache misses.
-
-
5. The method of claim 1, wherein scheduling explicit prefetch instructions into the executable code module further comprises:
-
identifying a subset of memory operations of a particular type within the critical section; and
scheduling explicit prefetch operations for memory operations belonging to the subset.
-
-
6. The method of claim 5, wherein the particular type of memory operation includes, but is not limited to, one of,
memory operations through pointers; -
memory operations involving static data;
memory operations from locations that have not been previously accessed;
memory operations outside a system stack; and
memory operations that are likely to be executed.
-
-
7. The method of claim 1, wherein scheduling explicit prefetch instructions into the executable code module further comprises:
-
identifying a subset of prefetch operations with a particular property that are associated with memory operations within the critical section; and
scheduling explicit prefetch operations for prefetch operations belonging to the subset based on properties of the subset.
-
-
8. The method of claim 7, wherein the particular property of the subset of prefetch operations includes, but is not limited to, one of,
existence of an available issue slot for the prefetch operation; -
being located on the same side of a function call site from an associated memory operation;
being located on an opposite side of a function call site from an associated memory operation; and
being associated with a cache block that is not already subject to a scheduled prefetch operation.
-
-
9. A computer readable storage medium storing instructions that when executed by a computer cause the computer to perform a method for compiling source code into executable code that performs prefetching for memory operations within critical sections of code that are subject to mutual exclusion, comprising:
-
compiling a source code module containing programming language instructions into an executable code module containing instructions suitable for execution by a processor;
identifying a critical section within the executable code module by identifying a region of code between a mutual exclusion lock operation and a mutual exclusion unlock operation wherein identifying the critical section of code involves;
using a first macro to perform the mutual exclusion lock operation, wherein the first macro additionally activates prefetching; and
,using a second macro to perform the mutual exclusion unlock operation, wherein the second macro additionally deactivates prefetching, and wherein the second macro does not deactivate prefetching if the mutual exclusion unlock operation is nested within another critical section bounded by an additional mutual exclusion lock operation and an additional mutual exclusion unlock operation; and
scheduling explicit prefetch instructions into the executable code module in advance of associated memory operations located within the critical section, so that prefetch operations are performed for memory operations within the critical section.
-
-
10. An apparatus that compiles source code into executable code that performs prefetching for memory operations within critical sections of code that are subject to mutual exclusion, comprising:
-
a compiling mechanism that compiles a source code module containing programming language instructions into an executable code module containing instructions suitable for execution by a processor;
an identification mechanism that identifies a critical section within the executable code module by identifying a region of code between a mutual exclusion lock operation and a mutual exclusion unlock operation, wherein the identification mechanism is further configured to;
use a first macro to perform the mutual exclusion lock operation, wherein the first macro additionally activates prefetching; and
,use a second macro to perform the mutual exclusion unlock operation, wherein the second macro additionally deactivates prefetching, and wherein the second macro does not deactivate prefetching if the mutual exclusion unlock operation is nested within another critical section bounded by an additional mutual exclusion lock operation and an additional mutual exclusion unlock operation; and
an scheduling mechanism that schedules explicit prefetch instructions into the executable code module in advance of associated memory operations located within the critical section, so that prefetch operations are performed for memory operations within the critical section. - View Dependent Claims (11, 12, 13, 14, 15, 16, 17)
check for an unmatched second macro that deactivates prefetching and is not preceded by a matching first macro that activates prefetching; and
tosignal an error condition if an unmatched second macro is encountered.
-
-
12. The apparatus of claim 10, wherein the mutual exclusion lock operation is implemented using one of, a spin lock, a semaphore, a read-writer lock, a turnstile, a mutex lock and an adaptive mutex lock.
-
13. The apparatus of claim 10, wherein the identification mechanism is further configured to:
-
identify functions containing memory operations that tend to generate a large number of cache misses by, running the executable code module on the processor in a training mode on a representative workload, keeping statistics on cache miss rates for memory operations within functions within the executable code module, and identifying a set of functions that generate the large number of cache misses; and
toschedule explicit prefetch instructions into the executable code module in advance of associated memory operations within the identified set of functions, so that prefetch operations are performed for memory operations within the set of functions that generate the large number of cache misses.
-
-
14. The apparatus of claim 10, wherein the scheduling mechanism is further configured to:
-
identify a subset of memory operations of a particular type within the identified set of functions; and
toschedule explicit prefetch operations for memory operations belonging to the subset.
-
-
15. The apparatus of claim 14, wherein the particular type of memory operation includes, but is not limited to, one of,
memory operations through pointers; -
memory operations involving static data;
memory operations from locations that have not been previously accessed;
memory operations outside a system stack; and
memory operations that are likely to be executed.
-
-
16. The apparatus of claim 10, wherein the scheduling mechanism is further configured to:
-
identify a subset of prefetch operations of with a particular property that are associated with memory operations within the identified set of functions; and
toschedule explicit prefetch operations for prefetch operations belonging to the subset based on properties of the subset.
-
-
17. The apparatus of claim 16, wherein the particular property of the subset of prefetch operations includes, but is not limited to, one of,
existence of an available issue slot for the prefetch operation; -
being located on the same side of a function call site from an associated memory operation;
being located on an opposite side of a function call site from an associated memory operation; and
being associated with a cache block that is not already subject to a scheduled prefetch operation.
-
Specification