Prefetching irregular data references for software controlled caches
First Claim
1. A method, in a data processing system, for prefetching irregular memory references into a software controlled cache, the method comprising:
receiving source code that is to be compiled;
analyzing the source code to identify at least one of a plurality of loops that contain an irregular memory reference;
determining if the irregular memory reference within the at least one of the plurality of loops is a candidate for optimization;
responsive to an indication that the irregular memory reference may be optimized, determining if the irregular memory reference is valid for prefetching;
responsive to an indication that the irregular memory reference is valid for prefetching, inserting a store statement for an address of the irregular memory reference into the at least one of the plurality of loops;
inserting a runtime library call into a prefetch runtime library for the irregular memory reference, wherein data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked; and
wherein determining if the irregular memory reference within the at least one of the plurality of loops is the candidate for optimization comprises:
determining if a computed address accessed by the irregular memory reference is an affine function of a loop index variable;
responsive to the address failing to be the affine function of the loop index variable, determining if there is a loop-carried dependency between statements used in the computed address; and
responsive to failure to identify any loop-carried dependencies, indicating the irregular memory reference as the candidate for optimization.
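The candidate test recited in the last three steps can be sketched in a few lines. The following is only an illustration under stated assumptions, not the patented implementation: the tiny expression classes (`Const`, `Var`, `Add`, `Mul`, `Load`, `Assign`) and the helper names are hypothetical, the loop body is modeled as a flat list of scalar assignments, scalar temporaries are not substituted into the address, and symbolic loop-invariant terms are not modeled.

```python
from dataclasses import dataclass

# Hypothetical mini-AST, for illustration only.
@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

@dataclass(frozen=True)
class Load:            # array[index]: a memory read used inside an address
    array: str
    index: object

@dataclass(frozen=True)
class Assign:          # scalar = expr, one statement of the loop body
    target: str
    expr: object


def affine_coeffs(expr, index_var):
    """Return (a, b) if expr equals a*index_var + b with integer a, b; else None."""
    if isinstance(expr, Const):
        return (0, expr.value)
    if isinstance(expr, Var):
        return (1, 0) if expr.name == index_var else None
    if isinstance(expr, (Add, Mul)):
        l = affine_coeffs(expr.left, index_var)
        r = affine_coeffs(expr.right, index_var)
        if l is None or r is None:
            return None
        if isinstance(expr, Add):
            return (l[0] + r[0], l[1] + r[1])
        if l[0] == 0:                      # constant * affine
            return (l[1] * r[0], l[1] * r[1])
        if r[0] == 0:                      # affine * constant
            return (l[0] * r[1], l[1] * r[1])
        return None                        # index * index is not affine
    return None                            # Load (indirection) is not affine


def variables_read(expr):
    """Collect the scalar names an expression reads."""
    if isinstance(expr, Var):
        return {expr.name}
    if isinstance(expr, (Add, Mul)):
        return variables_read(expr.left) | variables_read(expr.right)
    if isinstance(expr, Load):
        return variables_read(expr.index)
    return set()


def has_loop_carried_dependency(body, address, index_var):
    """True if a statement feeding the address reads a value from a prior iteration."""
    needed = variables_read(address) - {index_var}
    slice_stmts = []
    for stmt in reversed(body):            # backward slice of the address
        if stmt.target in needed:
            slice_stmts.append(stmt)
            needed |= variables_read(stmt.expr) - {index_var}
    slice_stmts.reverse()
    assigned = {s.target for s in slice_stmts}
    defined = set()
    for stmt in slice_stmts:
        reads = variables_read(stmt.expr) - {index_var}
        # A scalar read before its definition in this iteration carries
        # its value over from the previous iteration.
        if any(v in assigned and v not in defined for v in reads):
            return True
        defined.add(stmt.target)
    return False


def is_candidate(body, address, index_var):
    """Non-affine address with no loop-carried dependency -> candidate."""
    if affine_coeffs(address, index_var) is not None:
        return False                       # regular (affine) reference
    return not has_loop_carried_dependency(body, address, index_var)
```

Under this model, an indirect reference such as `a[b[i]]` (address `Load("b", Var("i"))`) is a candidate, a strided reference such as `a[2*i + 1]` is not because its address is affine, and pointer chasing (`p = next[p]` feeding the address) is rejected because the address depends on the previous iteration.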
Abstract
Prefetching irregular memory references into a software controlled cache is provided. A compiler analyzes source code to identify at least one of a plurality of loops that contain an irregular memory reference. The compiler determines if the irregular memory reference within the at least one loop is a candidate for optimization. Responsive to an indication that the irregular memory reference may be optimized, the compiler determines if the irregular memory reference is valid for prefetching. Responsive to an indication that the irregular memory reference is valid for prefetching, a store statement for an address of the irregular memory reference is inserted into the at least one loop. A runtime library call is inserted into a prefetch runtime library for the irregular memory reference. Data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked.
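To make the effect of the inserted store statement and runtime library call concrete, here is a small simulation. It is a sketch under assumptions, not the patent's runtime: the `SoftwareCache` class, the `LINE_SIZE` value, the `sum_indirect` function, and the choice to collect all addresses in a separate pass before the compute loop are hypothetical; the abstract does not say how the transformed loop is arranged.

```python
LINE_SIZE = 4  # elements per cache line (assumed value)


class SoftwareCache:
    """Toy software controlled cache in front of a list standing in for global memory."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}          # line number -> copy of that line's data
        self.demand_misses = 0   # misses taken on the compute path

    def _fetch_line(self, line):
        start = line * LINE_SIZE
        self.lines[line] = self.memory[start:start + LINE_SIZE]

    def prefetch(self, addresses):
        """Stand-in for the runtime library call: bring lines in ahead of use."""
        for addr in addresses:
            line = addr // LINE_SIZE
            if line not in self.lines:
                self._fetch_line(line)

    def read(self, addr):
        line = addr // LINE_SIZE
        if line not in self.lines:
            self.demand_misses += 1
            self._fetch_line(line)
        return self.lines[line][addr % LINE_SIZE]


def sum_indirect(memory, idx, use_prefetch):
    """Sum memory[idx[i]] over i, optionally prefetching the irregular addresses."""
    cache = SoftwareCache(memory)
    if use_prefetch:
        addr_buffer = []
        for i in range(len(idx)):
            addr_buffer.append(idx[i])   # inserted store of the address
        cache.prefetch(addr_buffer)      # inserted runtime library call
    total = 0
    for i in range(len(idx)):
        total += cache.read(idx[i])      # irregular reference memory[idx[i]]
    return total, cache.demand_misses
```

With `memory = list(range(100))` and `idx = [37, 2, 90, 38, 5]`, the loop without prefetching takes four demand misses (addresses 37 and 38 share a line), while the prefetching variant takes none and returns the same sum.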
45 Citations
17 Claims
1. A method, in a data processing system, for prefetching irregular memory references into a software controlled cache, the method comprising:
receiving source code that is to be compiled;
analyzing the source code to identify at least one of a plurality of loops that contain an irregular memory reference;
determining if the irregular memory reference within the at least one of the plurality of loops is a candidate for optimization;
responsive to an indication that the irregular memory reference may be optimized, determining if the irregular memory reference is valid for prefetching;
responsive to an indication that the irregular memory reference is valid for prefetching, inserting a store statement for an address of the irregular memory reference into the at least one of the plurality of loops;
inserting a runtime library call into a prefetch runtime library for the irregular memory reference, wherein data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked; and
wherein determining if the irregular memory reference within the at least one of the plurality of loops is the candidate for optimization comprises:
determining if a computed address accessed by the irregular memory reference is an affine function of a loop index variable;
responsive to the address failing to be the affine function of the loop index variable, determining if there is a loop-carried dependency between statements used in the computed address; and
responsive to failure to identify any loop-carried dependencies, indicating the irregular memory reference as the candidate for optimization.
Dependent claims: 2, 3, 4, 5, 6
7. A computer program product comprising a non-transitory computer readable medium storing a computer readable program recorded thereon, wherein the computer readable program, when executed on a computing device, causes the computing device to:
receive source code that is to be compiled;
analyze the source code to identify at least one of a plurality of loops that contain an irregular memory reference;
determine if the irregular memory reference within the at least one of the plurality of loops is a candidate for optimization;
responsive to an indication that the irregular memory reference may be optimized, determine if the irregular memory reference is valid for prefetching;
responsive to an indication that the irregular memory reference is valid for prefetching, insert a store statement for an address of the irregular memory reference into the at least one of the plurality of loops;
insert a runtime library call into a prefetch runtime library for the irregular memory reference, wherein data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked; and
wherein the computer readable program to determine if the irregular memory reference within the at least one of the plurality of loops is the candidate for optimization, wherein the computer readable program further causes the computing device to:
determine if a computed address accessed by the irregular memory reference is an affine function of a loop index variable;
responsive to the address failing to be the affine function of the loop index variable, determine if there is a loop-carried dependency between statements used in the computed address; and
responsive to failure to identify any loop-carried dependencies, indicate the irregular memory reference as the candidate for optimization.
Dependent claims: 8, 9, 10, 11, 12
13. An apparatus, comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
receive source code that is to be compiled;
analyze the source code to identify at least one of a plurality of loops that contain an irregular memory reference;
determine if the irregular memory reference within the at least one of the plurality of loops is a candidate for optimization;
responsive to an indication that the irregular memory reference may be optimized, determine if the irregular memory reference is valid for prefetching;
responsive to an indication that the irregular memory reference is valid for prefetching, insert a store statement for an address of the irregular memory reference into the at least one of the plurality of loops;
insert a runtime library call into a prefetch runtime library for the irregular memory reference, wherein data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked; and
wherein the instructions to determine if the irregular memory reference within the at least one of the plurality of loops is the candidate for optimization further cause the processor to:
determine if a computed address accessed by the irregular memory reference is an affine function of a loop index variable;
responsive to the address failing to be the affine function of the loop index variable, determine if there is a loop-carried dependency between statements used in the computed address; and
responsive to failure to identify any loop-carried dependencies, indicate the irregular memory reference as the candidate for optimization.
Dependent claims: 14, 15, 16, 17
Specification