Run-Time parallelization of loops in computer programs using bit vectors
First Claim
Patent Images
1. A method for executing, by a processor of a computer system, a set of program instructions for a loop, wherein the method comprises:
- associating a unique proxy value with each indirect loop index variable of the loop, wherein each unique proxy value is a proxy value bit vector, each proxy value bit vector representing a different prime number;
calculating, for each iteration of the loop, an indirectly indexed access pattern based upon the unique values;
determining whether cross-iteration dependencies exist between any two iterations of the loop based upon the indirectly indexed access patterns of the two iterations;
scheduling the program instructions of the loop across iterations into waves based on the cross-iteration dependencies found; and
executing the waves.
0 Assignments
0 Petitions
Accused Products
Abstract
Parallelization of loops is performed for loops having indirect loop index variables and embedded conditional statements in the loop body. Loops having any finite number of array variables in the loop body, and any finite number of indirect loop index variables can be parallelized. There are two particular limitations of the described techniques: (i) that there are no cross-iteration dependencies in the loop other than through the indirect loop index variables; and (ii) that the loop index variables (either direct or indirect) are not redefined in the loop body.
77 Citations
18 Claims
-
1. A method for executing, by a processor of a computer system, a set of program instructions for a loop, wherein the method comprises:
-
associating a unique proxy value with each indirect loop index variable of the loop, wherein each unique proxy value is a proxy value bit vector, each proxy value bit vector representing a different prime number; calculating, for each iteration of the loop, an indirectly indexed access pattern based upon the unique values; determining whether cross-iteration dependencies exist between any two iterations of the loop based upon the indirectly indexed access patterns of the two iterations; scheduling the program instructions of the loop across iterations into waves based on the cross-iteration dependencies found; and executing the waves. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A computer program product for executing, by a processor of a computer system, a set of program instructions for a loop, the computer program product comprising computer software stored on a tangible, computer-readable storage medium for performing:
-
associating a unique proxy value with each indirect loop index variable of the loop, wherein each unique proxy value is a proxy value bit vector, each proxy value bit vector representing a different prime number; calculating, for each iteration of the loop, an indirectly indexed access pattern based upon the unique values; determining whether cross-iteration dependencies exist between any two iterations of the loop based upon the indirectly indexed access patterns of the two iterations; scheduling the program instructions of the loop across iterations into waves based on the cross-iteration dependencies found; and executing the waves. - View Dependent Claims (8, 9, 10, 11, 12)
-
-
13. A computer system having program instructions stored on a computer-readable medium for executing, by a processor of the computer system, a set of the program instructions for a loop, wherein the executing comprises performing:
-
associating a unique proxy value with each indirect loop index variable of the loop, wherein each unique proxy value is a proxy value bit vector, each proxy value bit vector representing a different prime number; calculating, for each iteration of the loop, an indirectly indexed access pattern based upon the unique values; determining whether cross-iteration dependencies exist between any two iterations of the loop based upon the indirectly indexed access patterns of the two iterations; scheduling the program instructions of the loop across iterations into waves based on the cross-iteration dependencies found; and executing the waves. - View Dependent Claims (14, 15, 16, 17, 18)
-
Specification