Method of, system for, and computer program product for performing weighted loop fusion by an optimizing compiler
First Claim
1. A method of performing weighted loop fusion on a plurality of loop nests by a computer system, said method comprising the steps of:
- creating a loop dependence graph (LDG) representing the plurality of loop nests, wherein each of the plurality of loop nests is associated with a node of the LDG, and wherein an edge of the LDG represents a loop-independent data dependence from a source loop nest to a destination loop nest;
associating a weight wij =wji with each pair of nodes i and j, wherein the weight represents a cost savings obtained if loop nests i and j are fused;
determining a set, AN×
N, of directed summary arcs on N, wherein N is a set of nodes of the loop dependence graph, N={1, 2, . . . , n};
determining a set, C, of contractable arcs wherein C is a subset of the set of directed summary arcs, CA;
determining a set, NC, of noncontractable arcs wherein NC is a subset of the set of directed summary arcs, NC=A-Cdetermining a set, B, of unordered pairs of nodes (i,j) that have a wij >
0 and that are not connected by a summary arc, both (i,j).epsilon slash.A and (j,i).epsilon slash.A;
associating a (0,1)-variable xij with each unordered pair of nodes (i,j), and setting the xij =0 if and only if i and j are placed in a same cluster and setting xij =1 otherwise;
determining a set of multipliers π
i (i.di-elect cons.N) wherein each of the multipliers π
i are an integer between 0 and n=|N|, wherein π
j -π
i ≧
1 for all (i,j).di-elect cons.A, wherein π
i =π
j if xij =0, wherein π
j -π
i ≧
1 if xij =1 and (i,j).di-elect cons.A; and
minimizing a summation of a product of wij and xij over a set of nodes (i,j) comprising a union of the set B and the set of contractable arcs C, (i,j).di-elect cons.B∪
C, subject to a set of constraints comprising;
##EQU10## performing loop fusion on the plurality of loop nests using loop partitions based on the minimized summation.
1 Assignment
0 Petitions
Accused Products
Abstract
An integer programming formulation for weighted loop fusion is presented. Loop fusion is a well-known program transformation that has shown to be effective in reducing loop overhead and improving register and cache locality. Weighted loop fusion is the problem of finding a legal partition of loop nests into fusible clusters so as to minimize the total inter-cluster edge weights. Past work has shown that the weighted loop fusion problem is NP-hard. Despite the NP-hardness property, the present invention provides optimal solutions that may be found efficiently, in the context of an optimizing compiler, for weighted loop fusion problem sizes that occur in practice. An integer programming formulation for weighted loop fusion with a problem size (number of variables and constraints) that is linearly proportional to the size of the input weighted loop fusion problem is also presented. The integer programming formulation may be solved efficiently using a general integer programming package. Alternatively, a custom branch-and-bound procedure for the integer programming formulation is presented that is more efficient than the procedures used in general integer programming.
124 Citations
9 Claims
-
1. A method of performing weighted loop fusion on a plurality of loop nests by a computer system, said method comprising the steps of:
-
creating a loop dependence graph (LDG) representing the plurality of loop nests, wherein each of the plurality of loop nests is associated with a node of the LDG, and wherein an edge of the LDG represents a loop-independent data dependence from a source loop nest to a destination loop nest; associating a weight wij =wji with each pair of nodes i and j, wherein the weight represents a cost savings obtained if loop nests i and j are fused; determining a set, AN×
N, of directed summary arcs on N, wherein N is a set of nodes of the loop dependence graph, N={1, 2, . . . , n};determining a set, C, of contractable arcs wherein C is a subset of the set of directed summary arcs, CA; determining a set, NC, of noncontractable arcs wherein NC is a subset of the set of directed summary arcs, NC=A-C determining a set, B, of unordered pairs of nodes (i,j) that have a wij >
0 and that are not connected by a summary arc, both (i,j).epsilon slash.A and (j,i).epsilon slash.A;associating a (0,1)-variable xij with each unordered pair of nodes (i,j), and setting the xij =0 if and only if i and j are placed in a same cluster and setting xij =1 otherwise; determining a set of multipliers π
i (i.di-elect cons.N) wherein each of the multipliers π
i are an integer between 0 and n=|N|, wherein π
j -π
i ≧
1 for all (i,j).di-elect cons.A, wherein π
i =π
j if xij =0, wherein π
j -π
i ≧
1 if xij =1 and (i,j).di-elect cons.A; andminimizing a summation of a product of wij and xij over a set of nodes (i,j) comprising a union of the set B and the set of contractable arcs C, (i,j).di-elect cons.B∪
C, subject to a set of constraints comprising;
##EQU10## performing loop fusion on the plurality of loop nests using loop partitions based on the minimized summation. - View Dependent Claims (2, 3)
-
-
4. An article of manufacture for use in a computer system for performing weighted loop fusion on a plurality of loop nests by the computer system, said article of manufacture comprising a computer-readable storage medium having a computer program embodied in said medium which may cause the computer system to:
-
create a loop dependence graph (LDG) representing the plurality of loop nests, wherein each of the plurality of loop nests is associated with a node of the LDG, and wherein an edge of the LDG represents a loop-independent data dependence from a source loop nest to a destination loop nest; associate a weight wij =wji with each pair of nodes i and j, wherein the weight represents a cost savings obtained if loop nests i and j are fused; determine a set, AN×
N, of directed summary arcs on N, wherein N is a set of nodes of the loop dependence graph, N={1, 2, . . . , n};determine a set, C, of contractable arcs wherein C is a subset of the set of directed summary arcs, CA; determine a set, NC, of noncontractable arcs wherein NC is a subset of the set of directed summary arcs, NC=A-C determine a set, B, of unordered pairs of nodes (i,j) that have a wij >
0 and that are not connected by a summary arc, both (i,j).epsilon slash.A and (j,i).epsilon slash.A;associate a (0,1)-variable xij with each unordered pair of nodes (i,j), and setting the xij =0 if and only if i and j are placed in a same cluster and setting xij =1 otherwise; determine a set of multipliers π
i (i.di-elect cons.N) wherein each of the multipliers π
i are an integer between 0 and n=|N|, wherein π
j -π
i ≧
1 for all (i,j).di-elect cons.A, wherein π
i =π
j if xij =0, wherein π
j -π
i ≧
1 if xij =1 and (i,j).di-elect cons.A; andminimize a summation of a product of wij and xij over a set of nodes (i,j) comprising a union of the set B and the set of contractable arcs C, (i,j).di-elect cons.B∪
C, subject to a set of constraints comprising;
##EQU13## perform loop fusion on the plurality of loop nests using loop partitions based on the minimized summation. - View Dependent Claims (5, 6)
-
-
7. A computer system for performing weighted loop fusion on a plurality of loop nests by the computer system, said computer system comprising:
-
means for creating a loop dependence graph (LDG) representing the plurality of loop nests, wherein each of the plurality of loop nests is associated with a node of the LDG, and wherein an edge of the LDG represents a loop-independent data dependence from a source loop nest to a destination loop nest; means for associating a weight wij =wji with each pair of nodes i and j, wherein the weight represents a cost savings obtained if loop nests i and j are fused; means for determining a set, AN×
N, of directed summary arcs on N, wherein N is a set of nodes of the loop dependence graph, N={1, 2, . . . , n};means for determining a set, C, of contractable arcs wherein C is a subset of the set of directed summary arcs, CA; means for determining a set, NC, of noncontractable arcs wherein NC is a subset of the set of directed summary arcs, NC=A-C means for determining a set, B, of unordered pairs of nodes (i,j) that have a wij >
0 and that are not connected by a summary arc, both (i,j).epsilon slash.A and (j,i).epsilon slash.A;means for associating a (0,1)-variable xij with each unordered pair of nodes (i,j), where the xij =0 if and only if i and j are placed in a same cluster and xij =1 otherwise; means for determining a set of multipliers π
i (i.di-elect cons.N) wherein each of the multipliers π
i are an integer between 0 and n=|N|, wherein π
j -π
i ≧
1 for all (i,j).di-elect cons.A, wherein π
i =π
j if xij =0, wherein π
j -π
i ≧
1 if xij =1 and (i,j).di-elect cons.A; andmeans for minimizing a summation of a product of wij and xij over a set of nodes (i,j) comprising a union of the set B and the set of contractable arcs ##EQU16## means for performing loop fusion on the plurality of loop nests using loop partitions based on the minimized summation. - View Dependent Claims (8, 9)
-
Specification