Vector width-aware synchronization-elision for vector processors
Abstract
A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.
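The elision step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented compiler: the `Op` record, its `elide_above_width` field, and the `specialize` function are hypothetical names introduced here, and the only criterion modeled is the abstract's example of a vector width greater than a minimum size.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Op:
    """One operation in a toy program (hypothetical IR, not from the patent)."""
    name: str
    # For a width-contingent call: the call is omitted when the target's
    # vector width is greater than this value. None means "always keep".
    elide_above_width: Optional[int] = None


def specialize(program: List[Op], vector_width: int) -> List[Op]:
    """Return a width-specific copy of `program` for the given vector width."""
    kept = []
    for op in program:
        contingent = op.elide_above_width is not None
        if contingent and vector_width > op.elide_above_width:
            continue  # criterion met: omit the call from this version
        kept.append(op)
    return kept


# Assumed example: a barrier that is only needed on targets up to 32 lanes wide.
kernel = [Op("load"), Op("barrier", elide_above_width=32), Op("store")]
narrow = [op.name for op in specialize(kernel, vector_width=16)]
wide = [op.name for op in specialize(kernel, vector_width=64)]
print(narrow)  # ['load', 'barrier', 'store']
print(wide)    # ['load', 'store']
```

The same source program yields two executables: the narrow target keeps the barrier, while the wide target drops it because the assumed threshold is exceeded.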
16 Claims
1. A non-transitory computer-readable storage medium storing program instructions executable by a computer to implement a compiler configured to:
receive program code that specifies a barrier operation, wherein the barrier operation synchronizes execution of a plurality of work items, by pausing execution of one of the plurality of work items until others of the plurality of work items reach a same execution point;
create a width-specific executable version of the program code, wherein creating the width-specific executable version comprises:
determining a vector width of a target computer system, wherein the vector width is a number of work items that the target computer system is configured to execute in parallel; and
in response to the determined vector width meeting one or more criteria, omitting the barrier operation from the width-specific executable version.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-implemented method of compiling program code, the method comprising:
analyzing program code that includes a function call for a synchronization operation of a plurality of work items, wherein the synchronization operation halts execution for ones of the plurality of work items until others of the plurality of work items reach a same point of execution;
producing a width-specific executable version of the program code, wherein producing the width-specific executable version comprises:
in response to a vector width of a target computer system satisfying a threshold indicated, eliding the function call from the width-specific executable version, wherein the vector width is a number of work items that the target computer system is configured to execute concurrently.
Dependent claims: 9, 10, 11, 12
13. An apparatus, comprising:
a memory having a computer program stored therein;
a vector processor configured to elide a function call in the computer program in response to determining that the function call is superfluous, wherein determining that the function call is superfluous is based on a vector width of the vector processor, wherein the vector width is a number of work items that the vector processor is configured to execute in parallel, and wherein the function call is for a barrier operation to synchronize execution of a plurality of work items, by suspending execution of one of the plurality of work items until others of the plurality of work items reach a similar point in execution.
Dependent claims: 14, 15, 16
Specification