System and method for managing processor-in-memory (PIM) operations
Abstract
A system and method of compiling program code, wherein the program code includes an operation on an array of data elements stored in memory of a computer system. The program code is scanned for operations that are vectorizable. The vectorizable operations are examined to determine whether they should be executed at least in part in a vector atomic memory operation (AMO) functional unit attached to memory. If so, the compiled code includes vector AMO instructions.
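The decision flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented compiler. `LoopOp`, `Target`, `choose_target`, and `compile_loop` are hypothetical names, and the `indexed_update` flag is an assumed stand-in for whatever test the compiler actually applies when deciding that an operation should use a vector AMO instruction. The scalar fallback for non-vectorizable operations is likewise an assumption, since the source does not describe that case.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    SCALAR = "scalar"            # assumption: non-vectorizable code stays scalar
    VECTOR_AMO = "vector_amo"    # vector AMO functional unit attached to memory
    VECTOR_UNIT = "vector_unit"  # vector functional unit inside a processor


@dataclass
class LoopOp:
    """Hypothetical summary of one operation found while scanning a loop."""
    text: str
    vectorizable: bool
    indexed_update: bool  # assumed heuristic, not taken from the source


def choose_target(op: LoopOp) -> Target:
    """Pick where one operation is compiled to run."""
    if not op.vectorizable:
        return Target.SCALAR
    if op.indexed_update:
        # Vectorizable and judged suitable for a vector AMO instruction.
        return Target.VECTOR_AMO
    # Vectorizable, but compiled for the processor's vector functional units.
    return Target.VECTOR_UNIT


def compile_loop(ops):
    """Scan every operation and pair its text with the chosen target."""
    return [(op.text, choose_target(op)) for op in ops]
```

Calling `compile_loop` on a list of `LoopOp` values returns one `(text, Target)` pair per operation, in the order scanned.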
30 Citations
18 Claims
1. A method performed by a computer system during compilation of program code for vectorizing an iterative loop of the program code for execution by a vector computer system having a plurality of processors connected to memory, wherein the memory includes one or more vector atomic memory operation (AMO) functional units and the processors include one or more vector functional units, the method comprising:
- scanning the program code, wherein scanning includes determining whether an operation of the iterative loop is vectorizable;
- if an operation is vectorizable, determining whether the operation should be executed using a vector AMO instruction in one of the vector AMO functional units;
- if an operation is vectorizable and the operation should be executed using a vector AMO instruction in one of the vector AMO functional units, compiling at least a portion of the operation into a vector AMO instruction; and
- if an operation is vectorizable and the operation should not be executed using a vector AMO instruction in one of the vector AMO functional units, compiling at least a portion of the operation to execute in one or more vector functional units of one or more processors.
Dependent claims: 2, 3, 4, 5
6. A computer implemented method of compiling program code for execution by a vector computer system having a plurality of processors connected to memory, wherein the memory includes one or more vector atomic memory operation (AMO) functional units and the processors include one or more vector functional units, the method comprising:
- a) scanning the program code for an operation that is vectorizable;
- b) determining whether some portion of the vectorizable equation should be executed in the vector AMO functional unit; and
- c) generating compiled code to replace the equation with vectorized machine executable code;
- wherein, if a determination was made that some portion of the vectorizable equation should be executed in the vector AMO functional unit, the vectorized machine executable code includes vectorization code for performing a mathematical operation using one or more vector atomic memory operations; and
- wherein, if a determination was made that some portion of the vectorizable equation should not be executed in the vector AMO functional unit, the vectorized machine executable code includes vectorization code for performing vector operations without using the vector AMO functional unit.
Dependent claims: 7, 8, 9, 10, 11
12. An article comprising a memory having instructions for controlling a computer to compile program code into code for execution by a vector atomic memory operation (AMO) functional unit and by a vector functional unit of a vector computer system, the instructions comprising instructions that:
- identify a vectorizable equation within the program code;
- identify a first portion of the vectorizable equation that should be executed using a vector AMO instruction in one of the vector AMO functional units;
- output a vector AMO instruction to implement the first portion;
- identify a second portion of the vectorizable equation that should be executed using a vector AMO instruction in one of the vector AMO functional units; and
- output a vector instruction of a vector functional unit to implement the second portion.
Dependent claims: 13, 14
15. An article comprising a memory having instructions for controlling a computer to compile program code with an iterative loop into code for execution by a vector atomic memory operation (AMO) functional unit and by a vector functional unit of a processor, the instructions comprising instructions that:
- scan program code to determine whether an operation is vectorizable;
- if the operation is vectorizable, determine whether a portion of the operation should be executed using a vector AMO instruction in one of the vector AMO functional units;
- if the portion of the operation should be executed using a vector AMO instruction in one of the vector AMO functional units, compile the portion of the operation to execute in a vector AMO functional unit; and
- if the portion of the operation should not be executed using a vector AMO instruction in one of the vector AMO functional units, compile the portion of the operation to execute in one or more vector functional units of one or more processors.
Dependent claims: 16, 17, 18
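As an illustration of compiling different portions of one operation for different units, the sketch below simulates a loop body of the assumed form `a[ix[i]] += b[i] * c[i]`. The loop itself, the choice of which portion goes where, and the helper names `vector_multiply`, `vector_amo_add`, and `run_split_loop` are all hypothetical. The Python functions only model the effect of each instruction and say nothing about the real hardware or instruction set.

```python
def vector_multiply(b, c):
    """Stand-in for a vector functional unit instruction: elementwise product."""
    return [x * y for x, y in zip(b, c)]


def vector_amo_add(memory, indices, values):
    """Stand-in for a vector AMO add.

    Each element update is modeled as an atomic read-modify-write on
    memory, so repeated indices accumulate instead of overwriting.
    """
    for i, v in zip(indices, values):
        memory[i] += v


def run_split_loop(a, ix, b, c):
    """Model of the compiled loop: one portion per kind of functional unit."""
    t = vector_multiply(b, c)   # portion compiled for the processor's vector unit
    vector_amo_add(a, ix, t)    # portion compiled to a vector AMO instruction
    return a
```

For example, `run_split_loop([0, 0, 0], [0, 2, 0], [1, 2, 3], [10, 10, 10])` returns `[40, 0, 20]`: index 0 appears twice, and both contributions are added.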
Specification