Compiler-generated invocation stubs for data parallel programming model
Abstract
Described herein are techniques for generating invocation stubs for a data parallel programming model so that a data parallel program written in a statically-compiled high-level programming language may be more declarative, reusable, and portable than traditional approaches. With some of the described techniques, invocation stubs are generated by a compiler and those stubs bridge a logical arrangement of data parallel computations to the actual physical arrangement of a target data parallel hardware for that data parallel computation.
20 Claims
1. A method comprising:
- obtaining a representation of a call for a data parallel (DP) function, wherein the representation includes indicators of arguments associated with the call for the DP function;
- generating an invocation stub, based at least in part upon the representation and the associated arguments, the invocation stub including computer executable instructions that bridge a logical arrangement of DP computations of the DP function to a physical arrangement of DP computations to be performed on DP hardware of one or more computing devices by calculating logic indices of each DP activity based on physical thread indices and a logical compute domain.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
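The index calculation recited in claim 1 can be illustrated with a minimal C++ sketch. It is not the patented stub: it assumes a one-dimensional physical dispatch, a rank-2 logical compute domain, and row-major ordering, and every name in it (`PhysicalThread`, `Extent`, `logical_index`) is hypothetical. It shows one plausible way a stub could turn physical thread indices into the logical index of a DP activity, skipping padding threads that fall outside the domain.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical names throughout; the claims do not prescribe an API.
constexpr std::size_t kRank = 2;                // rank of the logical compute domain (assumed)
using Extent = std::array<std::size_t, kRank>;  // size of the logical domain per dimension
using Index  = std::array<std::size_t, kRank>;  // logical index of one DP activity

// Physical coordinates handed to each thread (1-D dispatch assumed).
struct PhysicalThread {
    std::size_t group_id;         // which thread group the thread belongs to
    std::size_t thread_in_group;  // position of the thread inside its group
    std::size_t group_size;       // number of threads per group
};

// Map a physical thread to a logical index, or return nullopt when the
// thread is padding that lies outside the logical compute domain.
std::optional<Index> logical_index(const PhysicalThread& t, const Extent& domain) {
    assert(t.group_size > 0 && t.thread_in_group < t.group_size);

    // Flatten the physical indices into one linear thread id.
    std::size_t linear = t.group_id * t.group_size + t.thread_in_group;

    // Total number of DP activities in the logical domain.
    std::size_t total = 1;
    for (std::size_t d : domain) total *= d;
    if (linear >= total) return std::nullopt;

    // Delinearize in row-major order: the last dimension varies fastest.
    Index idx{};
    for (std::size_t d = kRank; d-- > 0;) {
        idx[d] = linear % domain[d];
        linear /= domain[d];
    }
    return idx;
}
```

With a 3 x 5 logical domain and groups of 8 threads, thread 7 of group 0 has linear id 7 and maps to logical index (1, 2), while thread 7 of group 1 has linear id 15 and is treated as padding.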
14. One or more computer-readable storage devices storing computer-executable instructions that, when executed, cause one or more computer to perform operations, the operations comprising:
- obtaining a representation of a call for a DP function, wherein the representation includes indicators of arguments associated with the call for the DP function;
- generating an invocation stub, based at least in part upon the representation and the associated arguments, the invocation stub including computer executable instructions that bridge a logical arrangement of DP computations of the DP function to a physical arrangement of DP computations to be performed on DP hardware of the one or more computing devices by calculating logic indices of each DP activity based on physical thread indices and a logical compute domain, the DP hardware being capable of performing DP computations and the logical arrangement of the DP computations being defined, at least in part, by the representation of the call for the DP function with its associated arguments, the generating comprising:
  - determining a physical location in the DP hardware where the DP computations will occur for the logical arrangement of the DP computation;
  - choosing an appropriate thread deployment strategy;
  - setting up a field for the DP function with target-dependent resources, wherein the field comprises a data set operated upon by the DP computations of the DP function, the setting up including:
  - examining the arguments associated with the DP function and configuring the data set based upon the examining;
  - declaring how many threads inside a thread group are to be deployed in multiple dimensions;
  - preparing parameters of a kernel based upon what the kernel expects, the kernel being a unit DP computation of the DP function, the preparing including:
  - determining whether the associated arguments include one or more special indices;
  - in response to the determining regarding the one or more special indices, creating index instances based upon the one or more special index types;
  - determining whether the associated arguments match actual parameters;
  - in response to the determining regarding the associated arguments matching actual parameters, broadcasting a value for the kernel to use or projecting a field, wherein the field is a data set operated upon by the DP computations of the DP function.
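The parameter-preparation and thread-declaration steps of claim 14 can be sketched in C++ as follows. This is an illustrative reading of the claim language, not the patented implementation: it assumes one-dimensional fields of `float`, represents a special index as a plain number, and treats "broadcasting" as passing the same scalar to every kernel instance and "projecting" as selecting the field element at the current logical index. The types `IndexRequest`, `Scalar`, `Field` and the functions `prepare_parameters` and `groups_needed` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <variant>
#include <vector>

// Hypothetical model of the arguments carried by a DP call.
struct IndexRequest {};                      // kernel asks for its logical index ("special index")
struct Scalar { float value; };              // one value shared by every DP activity
struct Field  { std::vector<float> data; };  // data set with one element per DP activity (1-D here)
using Argument = std::variant<IndexRequest, Scalar, Field>;

// Build the parameter list that one kernel instance receives at logical index i.
// Parameters are stored as float only to keep the sketch short.
std::vector<float> prepare_parameters(const std::vector<Argument>& args, std::size_t i) {
    std::vector<float> params;
    for (const Argument& a : args) {
        if (std::holds_alternative<IndexRequest>(a)) {
            params.push_back(static_cast<float>(i));  // create an index instance
        } else if (const Scalar* s = std::get_if<Scalar>(&a)) {
            params.push_back(s->value);               // broadcast the same value
        } else {
            const Field& f = std::get<Field>(a);
            assert(i < f.data.size());
            params.push_back(f.data[i]);              // project the field at index i
        }
    }
    return params;
}

// Number of thread groups needed to cover `extent` logical activities along
// one dimension, given the declared threads per group (ceiling division).
std::size_t groups_needed(std::size_t extent, std::size_t threads_per_group) {
    assert(threads_per_group > 0);
    return (extent + threads_per_group - 1) / threads_per_group;
}
```

For a multi-dimensional deployment, `groups_needed` would be applied once per dimension with that dimension's extent and group size. Covering 15 activities with 8 threads per group, for example, takes 2 groups, leaving one padding thread.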
15. A system comprising:
- a data parallel (DP) compute engine comprising one or more graphics processing units; and
- a non-DP host configured with instructions that are executable to perform acts comprising:
  - obtaining a representation of a call for a DP function, wherein the representation includes indicators of arguments associated with the call for the DP function;
  - generating an invocation stub, based at least in part upon the representation and the associated arguments, the invocation stub including computer executable instructions that bridge a logical arrangement of DP computations of the DP function to a physical arrangement of DP computations to be performed on DP hardware of the one or more computing devices, the DP hardware being capable of performing DP computations and the logical arrangement of the DP computations being defined by the representation of the call for a DP function with its associated arguments; the generating comprising:
  - setting up a field for the DP function with target-dependent resources, wherein the field comprises a data set operated upon by the DP computations of the DP function;
  - preparing parameters of a kernel based upon what the kernel expects, the kernel being a unit DP computation of the DP function; and
  - outputting the invocation stub into a memory.

Dependent claims: 16, 17, 18, 19, 20.
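The host-side acts of claim 15 (obtain a representation of the call, generate a stub from it, output the stub into memory) might look roughly like the C++ sketch below. Everything here is an assumption made for illustration: the `CallRepr` structure, the `ArgKind` indicators, and especially the shader-like text that `generate_stub` emits, which is an invented format and not the output of any real compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified representation of a DP call site.
enum class ArgKind { SpecialIndex, BroadcastValue, ProjectedField };

struct CallRepr {
    std::string kernel;         // name of the kernel (unit DP computation)
    std::vector<ArgKind> args;  // indicators of the arguments of the call
};

// Generate stub source text and return it in memory as a string.
// The emitted syntax is made up for this sketch.
std::string generate_stub(const CallRepr& call) {
    assert(!call.kernel.empty());
    std::ostringstream out;
    out << "void " << call.kernel << "_stub(uint tid) {\n"
        << "  if (tid >= extent) return;\n"  // skip padding threads
        << "  " << call.kernel << "(";
    for (std::size_t k = 0; k < call.args.size(); ++k) {
        if (k) out << ", ";
        switch (call.args[k]) {
            case ArgKind::SpecialIndex:   out << "tid"; break;                     // index instance
            case ArgKind::BroadcastValue: out << "arg" << k; break;                // same value for all
            case ArgKind::ProjectedField: out << "arg" << k << "[tid]"; break;     // element at index
        }
    }
    out << ");\n}\n";
    return out.str();
}
```

Calling `generate_stub` with a kernel named `saxpy` and the argument indicators special index, broadcast value, projected field yields a stub whose body invokes `saxpy(tid, arg1, arg2[tid]);`. The point of the sketch is only that the stub's shape is derived mechanically from the call representation.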
Specification