Parallel data processing systems and methods using cooperative thread arrays and SIMD instruction issue
First Claim
1. A computer-implemented method for processing data, the method comprising:
defining, in a multithreaded processor having a plurality of processing engines configured to execute threads in single-instruction, multiple-data (SIMD) groups, a thread array having a plurality of threads, each thread configured to execute a same program on an input data set, wherein the SIMD groups each have a degree of parallelism P;
launching the threads of the thread array in one or more SIMD groups, wherein launching each SIMD group includes:
assigning a unique thread identifier value to each thread in the SIMD group, wherein each unique thread identifier value is unique within the SIMD group; and
signaling the parallel processing engines to begin executing the SIMD group; and
executing the one or more SIMD groups concurrently with each other, wherein during execution, each thread of the thread array uses the unique thread identifier value assigned thereto as an input to compute at least one function specified by the same program, and an intermediate result from a first one of the threads of the SIMD group is shared with a second one of the threads of the SIMD group based on the respective thread identifiers of the first and second threads.
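The launching step recited in the claim can be pictured with a small software sketch. This is only an illustration, not the claimed hardware: the function name `launch_simd_groups` and the base-plus-lane numbering are assumptions made for the example. It partitions a thread array into SIMD groups of width at most P and derives every thread identifier in a group from one base value, which models assigning a group's identifiers together. The identifiers here happen to be unique across the whole array, which also makes them unique within each group.

```python
def launch_simd_groups(num_threads, p):
    """Split a thread array into SIMD groups of at most p threads.

    Hypothetical helper: returns one list of thread IDs per SIMD group.
    """
    groups = []
    for base in range(0, num_threads, p):
        # The last group may be narrower than p if num_threads is not a multiple of p.
        width = min(p, num_threads - base)
        # Every ID in the group is the shared base plus a lane index, so the
        # whole group's IDs follow from a single value (a stand-in for
        # generating them in parallel).
        groups.append([base + lane for lane in range(width)])
    return groups


print(launch_simd_groups(10, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

With 10 threads and P = 4, the sketch yields two full groups and one partial group of two threads.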
Abstract
Parallel data processing systems and methods use cooperative thread arrays (CTAs), i.e., groups of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique identifier (thread ID) that can be assigned at thread launch time and that controls various aspects of the thread's processing behavior, such as the portion of the input data set to be processed by each thread, the portion of the output data set to be produced by each thread, and/or sharing of intermediate results among threads. Where groups of threads are executed in SIMD parallelism, thread IDs for threads in the same SIMD group are generated and assigned in parallel, allowing different SIMD groups to be launched in rapid succession.
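The role of the thread ID described in the abstract can be sketched in ordinary Python. This is a sequential simulation under stated assumptions, not the patented implementation: the function `run_thread_array`, the squaring step, and the "read your right-hand neighbor" sharing pattern are all hypothetical choices made for illustration. Each simulated thread uses its ID to pick its input element, its output slot, and the peer whose intermediate result it reads.

```python
def run_thread_array(data):
    """Simulate one thread per input element (hypothetical example program)."""
    n = len(data)
    # Stands in for storage that every thread of the array can read and write.
    shared = [0] * n

    # Phase 1: thread `tid` processes the input element selected by its ID
    # and publishes an intermediate result (here, the square) at slot `tid`.
    for tid in range(n):
        shared[tid] = data[tid] * data[tid]

    # Finishing the first loop before starting the second plays the role of
    # a barrier: all intermediate results exist before any thread reads one.

    # Phase 2: thread `tid` combines its own intermediate result with the one
    # produced by the thread whose ID is (tid + 1) mod n, then writes the
    # output element selected by its ID.
    out = [0] * n
    for tid in range(n):
        neighbor = (tid + 1) % n
        out[tid] = shared[tid] + shared[neighbor]
    return out


print(run_thread_array([1, 2, 3, 4]))  # [5, 13, 25, 17]
```

The same program runs for every thread; only the thread ID differs, and it alone determines which data each thread touches and which peer it shares with.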
24 Claims
1. A computer-implemented method for processing data, the method comprising:
defining, in a multithreaded processor having a plurality of processing engines configured to execute threads in single-instruction, multiple-data (SIMD) groups, a thread array having a plurality of threads, each thread configured to execute a same program on an input data set, wherein the SIMD groups each have a degree of parallelism P;
launching the threads of the thread array in one or more SIMD groups, wherein launching each SIMD group includes:
assigning a unique thread identifier value to each thread in the SIMD group, wherein each unique thread identifier value is unique within the SIMD group; and
signaling the parallel processing engines to begin executing the SIMD group; and
executing the one or more SIMD groups concurrently with each other, wherein during execution, each thread of the thread array uses the unique thread identifier value assigned thereto as an input to compute at least one function specified by the same program, and an intermediate result from a first one of the threads of the SIMD group is shared with a second one of the threads of the SIMD group based on the respective thread identifiers of the first and second threads.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 20, 21
12. A processor comprising:
a processing core including a plurality of processing engines configured to execute threads in single-instruction, multiple-data (SIMD) groups, each SIMD group having a degree of parallelism P; and
core interface logic coupled to the processing core and configured to initiate execution by the processing core of a thread array having a plurality of threads, each thread configured to execute a same program on an input data set, the core interface logic including a launch module configured to launch the threads of the thread array in one or more SIMD groups, wherein launching each SIMD group includes:
assigning a unique thread identifier value to each thread in the SIMD group, wherein each unique thread identifier value is unique within the SIMD group; and
signaling the processing engines to begin executing the SIMD group,
wherein during execution, each thread of the thread array uses the unique thread identifier value assigned thereto as an input to compute at least one function specified by the same program, and an intermediate result from a first one of the threads of the SIMD group is shared with a second one of the threads of the SIMD group based on the respective thread identifiers of the first or second threads.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 22, 23
24. A computer-implemented method for processing data, the method comprising:
defining, in a multithreaded processor having a plurality of processing engines configured to execute threads in single-instruction, multiple-data (SIMD) groups, a thread array having a plurality of threads, each thread configured to execute a same program on an input data set, wherein the SIMD groups each have a degree of parallelism P; and
launching the threads of the thread array in one or more SIMD groups, wherein launching each SIMD group includes:
assigning a unique thread identifier value to each thread in the SIMD group, wherein each unique thread identifier value is unique within the SIMD group;
signaling the parallel processing engines to begin executing the SIMD group; and
executing the one or more SIMD groups concurrently with each other, wherein the executing includes:
generating an intermediate result from a first one of the threads of the SIMD group; and
sharing the intermediate result with a second one of the threads of the SIMD group based on the respective thread identifiers of the first and second threads.
Specification