Selective thread spawning within a multi-threaded processing system
Abstract
One embodiment of the present invention sets forth a technique for selectively spawning threads within a multiprocessing system. A computation work distributor (CWD), within the system, is responsible for performing the detailed work needed to spawn a thread grid. A request to the CWD to spawn a thread grid includes a predicate table, which includes an array of flags used to indicate which thread indices should have an associated thread block spawned and which should not. Greater efficiency is achieved by only spawning thread blocks that should perform useful computation.
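As a rough illustration of the idea in the abstract, the sketch below models the predicate table as a flat list of boolean flags, one per thread block index, and launches a block only where the flag is set. The names `spawn_grid` and `launch_block` are hypothetical; they stand in for CWD behavior that the abstract does not detail, so this is an assumption-laden sketch rather than the patented implementation.

```python
from typing import Callable, List, Sequence


def spawn_grid(predicate_table: Sequence[bool],
               launch_block: Callable[[int], None]) -> int:
    """Launch only the thread blocks whose predicate flag is set.

    Returns the number of blocks actually spawned. `launch_block` is a
    hypothetical callback standing in for the hardware launch path.
    """
    spawned = 0
    for block_index, should_spawn in enumerate(predicate_table):
        if should_spawn:
            launch_block(block_index)
            spawned += 1
    return spawned


# Example: an 8-block grid in which only three blocks have useful work.
launched: List[int] = []
flags = [True, False, False, True, False, False, False, True]
count = spawn_grid(flags, launched.append)
```

In this example five of the eight blocks are never launched, which is the kind of saving the abstract attributes to spawning only the blocks that perform useful computation.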
22 Claims
1. A method for selectively spawning thread blocks in a multiprocessing system, the method comprising:
receiving a request to execute a thread program, wherein the request includes a reference to the thread program to be executed and a reference to a predicate table that includes a plurality of entries corresponding to a plurality of thread indices, the plurality of entries indicating which thread blocks of a thread grid should execute the thread program, and each thread block including one or more threads, each thread to execute the thread program on a parallel processing engine;
initializing one or more loop variables, wherein each loop variable is associated with a different dimension of the predicate table, and a value of each loop variable indicates an index into the predicate table in the associated dimension;
computing a thread index based on the one or more loop variables; and
reading, by a processor, the predicate table at the entry corresponding to the computed thread index to determine whether a thread block associated with the computed thread index should be spawned.
Dependent claims: 2, 3, 4, 5, 6, 7
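The steps recited in claim 1 (one loop variable per table dimension, a thread index computed from those variables, and a table read at that index) can be sketched as follows. The row-major linearization and the helper names `linear_index` and `blocks_to_spawn` are assumptions made for illustration; the claim does not specify how the index is computed from the loop variables.

```python
from itertools import product
from math import prod
from typing import Iterator, Sequence, Tuple


def linear_index(loop_vars: Sequence[int], dims: Sequence[int]) -> int:
    """Combine per-dimension loop variables into one table index.

    Assumes a row-major layout (last dimension varies fastest).
    """
    index = 0
    for value, extent in zip(loop_vars, dims):
        assert 0 <= value < extent, "loop variable out of range"
        index = index * extent + value
    return index


def blocks_to_spawn(predicate_table: Sequence[bool],
                    dims: Sequence[int]) -> Iterator[Tuple[int, ...]]:
    """Yield the loop-variable tuple of every block whose flag is set."""
    assert len(predicate_table) == prod(dims), "table size must match dims"
    # One loop variable per dimension, each starting at 0.
    for loop_vars in product(*(range(extent) for extent in dims)):
        thread_index = linear_index(loop_vars, dims)
        if predicate_table[thread_index]:
            yield loop_vars


# Example: a 2 x 3 grid stored as six flags in row-major order.
table = [True, False, False,
         False, True, True]
selected = list(blocks_to_spawn(table, dims=(2, 3)))
```

Each yielded tuple identifies a block that would be spawned; blocks whose entry is false are skipped without any launch work.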
8. A non-transitory computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to selectively spawn thread blocks in a multiprocessing system, by performing:
receiving a request to execute a thread program, wherein the request includes a reference to the thread program to be executed and a reference to a predicate table that includes a plurality of entries corresponding to a plurality of thread indices, the plurality of entries indicating which thread blocks of a thread grid should execute the thread program, and each thread block including one or more threads, each thread to execute the thread program on a parallel processing engine;
initializing one or more loop variables, wherein each loop variable is associated with a different dimension of the predicate table, and a value of each loop variable indicates an index into the predicate table in the associated dimension;
computing a thread index based on the one or more loop variables; and
reading the predicate table at the entry corresponding to the computed thread index to determine whether a thread block associated with the computed thread index should be spawned.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system for selectively spawning thread blocks, comprising:
a memory; and
a parallel processing unit coupled to the memory and having a multiprocessing architecture and including:
  a work distribution unit configured to:
    receive a request to execute a thread program, wherein the request includes a reference to the thread program to be executed and a reference to a predicate table that includes a plurality of entries corresponding to a plurality of thread indices, the plurality of entries indicating which thread blocks of a thread grid should execute the thread program, and each thread block including one or more threads, each thread executing the thread program on a parallel processing engine,
    initialize one or more loop variables, wherein each loop variable is associated with a different dimension of the predicate table, and a value of each loop variable indicates an index into the predicate table in the associated dimension,
    compute a thread index based on the one or more loop variables, and
    read the predicate table at the entry corresponding to the computed thread index to determine whether a thread block associated with the computed thread index should be spawned,
  a scheduler coupled to the work distribution unit, and
  a plurality of processing engines coupled to the scheduler.
Dependent claims: 16, 17, 18, 19, 20, 21, 22
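The coupling recited in claim 15 (work distribution unit, then scheduler, then processing engines) can be pictured with the toy model below. The class names, the `submit` and `execute` methods, and the round-robin assignment policy are all assumptions chosen for illustration; the claim states only that these components are coupled, not how the scheduler places blocks on engines.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Scheduler:
    """Hypothetical scheduler: assigns spawned blocks to engines round-robin."""
    num_engines: int
    assignments: Dict[int, List[int]] = field(default_factory=dict)
    _next: int = 0

    def submit(self, block_index: int) -> None:
        # Record the block under the next engine, then advance the cursor.
        engine = self._next
        self.assignments.setdefault(engine, []).append(block_index)
        self._next = (self._next + 1) % self.num_engines


@dataclass
class WorkDistributionUnit:
    """Hypothetical work distributor: forwards only flagged blocks."""
    scheduler: Scheduler

    def execute(self, predicate_table: Sequence[bool]) -> None:
        for block_index, flag in enumerate(predicate_table):
            if flag:
                self.scheduler.submit(block_index)


# Example: six candidate blocks, four flagged, two processing engines.
scheduler = Scheduler(num_engines=2)
WorkDistributionUnit(scheduler).execute([True, False, True, True, False, True])
```

Because filtering happens in the work distribution unit, the scheduler in this model never sees the blocks whose predicate entry is false.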
Specification