Instruction culling in graphics processing unit
Abstract
Aspects of the disclosure are directed to a method of processing data with a graphics processing unit (GPU). According to some aspects, the method includes executing a first work item with a shader processor of the GPU, wherein the first work item includes one or more instructions for processing input data. The method also includes generating one or more values based on a result of the first work item, wherein the one or more values represent one or more characteristics of the result. The method also includes determining whether to execute a second work item based on the one or more values, wherein the second work item includes one or more instructions that are distinct from the one or more instructions of the first work item for processing the input data.
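The flow described in the abstract can be illustrated with a minimal host-side sketch. It is not the disclosed implementation: the function names (`first_work_item`, `characteristic_value`, `second_work_item`, `run`), the threshold of 128, and the choice of a 0/1 value are all assumptions made for illustration, and plain Python stands in for shader code.

```python
from typing import List, Optional, Tuple


def first_work_item(pixel: int) -> int:
    """Hypothetical first work item: keep values at or above a threshold."""
    return pixel if pixel >= 128 else 0


def characteristic_value(result: int) -> int:
    """Hypothetical value describing the result: 1 if it is still relevant, else 0."""
    return 1 if result != 0 else 0


def second_work_item(pixel: int) -> int:
    """Hypothetical second work item: distinct instructions on the same input data."""
    return pixel * 2


def run(pixels: List[int]) -> Tuple[List[Optional[int]], int]:
    """Run the first work item on every input, then decide per input
    whether the second work item is executed at all."""
    outputs: List[Optional[int]] = []
    executed = 0
    for p in pixels:
        result = first_work_item(p)
        value = characteristic_value(result)
        if value:
            # The value says the result is relevant: execute the second work item.
            outputs.append(second_work_item(p))
            executed += 1
        else:
            # The value says the result is irrelevant: skip the second work item.
            outputs.append(None)
    return outputs, executed
```

In this sketch, the second work item runs only for the inputs whose first-stage result was judged relevant, which is the saving the abstract points to.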
30 Citations
25 Claims
1. A method of processing data with a graphics processing unit (GPU), the method comprising:
- executing, with one or more shader processors of the GPU, a first work item of a first kernel of an application that includes the first kernel and one or more consecutively executed second kernels, wherein the first work item includes one or more instructions for processing input data;
- generating, in addition to a result of the first work item, a plurality of cull values based on the result of the first work item of the first kernel, wherein the plurality of cull values indicate whether to execute work items of the one or more second kernels on the input data; and
- when the plurality of cull values indicate that the work items of the one or more second kernels are not to be executed, determining not to execute the work items of the one or more second kernels and removing the work items of the one or more second kernels from the instruction stream prior to scheduling the work items to be executed by the one or more shader processors.
Dependent claims: 2, 3, 4, 5, 6
7. An apparatus for processing data with a graphics processing unit (GPU), the apparatus comprising:
- one or more shader processors configured to: execute a first work item of the first kernel of an application that includes the first kernel and one or more consecutively executed second kernels that includes one or more instructions for processing input data, and generate, in addition to a result of the first work item, a plurality of cull values based on the result of the first work item of the first kernel, wherein the plurality of cull values indicate whether to execute work items of the one or more second kernels on the input; and
- a cull module configured to, when the plurality of cull values indicate that the work items of the one or more second kernels are not to be executed, determine not to execute the work items of the one or more second kernels and remove the work items of the one or more second kernels from the instruction stream prior to scheduling the work items to be executed by the one or more shader processors.
Dependent claims: 8, 9, 10, 11, 12, 13
14. A non-transitory computer-readable storage medium encoded with instructions for causing one or more processors of a computing device to:
- execute, with one or more shader processors of a GPU of the computing device, a first work item of a first kernel of an application that includes the first kernel and one or more consecutively executed second kernels, wherein the first work item includes one or more instructions for processing input data;
- generate, in addition to a result of the first work item, a plurality of cull values based on the result of the first work item of the first kernel, wherein the plurality of cull values indicate whether to execute work items of the one or more second kernels on the input data; and
- when the plurality of cull values indicate that the work items of the one or more second kernels are not to be executed, determine not to execute the work items of the one or more second kernels and remove the work items of the one or more second kernels from the instruction stream prior to scheduling the work items to be executed by the one or more shader processors.
Dependent claims: 15, 16, 17, 18, 19
20. An apparatus for processing data with a graphics processing unit (GPU), the apparatus comprising:
- a means for executing, with one or more shader processors of the GPU, a first work item of a first kernel of an application that includes the first kernel and one or more consecutively executed second kernels, wherein the first work item includes one or more instructions for processing input data;
- a means for generating, in addition to a result of the first work item, a plurality of cull values based on the result of the first work item of the first kernel, wherein the plurality of cull values indicate whether to execute work items of the one or more second kernels on the input data; and
- a means for determining, when the plurality of cull values indicate that the work items of the one or more second kernels are not to be executed, not to execute the work items of the one or more second kernels and removing the work items of the one or more second kernels from the instruction stream prior to scheduling the work items to be executed by the one or more shader processors.
Dependent claims: 21, 22, 23, 24, 25
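The step shared by the independent claims, removing second-kernel work items from the instruction stream before they are scheduled, can be sketched roughly as follows. This is a simplified host-side model under stated assumptions, not the claimed apparatus: `WorkItem`, `first_kernel`, `cull_module`, `process`, the kernel names `k2` and `k3`, and the numeric thresholds are hypothetical, and the sketch assumes one cull value per second kernel with `True` meaning "do not execute".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class WorkItem:
    kernel: str      # name of the kernel this work item belongs to
    data_index: int  # index of the input data element it would process


def first_kernel(value: int) -> Tuple[int, List[bool]]:
    """Hypothetical first kernel: returns its result plus a list of cull values.
    Assumption: one value per second kernel, True meaning 'cull'."""
    result = value - 100
    cull_values = [result <= 0, result <= 50]  # for "k2" and "k3" respectively
    return result, cull_values


def cull_module(stream: List[WorkItem],
                cull_buffer: Dict[int, List[bool]],
                second_kernels: List[str]) -> List[WorkItem]:
    """Drop culled second-kernel work items from the stream before scheduling."""
    kept = []
    for item in stream:
        if item.kernel in second_kernels:
            values = cull_buffer.get(item.data_index)
            if values is not None and values[second_kernels.index(item.kernel)]:
                continue  # removed from the stream; never reaches the scheduler
        kept.append(item)
    return kept


def process(data: List[int]) -> Tuple[Dict[int, int], List[WorkItem]]:
    second_kernels = ["k2", "k3"]
    results: Dict[int, int] = {}
    cull_buffer: Dict[int, List[bool]] = {}
    # Execute the first kernel's work items, recording results and cull values.
    for i, v in enumerate(data):
        results[i], cull_buffer[i] = first_kernel(v)
    # Build the stream of second-kernel work items, then cull before scheduling.
    stream = [WorkItem(k, i) for k in second_kernels for i in range(len(data))]
    return results, cull_module(stream, cull_buffer, second_kernels)
```

For the input `[90, 120, 200]`, six second-kernel work items enter the stream and only three remain for the scheduler. The filtering happens before any of them is dispatched, which is the ordering the claims emphasize.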
Specification