Warp clustering
Abstract
Units of shader work, such as warps or wavefronts, are grouped into clusters. An individual vector register file of a processor is operated as segments, where a segment may be independently operated in an active mode or a reduced power data retention mode. The scheduling of the clusters is selected so that a cluster is allocated a segment of the vector register file. Additional sequencing may be performed for a cluster to reach a synchronization point. Individual segments are placed into the reduced power data retention mode during a latency period when the cluster is waiting for execution of a request, such as a sample request.
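The behavior summarized in the abstract can be sketched as a small simulation. This is only an illustrative model, not the patented implementation: the names `Mode`, `Segment`, `Cluster`, and `simulate`, the fixed compute burst length, the fixed texture latency, and the simple round-robin rotation are all assumptions introduced here. Each cluster owns one register file segment; only the segment of the currently executing cluster is active, and every other segment sits in the data retention mode, including while its cluster waits on a texture (sample) request.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    RETENTION = "retention"  # reduced power, register contents preserved


@dataclass
class Segment:
    """One independently power-controlled slice of the vector register file."""
    mode: Mode = Mode.RETENTION


@dataclass
class Cluster:
    """A group of warps/wavefronts backed by a single segment (hypothetical model)."""
    name: str
    bursts_left: int                      # compute bursts still to execute
    segment: Segment = field(default_factory=Segment)
    ready_at: int = 0                     # cycle when the pending texture load returns


def simulate(clusters, burst=4, texture_latency=10):
    """Rotate execution among clusters.

    Returns (total_cycles, retention_cycles), where retention_cycles maps each
    cluster name to the number of cycles its segment spent in retention mode.
    """
    cycle = 0
    retention_cycles = {c.name: 0 for c in clusters}
    next_index = 0
    while any(c.bursts_left > 0 for c in clusters):
        # Pick the next cluster in rotation order whose texture data has arrived.
        current = None
        for offset in range(len(clusters)):
            cand = clusters[(next_index + offset) % len(clusters)]
            if cand.bursts_left > 0 and cand.ready_at <= cycle:
                current = cand
                next_index = (next_index + offset + 1) % len(clusters)
                break
        # Run one burst, or idle one cycle if every cluster is still waiting.
        span = burst if current is not None else 1
        for c in clusters:
            c.segment.mode = Mode.ACTIVE if c is current else Mode.RETENTION
            if c.segment.mode is Mode.RETENTION:
                retention_cycles[c.name] += span
        cycle += span
        if current is not None:
            # The burst ends by issuing a texture load; the cluster goes inactive.
            current.bursts_left -= 1
            current.ready_at = cycle + texture_latency
    return cycle, retention_cycles
```

For example, `simulate([Cluster("A", 2), Cluster("B", 2)])` reports how long each segment could stay in the low power mode under these assumed timings. A real scheduler would also have to account for the cost of entering and leaving retention mode, which this sketch ignores.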
18 Claims
1. A method of reducing power consumption in a shader of a graphics processing system, the method comprising:
organizing a vector register file into a plurality of segments of physical memory, with each segment having an active mode and a reduced power data retention mode independently selectable from other segments of the vector register file;
allocating each of the segments as a resource for a respective one of a plurality of clusters of multiple shader units of work assigned to a processor and having temporal locality and spatial locality;
scheduling execution of the clusters in a sequence; and
placing each of the segments that are respectively associated with the clusters that are in an inactive state into the reduced power data retention mode during at least a portion of a latency period for a texture load for the clusters.
Dependent claims: 2, 3, 4, 5, 6, 7, 10
8. A method of reducing power consumption in a shader of a graphics processing system, the method comprising:
scheduling clusters of shader work for a plurality of processors, each cluster including a plurality of shader units of work assigned to a processor and having temporal locality and spatial locality;
for each cluster, allocating a respective segment of physical memory of a vector register file as a resource, each segment having an active mode and a reduced power data retention mode independently selectable from other segments;
scheduling execution of the clusters in a sequence;
rotating execution of the clusters; and
placing segments of inactive clusters into the reduced power data retention mode during at least a portion of a latency period for a texture load for the inactive clusters.
Dependent claims: 9, 11, 12, 13, 14
15. A graphics processing unit, comprising:
a plurality of programmable processors to perform Single Instruction Multiple Thread (SIMT) processing of shading instructions, each programmable processor including a vector register file having a plurality of data segments, each segment having an active mode and a reduced power data retention mode independently selectable from other segments;
a scheduler to schedule clusters of shader work for the plurality of programmable processors, each cluster including a plurality of shader units of work assigned to an individual processor and having temporal locality and spatial locality, with each cluster supported by a segment of the vector register file of the assigned individual processor, the scheduler for selecting a schedule to rotate execution of the clusters to place segments of inactive clusters into the reduced power data retention mode during at least a portion of a latency period associated with an operation request by the cluster; and
an external memory comprising a texture unit, wherein segments of inactive clusters are placed in the reduced power data retention mode during at least a portion of a latency period associated with accessing the external memory for a texture access of a cluster.
Dependent claims: 16, 17
18. A graphics processing unit, comprising:
a shader including a programmable processing element;
a vector register file used as a resource for units of shader work in which each unit of shader work has a group of shader threads to perform Single Instruction Multiple Thread (SIMT) processing and multiple groups of shader threads are formed into a cluster, the vector register file allocated as a plurality of individual segments;
a scheduler to group clusters of units of shader work and select a schedule to assign an individual cluster to a segment of the vector register file and place the segment into a reduced power data retention mode during a latency period when the cluster is waiting for a result of a sample request during at least a portion of a latency period associated with an operation request by the cluster; and
an external memory comprising a texture unit, wherein segments of inactive clusters are placed in the reduced power data retention mode during at least a portion of a latency period associated with accessing the external memory for a texture access of a cluster.
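The claims recite clusters of shader units of work "having temporal locality and spatial locality", each backed by its own segment, but this listing does not say how such clusters are formed. One possible reading, purely as an assumption for illustration, is to group warps that cover the same screen tile (spatial locality) while preserving their issue order (temporal locality), then give each resulting cluster one segment. The function names `cluster_warps` and `allocate_segments`, the tile size, and the per-cluster warp limit below are hypothetical.

```python
from collections import defaultdict


def cluster_warps(warps, tile_size=2, max_per_cluster=4):
    """Group warps into clusters by screen tile, keeping issue order.

    `warps` is a list of (warp_id, x, y) tuples in issue order, where (x, y)
    is an assumed screen-space position for the warp's pixels.
    """
    by_tile = defaultdict(list)
    for warp_id, x, y in warps:
        # Warps landing in the same tile are treated as spatially local.
        by_tile[(x // tile_size, y // tile_size)].append(warp_id)

    clusters = []
    for tile in sorted(by_tile):
        ids = by_tile[tile]
        # Split large tiles so a cluster fits in one register file segment;
        # consecutive slices keep warps issued close together in time.
        for start in range(0, len(ids), max_per_cluster):
            clusters.append(ids[start:start + max_per_cluster])
    return clusters


def allocate_segments(clusters, num_segments):
    """Assign each cluster its own segment index (one segment per cluster)."""
    if len(clusters) > num_segments:
        raise ValueError("not enough register file segments for all clusters")
    return {segment: cluster for segment, cluster in enumerate(clusters)}
```

With this mapping in hand, a scheduler could switch an entire cluster's registers between the active and retention modes by toggling a single segment, which is the property the claims rely on.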
Specification