DISTRIBUTED TILED CACHING
Abstract
One embodiment of the present invention sets forth a graphics subsystem configured to implement distributed cache tiling. The graphics subsystem includes one or more world-space pipelines, one or more screen-space pipelines, one or more tiling units, and a crossbar unit. Each world-space pipeline is implemented in a different processing entity and is coupled to a different tiling unit. Each screen-space pipeline is implemented in a different processing entity and is coupled to the crossbar unit. The tiling units are configured to receive primitives from the world-space pipelines, generate cache tile batches based on the primitives, and transmit the primitives to the screen-space pipelines. One advantage of the disclosed approach is that primitives are processed in application-programming-interface order in a highly parallel tiling architecture. Another advantage is that primitives are processed in cache tile order, which reduces memory bandwidth consumption and improves cache memory utilization.
21 Claims
1. A graphics subsystem, comprising:
- a plurality of world-space pipelines, wherein each world-space pipeline is implemented in a different processing entity and is coupled to a crossbar unit; and
- a plurality of screen-space pipelines, wherein each screen-space pipeline is implemented in a different processing entity and is coupled to a different corresponding tiling unit in a plurality of tiling units, wherein each tiling unit is configured to receive primitives from the crossbar unit, aggregate the primitives into one or more cache tile batches, and transmit the one or more cache tile batches to the screen-space pipeline corresponding to the tiling unit.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computing device, comprising:
- a graphics subsystem, comprising:
  - a plurality of world-space pipelines, wherein each world-space pipeline is implemented in a different processing entity and is coupled to a crossbar unit; and
  - a plurality of screen-space pipelines, wherein each screen-space pipeline is implemented in a different processing entity and is coupled to a different corresponding tiling unit, wherein each tiling unit is configured to receive primitives from the crossbar unit, aggregate the primitives into one or more cache tile batches, and transmit the one or more cache tile batches to the screen-space pipeline corresponding to the tiling unit.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20
21. A method for performing distributed cache tiling, comprising:
- transmitting primitives from a plurality of world-space pipelines to a crossbar unit;
- receiving primitives from the crossbar unit;
- aggregating the primitives into one or more cache tile batches; and
- transmitting the one or more cache tile batches to a screen-space pipeline in a plurality of screen-space pipelines.
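The data flow recited in the claims (world-space pipelines, then a crossbar unit, then one tiling unit per screen-space pipeline) can be illustrated with a minimal software sketch. It is not the claimed hardware. Everything specific in it is an assumption made for illustration: the 64-pixel cache tile size, the `owner` rule that maps a cache tile to a tiling unit, the use of bounding boxes to decide which cache tiles a primitive touches, re-sorting by submission order at the crossbar, and the choice of one batch per cache tile. The names `Primitive`, `covered_tiles`, `owner`, `crossbar`, and `build_batches` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

CACHE_TILE = 64  # assumed cache tile edge length in pixels (not from the claims)


@dataclass(frozen=True)
class Primitive:
    api_order: int  # position in application submission order
    x0: int         # inclusive screen-space bounding box
    y0: int
    x1: int
    y1: int


def covered_tiles(prim, tile=CACHE_TILE):
    """Return the (tx, ty) cache tiles overlapped by the primitive's bounding box."""
    return [(tx, ty)
            for ty in range(prim.y0 // tile, prim.y1 // tile + 1)
            for tx in range(prim.x0 // tile, prim.x1 // tile + 1)]


def owner(tile_xy, num_units):
    """Assumed rule: which tiling unit (and screen-space pipeline) owns a cache tile."""
    tx, ty = tile_xy
    return (tx + ty) % num_units


def crossbar(world_space_outputs, num_units):
    """Merge the outputs of all world-space pipelines and route each primitive
    to every tiling unit that owns at least one cache tile it touches."""
    inboxes = [[] for _ in range(num_units)]
    merged = sorted((p for out in world_space_outputs for p in out),
                    key=lambda p: p.api_order)  # assumed: restore submission order
    for prim in merged:
        for unit in sorted({owner(t, num_units) for t in covered_tiles(prim)}):
            inboxes[unit].append(prim)
    return inboxes


def build_batches(unit_id, inbox, num_units):
    """Aggregate a tiling unit's primitives into one batch per owned cache tile.
    Each batch lists api_order values in arrival (submission) order."""
    batches = defaultdict(list)
    for prim in inbox:
        for t in covered_tiles(prim):
            if owner(t, num_units) == unit_id:
                batches[t].append(prim.api_order)
    return dict(batches)


# Two world-space pipelines feeding two tiling units.
ws0 = [Primitive(0, 0, 0, 10, 10), Primitive(2, 60, 0, 70, 10)]
ws1 = [Primitive(1, 0, 0, 130, 10)]
inboxes = crossbar([ws0, ws1], num_units=2)
batches_per_unit = [build_batches(u, inbox, 2) for u, inbox in enumerate(inboxes)]
```

In this sketch each entry of `batches_per_unit` stands for the cache tile batches that one tiling unit would hand to its own screen-space pipeline. A real tiling unit would presumably bound batch size and flush under resource pressure; the sketch leaves that out.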
Specification