Mechanism to accelerate graphics workloads in a multi-core computing architecture
Abstract
A processing apparatus is described. The apparatus includes a plurality of processing cores, including a first processing core and a second processing core; a first field programmable gate array (FPGA) coupled to the first processing core to accelerate execution of graphics workloads processed at the first processing core; and a second FPGA coupled to the second processing core to accelerate execution of workloads processed at the second processing core.
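The pairing described in the abstract, one FPGA per processing core, can be pictured with a small software model. This is only an illustrative sketch: the class names `Core` and `Fpga`, the `performance_critical` flag, and the workload strings are hypothetical and do not come from the patent, which describes hardware rather than a software interface.

```python
from dataclasses import dataclass, field


@dataclass
class Fpga:
    """Hypothetical stand-in for the FPGA coupled to one core."""
    name: str
    completed: list = field(default_factory=list)

    def execute(self, workload: str) -> str:
        # Record the workload and report where it ran.
        self.completed.append(workload)
        return f"{workload} accelerated on {self.name}"


@dataclass
class Core:
    """Hypothetical processing core that assigns work to its own FPGA."""
    name: str
    fpga: Fpga

    def run(self, workload: str, performance_critical: bool) -> str:
        # Assumption: only performance-critical work is handed to the FPGA;
        # everything else stays on the core.
        if performance_critical:
            return self.fpga.execute(workload)
        return f"{workload} executed on {self.name}"


def build_apparatus() -> list:
    # Two cores, each coupled to its own FPGA, as in the abstract.
    return [Core("core0", Fpga("fpga0")), Core("core1", Fpga("fpga1"))]
```

In this model each core only ever dispatches to the FPGA it is coupled to, which mirrors the one-to-one coupling the abstract describes.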
22 Claims
1. A processor comprising:
a first processing core of a processor coupled to an internal cache of the processor as a direct agent for thread instructions and data; and
a second processing core of the processor coupled to the cache as a direct agent for separate thread instructions and data;
a first programmable integrated circuit (IC) of the processor coupled to the first processing core to execute workloads assigned by the first processing core and coupled to the cache as a direct agent to access data of the first processing core independent of the first processing core to independently execute workloads;
a second programmable IC of the processor coupled to the second processing core to execute workloads assigned by the second processing core and coupled to the cache as a direct agent to access data of the second processing core independent of the second processing core to independently execute workloads, wherein the first and second programmable ICs are field programmable gate arrays (FPGAs) that accelerate execution of performance critical loops in a workload; and
a third processing core coupled to the first programmable IC, wherein the cache comprises a first memory device coupled only to the first processing core, the third processing core and the first programmable IC, and wherein the first programmable IC accelerates execution of workloads processed at the first processing core and the third processing core.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A central processing unit (CPU), comprising:
a plurality of processing cores including a first processing core and a second processing core;
a first field programmable gate array (FPGA) of the CPU coupled to the first processing core to execute workloads assigned by the first processing core;
a first cache memory device of the CPU coupled to the first processing core and the first FPGA, wherein the first processing core and the first FPGA are independently coupled to the first cache memory device as direct agents for independent access to thread instructions and data;
a second FPGA of the CPU coupled to the second processing core to execute workloads assigned by the second processing core;
a second cache memory device of the CPU coupled to the second processing core and the second FPGA, wherein the second processing core and the second FPGA are independently coupled to the second cache memory device as direct agents for independent access to thread instructions and data; and
a third processing core coupled to the first FPGA, wherein the first cache memory device comprises a first memory device coupled only to the first processing core, the third processing core and the first FPGA, and wherein the first FPGA accelerates execution of workloads processed at the first processing core and the third processing core.
Dependent claims: 14, 15, 16, 17
18. A graphics processing unit (GPU), comprising:
a plurality of graphics processing cores of the GPU, including a first graphics processing core and a second graphics processing core;
a first field programmable gate array (FPGA) of the GPU coupled to the first graphics processing core to execute workloads assigned by the first graphics processing core;
a first memory device coupled to the first graphics processing core and the first FPGA, wherein the first graphics processing core and the first FPGA are independently coupled to the first memory device as direct agents for independent access to thread instructions and data;
a second FPGA of the GPU coupled to the second graphics processing core to execute workloads assigned by the second graphics processing core; and
a second memory device coupled to the second graphics processing core and the second FPGA, wherein the second graphics processing core and the second FPGA are independently coupled to the second memory device as direct agents for independent access to thread instructions and data; and
a third processing core coupled to the first FPGA, wherein the first memory device is coupled only to the first graphics processing core, the third processing core and the first FPGA, and wherein the first FPGA accelerates execution of workloads processed at the first graphics processing core and the third processing core.
Dependent claims: 19, 20, 21, 22
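The coupling arrangement recited in the independent claims (a first memory device coupled only to the first core, the third core and the first FPGA, and a second memory device coupled to the second core and the second FPGA) can be written down as a small data model to make the topology easier to follow. This is a reading aid under stated assumptions, not a construction of the claims: the names `cache0`, `core0`, `fpga0` and so on, and both helper functions, are hypothetical, and the second memory device is modeled as having exactly two agents even though the claims do not say "only" for it.

```python
# Hypothetical coupling map: memory device -> set of direct agents.
COUPLINGS = {
    "cache0": {"core0", "core2", "fpga0"},  # first core, third core, first FPGA
    "cache1": {"core1", "fpga1"},           # second core, second FPGA (assumed exclusive)
}

# Hypothetical map of which cores each FPGA accelerates workloads for.
ACCELERATES = {
    "fpga0": {"core0", "core2"},
    "fpga1": {"core1"},
}


def coupled_only_to(device: str, agents: set) -> bool:
    """True if the device's direct agents are exactly the given set."""
    return COUPLINGS[device] == set(agents)


def shares_memory_with_cores(fpga: str) -> bool:
    """True if one memory device couples the FPGA and every core it serves."""
    needed = ACCELERATES[fpga] | {fpga}
    return any(needed <= agents for agents in COUPLINGS.values())
```

For example, `coupled_only_to("cache0", {"core0", "core2", "fpga0"})` holds in this model, while dropping `core2` from the set makes it fail, which is the "coupled only to" condition expressed as set equality.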
Specification