Across-thread out-of-order instruction dispatch in a multithreaded microprocessor

US 7,676,657 B2
Filed: 10/10/2006
Issued: 03/09/2010
Est. Priority Date: 12/18/2003
Status: Active Grant


Abstract

Instruction dispatch in a multithreaded microprocessor such as a graphics processor is not constrained by an order among the threads. Instructions for each thread are fetched, and a dispatch circuit determines which instructions in the buffer are ready to execute. The dispatch circuit may issue any ready instruction for execution, and an instruction from one thread may be issued prior to an instruction from another thread regardless of which instruction was fetched first. If multiple functional units are available, multiple instructions can be dispatched in parallel.
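
The dispatch behavior described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the `Slot` record, the `ready` flag, and the `dispatch` function are hypothetical names, and readiness is modeled as a simple boolean rather than real operand tracking.

```python
from dataclasses import dataclass

@dataclass
class Slot:
    """One buffered instruction (hypothetical structure for illustration)."""
    thread_id: int
    opcode: str
    fetch_order: int   # position in which the instruction was fetched
    ready: bool        # assumed flag: True once the instruction can execute

def pick_ready(buffer):
    """Return every buffered instruction that is ready, ignoring fetch order."""
    return [slot for slot in buffer if slot.ready]

def dispatch(buffer, free_units):
    """Issue up to `free_units` ready instructions in the same cycle."""
    issued = pick_ready(buffer)[:free_units]
    for slot in issued:
        buffer.remove(slot)
    return issued

buffer = [
    Slot(thread_id=0, opcode="TEX", fetch_order=0, ready=False),  # fetched first, not ready
    Slot(thread_id=1, opcode="ADD", fetch_order=1, ready=True),
    Slot(thread_id=2, opcode="MUL", fetch_order=2, ready=True),
]
issued = dispatch(buffer, free_units=2)
```

Here the instructions for threads 1 and 2 issue together while the earlier-fetched instruction for thread 0 stays in the buffer.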

19 Claims

1. A method for executing a plurality of threads in a multithreaded processor, the method comprising:
- defining a plurality of threads, wherein each thread executes a sequence of program instructions and at least a subset of the plurality of threads are of different types;
  
  fetching a first instruction for a first one of the plurality of threads;
  
  fetching a second instruction for a second one of the plurality of threads, the second thread of a first type comprising a vertex thread type;
  
  issuing the first instruction, wherein the first instruction has a latency period associated therewith; and
  
  during the latency period associated with the first instruction, issuing the second instruction based at least in part on a priority ranking associated with the first type, wherein the first instruction and the second instruction are issued in an order independent of an order of fetching the first and second instructions, and wherein the second instruction is issued before one or more other instructions ready to issue for a longer duration than the second instruction, the one or more other instructions for a thread of a second type comprising a pixel thread type, wherein the pixel thread type is associated with a lower priority ranking than the priority ranking associated with the vertex thread type.
  - 2. The method of claim 1 wherein the first thread and the second thread execute different programs.
  - 3. The method of claim 1 wherein the first thread and the second thread execute the same program on different input data.
  - 4. The method of claim 3 wherein the first instruction and the second instruction are instructions from different portions of the same program.
  - 5. The method of claim 1 wherein the first instruction is issued to a first functional unit in the multithreaded processor and the second instruction is issued to a second functional unit in the multithreaded processor.
  - 6. The method of claim 1 further comprising:
    - during the latency period associated with the first instruction, issuing a third instruction comprising a selected one of the one or more other instructions, wherein the first instruction, the second instruction, and the third instruction are issued in an order independent of an order of instruction fetch.
  - 7. The method of claim 6 wherein the third instruction is an instruction for a third one of the plurality of threads.
  - 8. The method of claim 6 wherein the third instruction is an instruction for the first one of the plurality of threads.
  - 9. The method of claim 8 wherein the first instruction and the third instruction are issued during consecutive processing cycles.
  - 10. The method of claim 6 wherein the third instruction and the second instruction are issued in parallel.
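
The selection rule in claim 1 (a vertex-thread instruction issuing ahead of pixel-thread instructions that have been ready longer) might be sketched as follows. The numeric `PRIORITY` encoding, the `Ready` record, and the use of ready time as a tie-breaker within a type are assumptions made for illustration; the claim only requires that issue be based at least in part on the type's priority ranking.

```python
from dataclasses import dataclass

# Assumed encoding: a smaller number means a higher priority ranking.
PRIORITY = {"vertex": 0, "pixel": 1}

@dataclass
class Ready:
    """A ready-to-issue instruction (hypothetical structure)."""
    thread_id: int
    thread_type: str
    cycles_ready: int   # how long the instruction has been ready to issue

def select_next(ready_list):
    """Pick by thread-type priority first; age only breaks ties (assumption)."""
    return min(ready_list, key=lambda r: (PRIORITY[r.thread_type], -r.cycles_ready))

ready_list = [
    Ready(thread_id=0, thread_type="pixel", cycles_ready=5),   # ready longest
    Ready(thread_id=1, thread_type="vertex", cycles_ready=1),
    Ready(thread_id=2, thread_type="pixel", cycles_ready=3),
]
chosen = select_next(ready_list)
```

The vertex instruction for thread 1 is chosen even though both pixel instructions have been ready for more cycles.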

11. A method for executing a plurality of threads in a multithreaded processor, the method comprising:
- defining a plurality of threads, wherein each thread executes a sequence of program instructions and at least a subset of the plurality of threads are of different types;
  
  fetching a plurality of instructions, including:
  
  a first instruction for a first one of the plurality of threads, the first thread of a first type;
  
  a second instruction for a second one of the plurality of threads, the second thread of the first type comprising a vertex thread type; and
  
  a third instruction for a third one of the plurality of threads, the third thread of a second type comprising a pixel thread type, wherein the first instruction is fetched subsequently to the third instruction;
  
  issuing the first instruction to a first functional unit in the multithreaded processor prior to issuing the third instruction, based at least in part on a priority ranking associated with the first type; and
  
  in parallel with issuing the first instruction, issuing the second instruction to a second functional unit in the multithreaded processor, wherein the third instruction was ready to issue for a longer duration than the first instruction when the first instruction issued, and the pixel thread type is associated with a lower priority ranking than the priority ranking associated with the vertex thread type.
  - 12. The method of claim 11 wherein the first thread and the second thread execute different programs.
  - 13. The method of claim 11 wherein the first thread and the second thread execute a same program on different input data, and wherein the first instruction and the second instruction are different instructions from the same program.
  - 14. The method of claim 11 further comprising:
    - issuing the third instruction during a latency period associated with one of the first instruction or the second instruction.

15. A microprocessor configured for parallel processing of a plurality of threads, wherein each thread executes a sequence of program instructions, the microprocessor comprising:
- an execution module adapted to execute instructions for all of the plurality of threads, wherein at least a subset of the plurality of threads are of different types;
  
  a fetch circuit adapted to fetch instructions from a sequence of program instructions for each of the plurality of threads; and
  
  an issue circuit adapted to issue the instructions fetched by the fetch circuit to the execution module, wherein the instructions for different ones of the plurality of threads are issued in an order based at least in part on priority rankings based on respective thread types of the different threads and independent of an order in which the instructions for the different ones of the plurality of threads were fetched, wherein a pixel thread type is associated with a lower priority ranking than a priority ranking associated with a vertex thread type, the issue circuit being further adapted such that, during a latency period associated with a first issued instruction for a first one of the threads comprising a pixel thread, the issue circuit issues at least one instruction for a second one of the threads comprising a vertex thread.
  - 16. The microprocessor of claim 15 wherein the execution module includes a plurality of functional units and wherein the issue circuit is further adapted to issue at least two instructions in parallel, each of the instructions issued in parallel being directed to a different one of the functional units.
  - 17. The microprocessor of claim 16 wherein the issue circuit is further adapted such that each of the instructions issued in parallel is for a different one of the plurality of threads.
  - 18. The microprocessor of claim 16 wherein a maximum number of instructions issuable in parallel is less than the number of functional units in the execution module.
  - 19. The microprocessor of claim 15 wherein the fetch circuit is further adapted to fetch a subsequent instruction for a first thread in response to the issue circuit issuing a previously fetched instruction for the first thread.

Current Assignee: NVIDIA Corporation
Original Assignee: NVIDIA Corporation
Inventors: Moy, Simon S.; Lindholm, John Erik; Coon, Brett
Primary Examiner(s): Treat, William M
Application Number: US11/548,272
Publication Number: US 20070214343A1
Time in Patent Office: 1,246 Days
Field of Search: 345/613, 712/215, 712/216, 712/220, 718/100, 718/102
US Class Current: 712/220
CPC Class Codes:
G06F 9/3802   Instruction prefetching
G06F 9/3851   from multiple instruction s...
G06F 9/3888   controlled by a single inst...
