Handling cache misses by selectively flushing the pipeline
First Claim
1. A single chip multithreaded processor comprising:
- at least one processor core comprising a plurality of resources for forming a pipeline that generates one or more load instructions, said processor core comprising:
an instruction fetch unit for providing instructions to said pipeline, said instruction fetch unit comprising thread detection logic for monitoring the status of each thread in the pipeline; and
a cache unit for servicing load instructions from the pipeline;
wherein there are a plurality of threads in the pipeline simultaneously,
wherein said pipeline is arranged to switch from a first thread of the plurality of threads containing a first load instruction that misses the cache unit to a second thread, and, only when the thread detection logic detects a follow-on instruction in the pipeline from the first thread to the first load instruction, flush the first thread, and
wherein said instruction fetch unit places the first thread in a wait state without flushing the first thread when the first load instruction misses the cache unit and the thread detection logic detects no instructions in the pipeline that are from the first thread and subsequent to the first load instruction.
Abstract
An apparatus and method for efficiently managing data cache load misses is described in connection with a multithreaded, pipelined multiprocessor chip. A CMT processor keeps track of load misses for each thread by issuing a load miss signal each time a load instruction to the data cache misses. Detection logic in the IFU responds to the load miss signal by determining whether a valid instruction from the thread is at any of the pipeline stages. If no instructions from the thread are detected in the pipeline, no flush is required and the thread is placed in a wait state until the requested data is returned from higher-order memory. If any instruction from the thread is detected in the pipeline, the thread is flushed and the instruction is re-fetched.
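The flush-or-wait decision the abstract describes reduces to a scan of the pipeline stages for a younger instruction from the missing thread. The patent discloses no source code, so the sketch below is a hypothetical model: `StageEntry`, `handle_load_miss`, and the sequence-number representation are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class StageEntry:
    """One instruction occupying a pipeline stage (hypothetical model)."""
    thread_id: int
    seq: int          # program-order position within its thread
    valid: bool = True

def handle_load_miss(stages, thread_id, load_seq):
    """On a data-cache load miss, return "flush" if any valid instruction
    from the missing thread younger than the load still occupies a
    pipeline stage; otherwise return "wait" (park the thread, no flush)."""
    follow_on = any(
        e.valid and e.thread_id == thread_id and e.seq > load_seq
        for e in stages
    )
    return "flush" if follow_on else "wait"

# Thread 0's load at seq 4 misses while its seq-5 instruction is in flight:
stages = [StageEntry(0, 5), StageEntry(1, 9)]
print(handle_load_miss(stages, 0, 4))   # follow-on present -> "flush"
print(handle_load_miss(stages, 0, 7))   # no follow-on     -> "wait"
```

The key point of the scheme is the second case: when the load is the youngest instruction the thread has in flight, nothing architecturally wrong can happen, so the expensive flush-and-refetch is skipped.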
21 Claims
1. A single chip multithreaded processor comprising:
at least one processor core comprising a plurality of resources for forming a pipeline that generates one or more load instructions, said processor core comprising:
an instruction fetch unit for providing instructions to said pipeline, said instruction fetch unit comprising thread detection logic for monitoring the status of each thread in the pipeline; and
a cache unit for servicing load instructions from the pipeline;
wherein there are a plurality of threads in the pipeline simultaneously,
wherein said pipeline is arranged to switch from a first thread of the plurality of threads containing a first load instruction that misses the cache unit to a second thread, and, only when the thread detection logic detects a follow-on instruction in the pipeline from the first thread to the first load instruction, flush the first thread, and
wherein said instruction fetch unit places the first thread in a wait state without flushing the first thread when the first load instruction misses the cache unit and the thread detection logic detects no instructions in the pipeline that are from the first thread and subsequent to the first load instruction.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 18, 19)
9. A processor system comprising at least one pipelined processing element and a cache memory, said pipelined processing element comprising:
a plurality of pipeline stages that generates load instructions to the cache memory for at least a first thread;
a memory controller for generating a cache miss signal when a load instruction in the first thread misses the cache memory; and
means for switching from the first thread to a second thread in response to the cache miss signal and, only if an instruction after the load instruction in the first thread is at any of the plurality of pipeline stages, flushing instructions of the first thread from the plurality of pipeline stages,
wherein there are a plurality of threads in the plurality of pipeline stages simultaneously, and
wherein the means for flushing suppresses the flushing of the first thread from the plurality of stages in response to the cache miss signal if no instruction after the load instruction in the first thread is detected at any of the plurality of pipeline stages.
- View Dependent Claims (10, 11, 12, 13, 14, 15, 20)
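Claim 9's flush-with-suppression behavior can be sketched as a single operation over the stage entries. As before, this is a hypothetical model: the `flush_first_thread` name, the valid-bit representation, and the return value are assumptions, not the patent's disclosure.

```python
from dataclasses import dataclass

@dataclass
class StageEntry:
    """One instruction occupying a pipeline stage (hypothetical model)."""
    thread_id: int
    seq: int          # program-order position within its thread
    valid: bool = True

def flush_first_thread(stages, thread_id, load_seq):
    """Flush the first thread's instructions that follow the missing load,
    but suppress the flush entirely if no such instruction occupies any
    pipeline stage. Returns True iff a flush occurred."""
    younger = [e for e in stages
               if e.valid and e.thread_id == thread_id and e.seq > load_seq]
    if not younger:
        return False            # flush suppressed: thread will simply wait
    for e in younger:
        e.valid = False         # invalidated; to be re-fetched later
    return True
```

Note that the load itself (and anything older) is left in place: only instructions *after* the load are flushed, matching the claim language.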
16. In a multithreaded processor comprising a cache memory and an instruction fetch unit for issuing instructions to a pipeline, a method for handling cache misses, comprising:
processing a plurality of threads with a pipelined processor core comprising a plurality of pipeline stages;
issuing a memory request from a first thread to the cache memory;
issuing a cache miss signal if the memory request from the first thread misses the cache memory;
in response to the cache miss signal, switching to a second thread and, only when one or more of the plurality of pipeline stages contains an instruction from the first thread that is subsequent to the memory request, flushing from the plurality of pipeline stages all instructions in the first thread that are subsequent to the memory request; and
entering the first thread into a wait state and suppressing the flushing of the first thread from the plurality of pipeline stages, when none of the plurality of pipeline stages contains an instruction from the first thread subsequent to the memory request, wherein there are a plurality of threads in the plurality of pipeline stages simultaneously.
- View Dependent Claims (17, 21)
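The method steps of claim 16 can be walked end to end with a small model. Stage entries are represented as `(thread_id, seq)` tuples, and every name here (`on_cache_miss`, the `thread_state` dictionary, the `"running"` key) is an illustrative assumption rather than anything the patent specifies.

```python
def on_cache_miss(stages, first_tid, second_tid, load_seq, thread_state):
    """Model of the claimed method: on a cache miss signal, switch to the
    second thread, then either flush the first thread's instructions that
    are subsequent to the missing memory request or, when none are
    present, enter the first thread into a wait state without flushing."""
    thread_state["running"] = second_tid                  # thread switch
    younger = [(t, s) for (t, s) in stages
               if t == first_tid and s > load_seq]
    if younger:
        # Follow-on instructions present: flush them for later re-fetch.
        stages[:] = [e for e in stages if e not in younger]
        thread_state[first_tid] = "flushed"
    else:
        # No follow-on instructions: suppress the flush, just wait.
        thread_state[first_tid] = "wait"
    return stages, thread_state

# Thread 0's request at seq 3 misses with its seq-4 instruction in flight:
stages, state = on_cache_miss([(0, 3), (0, 4), (1, 7)], 0, 1, 3, {})
# (0, 4) is flushed; the missing load (0, 3) and thread 1's work remain.
```

The wait-state branch is what distinguishes this scheme from an unconditional flush-on-miss design: when the first thread has nothing younger in flight, the pipeline's other threads continue undisturbed and the stalled thread resumes as soon as the data returns.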
Specification