Multiple-thread processor with single-thread interface shared among threads
Abstract
A processor includes logic for tagging a thread identifier (TID) for usage with processor blocks that are not stalled. Pertinent non-stalling blocks include caches, translation look-aside buffers (TLB), a load buffer asynchronous interface, an external memory management unit (MMU) interface, and others. A processor includes a cache that is segregated into a plurality of N cache parts. Cache segregation avoids interference, “pollution”, or “cross-talk” between threads. One technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits. The cache utilizes cache indexing logic. For example, the TID bits can be inserted at the most significant bits of the cache index.
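As an illustration of the last sentence of the abstract, the following minimal sketch places the TID bits in the most significant bits of a cache set index. The function name, the 32-byte line size, the 9-bit index, and the single TID bit are assumptions chosen for illustration and are not taken from the patent.

```python
def segregated_cache_index(address: int, tid: int, *, line_bits: int = 5,
                           index_bits: int = 9, tid_bits: int = 1) -> int:
    """Return a set index whose top tid_bits come from the thread identifier.

    Hypothetical parameters: 2**line_bits bytes per line, 2**index_bits sets,
    and 2**tid_bits cache parts (one part per thread).
    """
    if not 0 <= tid < (1 << tid_bits):
        raise ValueError("tid does not fit in tid_bits")
    # The remaining low-order index bits still come from the address.
    addr_index_bits = index_bits - tid_bits
    addr_part = (address >> line_bits) & ((1 << addr_index_bits) - 1)
    # TID occupies the most significant bits, so each thread sees its own part.
    return (tid << addr_index_bits) | addr_part
```

With these assumed sizes, two threads that access the same address are mapped to different halves of the set array, which is one way to obtain the segregation the abstract describes.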
24 Claims
-
1. A processor comprising:
-
a multiple-thread execution pipeline including a plurality of functional units allocated to an execution thread of a plurality of execution threads; and
a single-thread interface coupled to the plurality of processing units, the single-thread interface being shared among threads and maintaining thread compatibility by physical duplication of structures and by verifying communication status after thread transfer, wherein:
the multiple-thread execution pipeline includes a plurality of pulse-based high-speed multiple-bits flip-flops, the pulse-based high-speed multiple-bits flip-flops having a latch structure coupled to a plurality of select-bus lines, the select-bus lines selecting data in the pulse-based high-speed multiple-bits flip-flops corresponding to an active thread from among the plurality of execution threads.
-
2. A processor according to claim 1 wherein:
the single-thread interface includes a load buffer and a store buffer that maintain compatibility with multiple threads so that, on a thread switch, the single-thread interface receives a new thread and maintains the state of a shared structure in a manner that is compatible with the replaced thread.
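The select-bus behaviour recited for the multiple-bits flip-flops in claim 1 can be pictured with a small behavioural model. This is only a sketch under stated assumptions: the class name, the one-hot encoding of the select lines, and the one-storage-bit-per-thread layout are hypothetical, and the model ignores pulse timing entirely.

```python
from typing import List


class MultiBitFlipFlop:
    """Behavioural model: one storage bit per thread, chosen by select lines."""

    def __init__(self, num_threads: int = 2) -> None:
        self.bits: List[int] = [0] * num_threads

    def _active(self, select: List[int]) -> int:
        # Assumed encoding: exactly one select line is 1 (one-hot).
        expected = [0] * (len(self.bits) - 1) + [1]
        if len(select) != len(self.bits) or sorted(select) != expected:
            raise ValueError("select lines must be one-hot, one line per thread")
        return select.index(1)

    def write(self, select: List[int], value: int) -> None:
        # Only the bit belonging to the active thread is updated.
        self.bits[self._active(select)] = value & 1

    def read(self, select: List[int]) -> int:
        # Only the bit belonging to the active thread is visible.
        return self.bits[self._active(select)]
```

In this model a thread switch amounts to driving a different select line; the data held for the inactive thread stays in place and is not disturbed.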
-
-
3. A processor according to claim 1 wherein:
the single-thread interface includes a load buffer and a store buffer that maintain compatibility with multiple threads by checking read-after-write status of the load buffer and the store buffer.
-
4. A processor according to claim 1 wherein:
the single-thread interface includes a load buffer and a store buffer that maintain compatibility with multiple threads by checking load operations against contents of a store buffer in an alternative thread so that read-after-write status information is stored and augmented to store results of read-after-write checks against content of all store buffers.
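The read-after-write check of claim 4 can be sketched as follows. The `StoreBuffer` class, the `raw_status` function, and the representation of the status as one hit flag per store buffer are assumptions for illustration; the claim does not specify a data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoreBuffer:
    """Pending stores for one thread (hypothetical model): address -> value."""
    pending: Dict[int, int] = field(default_factory=dict)


def raw_status(load_addr: int, store_buffers: List[StoreBuffer]) -> List[bool]:
    """Return one read-after-write hit flag per store buffer.

    The load is checked against every thread's store buffer, not only the
    buffer of the thread that issued it, so the result holds one entry for
    each buffer.
    """
    return [load_addr in sb.pending for sb in store_buffers]
```

For example, with one buffer holding a store to 0x40 and another holding a store to 0x80, a load from 0x80 reports a hit only against the second buffer.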
-
5. A processor according to claim 1 wherein:
the single-thread interface identifies a tag using a thread identifier (TID) tag.
-
6. A processor according to claim 1 wherein:
the single-thread interface is selected from among devices including caches, translation look-aside buffers, load buffer asynchronous interfaces, store buffer asynchronous interfaces, and memory management units.
-
7. A processor according to claim 1 wherein:
the single-thread interface is selected from among non-stalling devices including caches, translation look-aside buffers, load buffer asynchronous interfaces, store buffer asynchronous interfaces, and memory management units.
-
8. A processor according to claim 1 further comprising:
a plurality of multiple-thread execution pipelines and a single-thread interface integrated onto a single integrated-circuit chip.
-
9. A processor comprising:
-
a multiple-thread execution pipeline including a plurality of execution pathways respectively allocated to a plurality of execution threads; and
a single-pathway component coupled to the multiple-thread execution pathways so that the plurality of execution pathways converge into the single-pathway of the single-pathway component, the single-pathway component being a non-stalling component, wherein:
the multiple-thread execution pipeline includes a plurality of pulse-based high-speed multiple-bits flip-flops, the pulse-based high-speed multiple-bits flip-flops having a latch structure coupled to a plurality of select-bus lines, the select-bus lines selecting data in the pulse-based high-speed multiple-bits flip-flops corresponding to an active thread from among the plurality of execution threads.
-
10. A processor according to claim 9 wherein:
the single-pathway component is shared among a plurality of threads, the single-pathway component maintaining compatibility among threads by physical duplication of structures and by verifying communication status after transfer of a thread.
-
-
11. A processor according to claim 9 wherein:
the single-pathway component identifies a tag using a thread identifier (TID) tag.
-
12. A processor according to claim 9 wherein:
the single-pathway component is selected from among devices including caches, translation look-aside buffers, load buffer asynchronous interfaces, store buffer asynchronous interfaces, and memory management units.
-
13. A processor according to claim 9 further comprising:
a plurality of multiple-thread execution pipelines and the single-pathway component integrated onto a single integrated-circuit chip.
-
14. A processor comprising:
-
a multiple-thread execution pipeline including a plurality of execution pathways respectively allocated to a plurality of execution threads; and
a single-thread cache coupled to the multiple-thread execution pipeline so that the plurality of execution pathways converge into the single-thread of the cache, the single-thread cache being shared among threads and maintaining thread compatibility by segregation of the cache into N parts, wherein:
the multiple-thread execution pipeline includes a plurality of pulse-based high-speed multiple-bits flip-flops, the pulse-based high-speed multiple-bits flip-flops having a latch structure coupled to a plurality of select-bus lines, the select-bus lines selecting data in the pulse-based high-speed multiple-bits flip-flops corresponding to an active thread from among the plurality of execution threads.
-
15. A processor according to claim 14 wherein:
cache segregation separates the cache into N independent parts that are allocated to threads to avoid pollution, “cross-talk”, and interference between threads.
-
-
16. A processor according to claim 14 wherein the cache includes:
a cache index that allocates the threads into the N independent cache parts.
-
17. A processor according to claim 16 wherein:
the cache index includes a bit field allocated to received thread identification (TID) bits indicative of a part of the N parts of the segregated cache.
-
18. A processor according to claim 14 further comprising:
a thread switch logic coupled to the multiple-thread execution pipeline and coupled to the cache, the thread switch logic controlling thread selection and generating a thread identifier (TID) indicative of the selected thread.
-
19. A processor according to claim 14 further comprising:
-
a thread switch logic coupled to the multiple-thread execution pipeline and coupled to the cache, the thread switch logic controlling thread selection and generating a thread identifier (TID) indicative of the selected thread; and
a thread control logic coupled to the thread switch logic and supporting lightweight processes and native threads, the thread control logic disabling thread ID tagging and disabling cache segregation for lightweight processes and native threads that share a single virtual tag space.
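One way to picture the thread control logic of claim 19 is as a gate on the TID tag: when threads share a single virtual tag space, no tag is applied and the cache is indexed as a single unsegregated structure. The function names, the use of `None` to mean "tagging disabled", and the cache dimensions below are assumptions made for this sketch.

```python
from typing import Optional


def tid_tag(selected_tid: int, shares_tag_space: bool) -> Optional[int]:
    """Return the TID tag, or None when tagging is disabled.

    Assumption: lightweight processes and native threads that share one
    virtual tag space are flagged with shares_tag_space=True.
    """
    return None if shares_tag_space else selected_tid


def cache_index(address: int, tag: Optional[int], *, line_bits: int = 5,
                index_bits: int = 9, tid_bits: int = 1) -> int:
    """Index the cache, segregated by TID only when a tag is present."""
    if tag is None:
        # Segregation disabled: all index bits come from the address.
        return (address >> line_bits) & ((1 << index_bits) - 1)
    low_bits = index_bits - tid_bits
    addr_part = (address >> line_bits) & ((1 << low_bits) - 1)
    # Segregation enabled: the tag selects one of the cache parts.
    return (tag << low_bits) | addr_part
```

Under these assumptions, threads sharing a tag space can reach every set in the cache, while tagged threads are confined to their own part.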
-
-
20. A processor according to claim 14 further comprising:
a plurality of multiple-thread execution pipelines and the cache integrated onto a single integrated-circuit chip.
-
21. A processor comprising:
-
a multiple-thread execution pipeline including a plurality of execution pathways respectively allocated to a plurality of execution threads; and
a non-stalling component coupled to the multiple-thread execution pathways so that the plurality of execution pathways converge into a single-pathway including the non-stalling component, wherein:
the multiple-thread execution pipeline includes a plurality of pulse-based high-speed multiple-bits flip-flops, the pulse-based high-speed multiple-bits flip-flops having a latch structure coupled to a plurality of select-bus lines, the select-bus lines selecting data in the pulse-based high-speed multiple-bits flip-flops corresponding to an active thread from among the plurality of execution threads.
-
22. A processor according to claim 21 wherein:
the non-stalling component is selected from non-stalling components including caches, translation look-aside buffers (TLBs), load buffer asynchronous interfaces, and external MMU interfaces.
-
-
23. A processor according to claim 21 further comprising:
thread tagging logic coupled to the non-stalling component, the thread tagging logic for setting a thread identifier (TID) tag identifying threads in the non-stalling component.
-
24. A processor according to claim 21 further comprising:
a plurality of multiple-thread execution pipelines and the shared components integrated onto a single integrated-circuit chip.
Specification