LOAD SYNCHRONIZATION WITH STREAMING THREAD COHORTS
First Claim
1. A processor implemented method for controlling a lock-stepped cohort, the method comprising:
receiving instructions for each of a first lane and a second lane, the first lane for the lock-stepped cohort and the second lane for another cohort;
detecting a condition in which a first instruction at the first lane will have a higher latency than a second instruction at the second lane;
setting an indicator indicating where the first lane encountered the first instruction;
setting the first lane to inactive, while keeping the second lane active; and
setting the first lane to active on a subsequent opportunity to execute said first instruction.
Abstract
There is provided a processor implemented method for controlling a lock-stepped cohort. The method includes receiving instructions for each of a first lane and a second lane. The first lane is for the lock-stepped cohort and the second lane is for another cohort. The method further includes detecting a condition in which a first instruction at the first lane will have a higher latency than a second instruction at the second lane. The method also includes setting an indicator indicating where the first lane encountered the first instruction. The method additionally includes setting the first lane to inactive, while keeping the second lane active. The method further includes setting the first lane to active on a subsequent opportunity to execute said first instruction.
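The sequence in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the `Lane` class, the `(opcode, latency)` instruction tuples, and the rule that the "subsequent opportunity" is simply the next scheduling step are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Lane:
    """Hypothetical lane model; each instruction is (opcode, latency_in_cycles)."""
    program: list
    pc: int = 0
    active: bool = True
    indicator: Optional[int] = None   # where the lane met the slow instruction
    executed: list = field(default_factory=list)

    def done(self) -> bool:
        return self.pc >= len(self.program)

    def execute(self) -> None:
        self.executed.append(self.program[self.pc][0])
        self.pc += 1


def step(first: Lane, second: Lane) -> None:
    """Advance both lanes by one scheduling step."""
    if not first.active:
        # Assumption: the next step is the "subsequent opportunity".
        # Reactivate the first lane and execute the instruction it parked on.
        assert first.indicator == first.pc
        first.active = True
        first.indicator = None
        first.execute()
        if not second.done():
            second.execute()
        return

    if not first.done() and not second.done():
        first_latency = first.program[first.pc][1]
        second_latency = second.program[second.pc][1]
        if first_latency > second_latency:
            # Detected condition: record where the first lane stopped,
            # deactivate it, and keep the second lane running.
            first.indicator = first.pc
            first.active = False
            second.execute()
            return

    # No latency mismatch (or one lane has finished): run whatever remains.
    for lane in (first, second):
        if not lane.done():
            lane.execute()


def run(first: Lane, second: Lane) -> list:
    """Run to completion; log (first.active, first.indicator) after each step."""
    log = []
    while not (first.done() and second.done()):
        step(first, second)
        log.append((first.active, first.indicator))
    return log


if __name__ == "__main__":
    a = Lane([("add", 1), ("load", 10), ("mul", 1)])
    b = Lane([("add", 1)] * 4)
    for entry in run(a, b):
        print(entry)
```

In this model the first lane is parked for exactly one step when it meets the long-latency `load`, while the second lane keeps issuing instructions. A real scheduler would presumably choose the reactivation point by other criteria.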
20 Claims
1. A processor implemented method for controlling a lock-stepped cohort, the method comprising:
receiving instructions for each of a first lane and a second lane, the first lane for the lock-stepped cohort and the second lane for another cohort;
detecting a condition in which a first instruction at the first lane will have a higher latency than a second instruction at the second lane;
setting an indicator indicating where the first lane encountered the first instruction;
setting the first lane to inactive, while keeping the second lane active; and
setting the first lane to active on a subsequent opportunity to execute said first instruction.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus for controlling a lock-stepped cohort, the apparatus comprising:
a processor for receiving instructions for each of a first lane and a second lane, for detecting a condition in which a first instruction at the first lane will have a higher latency than a second instruction at the second lane, for setting an indicator indicating where the first lane encountered the first instruction, for setting the first lane to inactive, while keeping the second lane active, and for setting said first lane to active on a subsequent opportunity to execute said first instruction, wherein the first lane is for the lock-stepped cohort and the second lane is for another cohort.
Dependent claims: 9, 10, 11, 12
13. A processor implemented method for improving performance through data pre-loading for first and second lock-stepped cohort members of which only the first member needs to load data on demand and the second member pre-loads data in lock-step with the first member, the method comprising:
setting a shared flag if any of the members need to issue a demand-load for a next buffer;
issuing the demand-load by both of the members, when the shared flag is set by both of the members;
examining whether a buffer and the data of the second member are ready for a pre-load, when the shared flag is set by only the first member;
reading the data by the second member in lock step with the first member, when the shared flag is set and the pre-load is deemed possible;
issuing no operation by the second member and issuing the demand-load by the first member, when the shared flag is set and the pre-load is deemed not possible;
resetting the shared flag, when the demand-load is completed; and
setting a status bit to indicate the buffer for the second member has been pre-loaded, when the pre-load is complete.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
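The decision logic recited in claim 13 can be sketched as a single function. This is an illustrative model under assumptions, not the claimed implementation: the function name `load_step`, the action strings, and the representation of the shared flag as a set of member names are hypothetical. The case where only the second member sets the flag is not spelled out in the claim, so the behaviour shown for it is an assumption.

```python
def load_step(first_needs: bool, second_needs: bool,
              buffer_free: bool, data_ready: bool):
    """Return (first_action, second_action, preloaded_status_bit).

    Hypothetical model: the shared flag is the set of members that set it.
    """
    shared_flag = set()
    if first_needs:
        shared_flag.add("first")
    if second_needs:
        shared_flag.add("second")

    preloaded = False  # status bit for the second member's buffer

    if not shared_flag:
        # Nobody needs a new buffer; both members keep computing.
        return ("compute", "compute", preloaded)

    if shared_flag == {"first", "second"}:
        # Flag set by both members: both issue the demand-load.
        actions = ("demand_load", "demand_load")
    elif shared_flag == {"first"}:
        # Flag set by only the first member: check whether the second
        # member's buffer and data are ready for a pre-load.
        if buffer_free and data_ready:
            actions = ("demand_load", "pre_load")
            preloaded = True   # set once the pre-load completes
        else:
            actions = ("demand_load", "nop")
    else:
        # ASSUMPTION: not covered by the claim text. Here the second
        # member demand-loads and the first member idles.
        actions = ("nop", "demand_load")

    # The demand-load is treated as complete here, so the flag is reset.
    shared_flag.clear()
    return (actions[0], actions[1], preloaded)
```

For example, `load_step(True, False, True, True)` models the case where the first member stalls on a demand-load and the second member uses the same slot to pre-load its next buffer instead of idling.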
Specification