Relaxed memory consistency model
Abstract
A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also described is a global synchronization (Gsync) instruction that provides synchronization even outside a single MSP system. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor must the system refrain from determining addresses for subsequent memory accesses and inserting them into the pipeline.
16 Claims
1. A computer processing method comprising:
providing a computer system having a shared memory and a multistream processor (MSP), wherein the MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the plurality of SSPs is operatively coupled to the memory;
defining program order between operations on the first SSP;
defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
maintaining a minimal guarantee on the ordering using an active list located in the scalar section, wherein maintaining includes:
placing each instruction in order in the active list, wherein placing includes initializing each instruction to a speculative status;
determining if the speculative status instruction is branch speculative or trap speculative;
if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
checking to see if all vector operands for the scalar committed status instruction are present;
if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
checking to see if all instructions previous to the committed status instruction are completed; and
if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status; and
maintaining memory consistency between multiple vector memory references and between vector and scalar memory references by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.
Dependent claims: 2, 3, 4.
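The status progression recited in claim 1 (speculative, scalar committed, committed, graduated) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the names `Status`, `Entry`, `ActiveList`, `place`, and `step` are hypothetical, operand presence and completion are modeled as plain boolean flags set by the caller, and each call to `step` advances an entry by at most one status.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Status(Enum):
    SPECULATIVE = auto()
    SCALAR_COMMITTED = auto()
    COMMITTED = auto()
    GRADUATED = auto()


@dataclass
class Entry:
    # All fields are illustrative assumptions; the claim names only the statuses.
    name: str
    status: Status = Status.SPECULATIVE
    branch_speculative: bool = False
    trap_speculative: bool = False
    scalar_operands_ready: bool = False
    vector_operands_ready: bool = False
    completed: bool = False


class ActiveList:
    def __init__(self):
        self.entries = []   # instructions held in program order
        self.notices = []   # scalar commitment notices issued to the vector sections

    def place(self, entry):
        # Every instruction enters the list in order with speculative status.
        entry.status = Status.SPECULATIVE
        self.entries.append(entry)
        return entry

    def step(self):
        # Advance each entry by at most one status per call.
        for i, e in enumerate(self.entries):
            if e.status is Status.SPECULATIVE:
                resolved = not (e.branch_speculative or e.trap_speculative)
                if resolved and e.scalar_operands_ready:
                    e.status = Status.SCALAR_COMMITTED
                    self.notices.append(e.name)
            elif e.status is Status.SCALAR_COMMITTED:
                if e.vector_operands_ready:
                    e.status = Status.COMMITTED
            elif e.status is Status.COMMITTED:
                # Graduate only when every earlier instruction has completed.
                if all(p.completed for p in self.entries[:i]):
                    e.status = Status.GRADUATED
```

In this model an instruction that is still branch speculative or trap speculative never leaves the speculative status, and a committed instruction waits in the list until all earlier entries are flagged as completed.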
5. An apparatus comprising:
a shared memory;
one or more multistream processors (MSPs), wherein each MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the SSPs is operatively coupled to the memory;
means for defining program order between operations on the first SSP;
means for defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
means for maintaining a minimal guarantee on the ordering on the first SSP using an active list located in the scalar section, wherein means for maintaining includes:
means for placing each instruction in order in the active list, wherein means for placing includes means for initializing each instruction to a speculative status;
means for determining if the speculative status instruction is branch speculative or trap speculative;
means for, if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
means for, if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
means for checking to see if all vector operands for the scalar committed status instruction are present;
means for, if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
means for checking to see if all instructions previous to the committed status instruction are completed; and
means for, if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status; and
means for maintaining memory consistency between the plurality of single stream processors (SSPs) by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.
Dependent claims: 6, 7, 8.
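The final limitation of this claim, that no vector store reference is sent to memory ahead of a scalar or vector load that is earlier in program order, can be expressed as a simple check. The sketch below is illustrative only: `MemRef`, its `kind` strings, the `sent` flag, and `may_send` are hypothetical names, and the model constrains vector stores alone because that is the only guarantee the claim language states.

```python
from dataclasses import dataclass


@dataclass
class MemRef:
    # kind is one of: "scalar_load", "vector_load", "scalar_store", "vector_store"
    kind: str
    sent: bool = False   # True once the reference has been sent to memory


def may_send(program, i):
    """Return True if program[i] may be sent to memory now.

    `program` is a list of MemRef objects in program order.
    """
    ref = program[i]
    if ref.kind != "vector_store":
        return True   # this sketch places no constraint on other references
    earlier_loads = (r for r in program[:i]
                     if r.kind in ("scalar_load", "vector_load"))
    # A vector store must wait until every earlier load has been sent.
    return all(r.sent for r in earlier_loads)
```

For example, with a scalar load at index 0 and a vector store at index 1, `may_send(program, 1)` stays false until the load's `sent` flag is set.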
9. A computer processing method comprising:
providing a memory having a plurality of addressable locations;
providing a multistream processor (MSP), wherein the MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the plurality of SSPs is operatively coupled to the memory;
defining program order between operations on the first SSP;
defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
serializing writes to any given one of the plurality of addressable locations of memory in the order using an active list in the scalar section, wherein serializing includes:
placing each instruction in order in the active list, wherein placing includes initializing each instruction to a speculative status;
determining if the speculative status instruction is branch speculative or trap speculative;
if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
checking to see if all vector operands for the scalar committed status instruction are present;
if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
checking to see if all instructions previous to the committed status instruction are completed; and
if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status;
making a write globally visible when no one of the plurality of SSPs can read the value produced by an earlier write in a sequential order of writes to that location;
preventing an SSP from reading a value written by another MSP before that value becomes globally visible; and
performing memory consistency between the plurality of single stream processors (SSPs) by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.
Dependent claims: 10, 11, 12.
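The definition of global visibility used in this claim (a write is globally visible once no SSP can still read the value of an earlier write to the same location) can be modeled for a single location. This is a rough sketch under assumptions that are not in the source: the class `Location`, the per-SSP `seen` index, and the `propagate` step are hypothetical, and the sketch does not model MSP boundaries, so it does not enforce the separate limitation about values written by another MSP.

```python
class Location:
    """One addressable location with a serialized order of writes."""

    def __init__(self, num_ssps, initial=0):
        self.writes = [initial]       # values in their serialized write order
        self.seen = [0] * num_ssps    # index of the newest write each SSP observes

    def write(self, value):
        # Writes to one location are serialized: each gets the next index.
        self.writes.append(value)
        return len(self.writes) - 1

    def propagate(self, ssp, index):
        # An SSP's view only moves forward in the write order.
        self.seen[ssp] = max(self.seen[ssp], index)

    def read(self, ssp):
        return self.writes[self.seen[ssp]]

    def globally_visible(self, index):
        # Visible once no SSP can still read the value of an earlier write.
        return all(s >= index for s in self.seen)
```

With two SSPs, a write becomes globally visible in this model only after it has been propagated to both of them; until then at least one SSP can still read the older value.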
13. An apparatus comprising:
a memory having a plurality of addressable locations;
one or more multistream processors (MSPs), wherein each MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the SSPs is operatively coupled to the memory;
means for defining program order between operations on the first SSP;
means for defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
means for serializing writes to any given one of the plurality of addressable locations of memory in the order using an active list in the scalar section, wherein means for serializing includes:
means for placing each instruction in order in the active list, wherein means for placing includes means for initializing each instruction to a speculative status;
means for determining if the speculative status instruction is branch speculative or trap speculative;
means for, if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
means for, if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
means for checking to see if all vector operands for the scalar committed status instruction are present;
means for, if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
means for checking to see if all instructions previous to the committed status instruction are completed; and
means for, if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status;
means for making a write globally visible when no one of the plurality of SSPs can read the value produced by an earlier write in a sequential order of writes to that location;
means for preventing an SSP from reading a value written by another MSP before that value becomes globally visible; and
means for performing memory consistency between the plurality of single stream processors (SSPs) by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.
Dependent claims: 14, 15, 16.
Specification