Mechanism for reducing latency of memory barrier operations on a multiprocessor system
First Claim
1. A method for reducing the latency of a memory barrier (MB) operation used to impose an inter-reference order between sets of memory reference operations issued by a first processor to a multiprocessor system having a plurality of processors and a shared memory interconnected by a system control logic, the method comprising:
- issuing a first set of memory reference operations from the first processor to the system control logic;
- issuing the MB operation from the first processor to the system control logic immediately after issuing the first set of memory reference operations without waiting for responses to the first set of memory reference operations to arrive at the first processor;
- ordering the first set of memory reference operations with respect to other memory reference operations issued by other processors of the system at an ordering point of a switch;
- generating probe and invalidate packets for the ordered first set of memory reference operations at the ordering point;
- loading the probe and invalidate packets into probe queues of the first and other processors for transmission to those processors;
- ordering the MB operation at the ordering point after ordering of the first set of memory reference operations;
- generating a MB acknowledgment (MB-Ack) in response to the ordered MB operation; and
- loading the MB-Ack into the probe queue of the first processor for transmission to the first processor, the loaded MB-Ack pulling-in all previously ordered invalidate and probe commands in the probe queue of the first processor.
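The switch-side steps of claim 1 can be sketched as a small Python model. This is an illustrative sketch only, not the patented hardware: the class `OrderingPoint`, its methods, the addresses, and the rule for which processor receives a probe or an invalidate are all hypothetical, and the per-processor probe queues are assumed to be simple FIFOs.

```python
from collections import deque

MB_ACK = "MB-Ack"


class OrderingPoint:
    """Toy model of the switch's ordering point (all names are hypothetical)."""

    def __init__(self, num_processors):
        self.order = []  # global serialization order of references and MBs
        # One FIFO probe queue per processor (assumption: strict FIFO delivery).
        self.probe_queues = [deque() for _ in range(num_processors)]

    def order_reference(self, issuer, addr, packets):
        """Serialize one memory reference and load its packets.

        `packets` is a list of (target_processor, kind) pairs, where kind is
        "probe" or "invalidate". How the targets are chosen is outside this
        sketch, so the caller supplies them.
        """
        self.order.append(("REF", issuer, addr))
        for target, kind in packets:
            self.probe_queues[target].append((kind, addr))

    def order_mb(self, issuer):
        """Serialize the MB and load an MB-Ack into the issuer's probe queue."""
        self.order.append(("MB", issuer, None))
        self.probe_queues[issuer].append((MB_ACK, None))


switch = OrderingPoint(num_processors=2)

# Processor 0 issues its pre-MB references and the MB back to back;
# a reference from processor 1 happens to be ordered in between.
switch.order_reference(0, 0x100, [(1, "invalidate")])
switch.order_reference(1, 0x200, [(0, "invalidate")])
switch.order_reference(0, 0x300, [(0, "probe"), (1, "probe")])
switch.order_mb(0)

# Snapshot of processor 0's probe queue: the MB-Ack sits behind every
# packet that was ordered before the MB.
p0_queue = list(switch.probe_queues[0])
```

Because the MB-Ack is appended to the same FIFO as the earlier packets, it cannot reach processor 0 before them, which is one way to read the "pulling-in" language of the final step.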
Abstract
A technique reduces the latency of a memory barrier (MB) operation used to impose an inter-reference order between sets of memory reference operations issued by a processor to a multiprocessor system having a shared memory. The technique comprises issuing the MB operation immediately after issuing a first set of memory reference operations (i.e., the pre-MB operations) without waiting for responses to those pre-MB operations. Issuance of the MB operation to the system results in serialization of that operation and generation of an MB Acknowledgment (MB-Ack) command. The MB-Ack is loaded into a probe queue of the issuing processor and, according to the invention, functions to pull in all previously ordered invalidate and probe commands in that queue. By ensuring that the probes and invalidates are ordered before the MB-Ack is received at the issuing processor, the inventive technique provides the appearance that all pre-MB references have completed.
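The processor-side effect described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual processor logic: the `Processor` class, its fields, the addresses, and the modelling of the cache as a set of valid line addresses are all hypothetical, and the probe queue is assumed to be consumed strictly in FIFO order.

```python
from collections import deque

MB_ACK = "MB-Ack"


class Processor:
    """Toy issuing processor (hypothetical model, not the real hardware)."""

    def __init__(self, cached_lines):
        self.cache = set(cached_lines)  # addresses of currently valid lines
        self.probe_queue = deque()      # filled by the system in serialization order
        self.mb_pending = False

    def issue_mb(self):
        # The MB is sent without waiting for responses to earlier references.
        self.mb_pending = True

    def step(self):
        """Consume one probe-queue entry in FIFO order and return its kind."""
        kind, addr = self.probe_queue.popleft()
        if kind == "invalidate":
            self.cache.discard(addr)    # drop the stale copy
        elif kind == MB_ACK:
            self.mb_pending = False     # the MB may now retire
        # A "probe" would forward data to a requester; omitted in this sketch.
        return kind


cpu = Processor(cached_lines={0x100, 0x200})
cpu.issue_mb()

# Packets ordered before the MB sit ahead of the MB-Ack in the queue.
cpu.probe_queue.extend([("invalidate", 0x100), ("probe", 0x200), (MB_ACK, None)])

handled = []
while cpu.mb_pending:
    handled.append(cpu.step())
```

In this model the stale line at `0x100` is gone from the cache before `mb_pending` clears, so by the time the MB-Ack is consumed every earlier invalidate and probe has already been applied.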
7 Claims
Claim 1 is the independent claim, reproduced in full under First Claim. Claims 2, 3, 4, 5, 6 and 7 depend on it.
Specification