Low latency data exchange
First Claim
1. A system for exchanging data, the system comprising:
- a main processor in communication with an active memory device, the main processor configured to implement a method comprising:
receiving, at a processing element in the active memory device, an instruction from the main processor;
receiving, at the processing element, a store request from a thread running on the main processor, the store request specifying a memory address associated with the processing element;
storing a value provided in the store request in a queue in the processing element, the storing comprising storing the value in a queue entry identified by a processor head pointer, wherein the processor head pointer moves to a subsequent entry after storing the value in the queue entry; and
performing, by the processing element, the instruction using the value from the queue, wherein storing the value and performing the instruction, in an out-of-order exchange of the data between the processing element and the main processor, are synchronized based on tags assigned to corresponding data, the tags identifying a location of the corresponding data;
wherein receiving the store request from the thread running on the main processor further comprises receiving a store request from the main processor that bypasses all system cache before it is received by the processing element, and wherein data subject to the store request is exchanged directly between a register in the main processor and the queue in the processing element.
Abstract
According to one embodiment, a method for exchanging data in a system that includes a main processor in communication with an active memory device is provided. The method includes a processing element in the active memory device receiving an instruction from the main processor and receiving a store request from a thread running on the main processor, the store request specifying a memory address associated with the processing element. The method also includes storing a value provided in the store request in a queue in the processing element and the processing element performing the instruction using the value from the queue.
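The store path described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `ProcessingElementQueue`, the `pe_tail` read pointer, the fixed queue size, and the wrap-around behavior are all hypothetical choices made for illustration. The source states only that the value is stored in the entry identified by a processor head pointer, that the pointer then moves to a subsequent entry, and that the processing element performs the instruction using the value from the queue.

```python
class ProcessingElementQueue:
    """Toy model of the queue inside a processing element (hypothetical names)."""

    def __init__(self, num_entries):
        self.entries = [None] * num_entries
        self.processor_head = 0  # entry the next store request writes to
        self.pe_tail = 0         # entry the processing element reads next (assumption)
        self.count = 0           # number of occupied entries

    def store(self, value):
        """Handle a store request: write at the head pointer, then advance it."""
        if self.count == len(self.entries):
            raise BufferError("queue full")
        self.entries[self.processor_head] = value
        # Wrapping around is an assumption; the source says only "subsequent entry".
        self.processor_head = (self.processor_head + 1) % len(self.entries)
        self.count += 1

    def consume(self):
        """Hand the oldest stored value to the processing element."""
        if self.count == 0:
            raise LookupError("queue empty")
        value = self.entries[self.pe_tail]
        self.pe_tail = (self.pe_tail + 1) % len(self.entries)
        self.count -= 1
        return value


def perform_instruction(queue, instruction):
    """Model the processing element performing an instruction on a queued value."""
    return instruction(queue.consume())
```

For example, after `q = ProcessingElementQueue(2)` and `q.store(5)`, calling `perform_instruction(q, lambda v: v + 1)` consumes the stored value and applies the stand-in instruction to it. The model leaves out the cache bypass and the register-to-queue transfer, which are hardware properties with no software analogue here.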
17 Claims
1. A system for exchanging data, the system comprising:
- a main processor in communication with an active memory device, the main processor configured to implement a method comprising:
receiving, at a processing element in the active memory device, an instruction from the main processor;
receiving, at the processing element, a store request from a thread running on the main processor, the store request specifying a memory address associated with the processing element;
storing a value provided in the store request in a queue in the processing element, the storing comprising storing the value in a queue entry identified by a processor head pointer, wherein the processor head pointer moves to a subsequent entry after storing the value in the queue entry; and
performing, by the processing element, the instruction using the value from the queue, wherein storing the value and performing the instruction, in an out-of-order exchange of the data between the processing element and the main processor, are synchronized based on tags assigned to corresponding data, the tags identifying a location of the corresponding data;
wherein receiving the store request from the thread running on the main processor further comprises receiving a store request from the main processor that bypasses all system cache before it is received by the processing element, and wherein data subject to the store request is exchanged directly between a register in the main processor and the queue in the processing element.
Dependent claims: 2, 3, 4, 5, 6
7. A system for exchanging data, the system comprising:
- a main processor in communication with an active memory device, the main processor implementing a method comprising:
receiving, at a processing element in the active memory device, an instruction from the main processor;
receiving, at the processing element, a load request from a thread running on the main processor, the load request specifying a memory address associated with the processing element;
placing tag information relating to the load request in a queue in the processing element, the tag information corresponding to the data requested by the load request and describing the thread running on the main processor that will use the requested data;
performing, by the processing element, the instruction, wherein performing the instruction, in an out-of-order exchange of the data between the processing element and the main processor, is synchronized based on the tag information, the tag information identifying a location of the corresponding data;
placing a result of the instruction in the queue corresponding to the tag information; and
communicating, by the processing element, the result to the main processor in response to the load request, wherein communicating the result comprises bypassing all system cache before it is received by the main processor,
wherein receiving the load request from the thread running on the main processor further comprises receiving a load request from the main processor that bypasses all system cache before it is received by the processing element, and wherein data subject to the load request is exchanged directly between a register in the main processor and the queue in the processing element.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
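The load path recited in claim 7 can be sketched in the same spirit. This is a minimal illustration under assumptions, not the claimed implementation: the `LoadQueue` class, its method names, and the use of a dictionary keyed by tag are hypothetical. It models only the ideas stated in the claim: tag information is placed in the queue when a load request arrives, the tag records which thread will use the data, the result is placed in the queue entry corresponding to the tag, and results may become ready in a different order than the requests were issued.

```python
class LoadQueue:
    """Toy model of tag-based load handling in a processing element (hypothetical)."""

    def __init__(self):
        # Maps a tag to the requesting thread and, once ready, the result.
        self.pending = {}

    def load_request(self, tag, thread_id):
        """Record tag information for a load request from a main-processor thread."""
        self.pending[tag] = {"thread": thread_id, "result": None, "ready": False}

    def complete(self, tag, result):
        """Place the instruction's result in the entry matching the tag."""
        entry = self.pending[tag]
        entry["result"] = result
        entry["ready"] = True

    def respond(self, tag):
        """Return (thread, result) if the tagged data is ready, else None."""
        entry = self.pending[tag]
        if not entry["ready"]:
            return None
        del self.pending[tag]
        return entry["thread"], entry["result"]
```

In this model, two loads tagged `"a"` and `"b"` can be issued in that order while the result for `"b"` is completed and returned first; the tag, rather than arrival order, ties each result back to its request and thread.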
Specification