ATOMIC-OPERATION COALESCING TECHNIQUE IN MULTI-CHIP SYSTEMS
First Claim
1. A processor, comprising:
processing elements;
caches associated with respective ones of the processing elements; and
a communication channel coupled to the processing elements and to memory, the memory including an address space that is shared by the processing elements;
wherein, prior to executing an atomic operation, a first processing element determines if data associated with the atomic operation is stored in a first cache associated with the first processing element, the data having a state defined by a cache-coherence protocol;
wherein, if the first processing element determines that the data is not stored in the first cache, the first processing element sends a request including the atomic operation for the data via the communication channel to at least one of the other processing elements; and
wherein, if the data having the state is stored in a second cache associated with a second processing element, the second processing element executes the atomic operation.
Abstract
A cache-coherence protocol distributes atomic operations among multiple processors (or processor cores) that share a memory space. When an atomic operation that includes an instruction to modify data stored in the shared memory space is directed to a first processor that does not have control over the address(es) associated with the data, the first processor sends a request, including the instruction to modify the data, to a second processor. Then, the second processor, which already has control of the address(es), modifies the data. Moreover, the first processor can immediately proceed to another instruction rather than waiting for the address(es) to become available.
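The forwarding behavior described in the abstract can be illustrated with a small software model. This is a minimal sketch, not the patented implementation: the names ProcessingElement, Channel, and issue_atomic are hypothetical, each cache is reduced to a dictionary of the addresses an element currently controls, and the coherence state is simplified to "held" or "not held". The model is synchronous, so it does not capture the requester proceeding to its next instruction while the request is in flight, and it leaves unspecified what happens when no cache holds the data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

AtomicOp = Callable[[int], int]  # maps the old value to the new value


@dataclass
class ProcessingElement:
    """Hypothetical processing element with a private cache."""
    name: str
    # address -> value, only for data this element currently controls
    cache: Dict[int, int] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def holds(self, addr: int) -> bool:
        return addr in self.cache

    def execute(self, addr: int, op: AtomicOp) -> None:
        # Apply the read-modify-write locally, in the cache that holds the data.
        self.cache[addr] = op(self.cache[addr])
        self.log.append(f"executed atomic op on {addr:#x}")


class Channel:
    """Hypothetical communication channel linking the processing elements."""

    def __init__(self, elements: List[ProcessingElement]) -> None:
        self.elements = elements

    def forward(self, sender: ProcessingElement, addr: int,
                op: AtomicOp) -> Optional[ProcessingElement]:
        # Deliver the request, operation included, to the other elements.
        for pe in self.elements:
            if pe is not sender and pe.holds(addr):
                pe.execute(addr, op)
                return pe
        return None  # no cache holds the data; fallback is not modeled here


def issue_atomic(pe: ProcessingElement, channel: Channel, addr: int,
                 op: AtomicOp) -> Optional[ProcessingElement]:
    """Run op where the data lives; return the element that executed it."""
    if pe.holds(addr):
        pe.execute(addr, op)
        return pe
    # Miss in the first cache: ship the operation instead of fetching the data.
    pe.log.append(f"forwarded atomic op on {addr:#x}")
    return channel.forward(pe, addr, op)


# Example: pe1 controls address 0x40, and pe0 is assigned an atomic increment.
pe0 = ProcessingElement("pe0")
pe1 = ProcessingElement("pe1", cache={0x40: 7})
channel = Channel([pe0, pe1])
owner = issue_atomic(pe0, channel, 0x40, lambda v: v + 1)
```

In this example the increment is carried out by pe1, and the data never moves into the cache of pe0, which is the point of sending the operation rather than requesting the cache line.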
72 Citations
46 Claims
1. A processor, as recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 10, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25.
8. (canceled)
9. (canceled)
11. (canceled)
12. (canceled)
13. (canceled)
14. (canceled)
15. (canceled)
26. (canceled)
27. (canceled)
28. (canceled)
29. A computer system, comprising:
processing elements;
caches associated with respective ones of the processing elements;
a communication channel coupled to the processing elements; and
memory coupled to the communication channel, the memory including an address space that is shared by the processing elements;
wherein, prior to executing an atomic operation, a first processing element determines if data associated with the atomic operation is stored in a first cache associated with the first processing element, the data having a state defined by a cache-coherence protocol;
wherein, if the first processing element determines that the data is not stored in the first cache, the first processing element sends a request including the atomic operation for the data via the communication channel to at least one of the other processing elements; and
wherein, if the data having the state is stored in a second cache associated with a second processing element, the second processing element executes the atomic operation.
30. A method for processing an atomic operation in a processor that includes multiple processing elements and associated caches, which reside in a shared address space, comprising:
assigning the atomic operation to a first processing element, wherein, prior to executing an atomic operation, the first processing element performs one or more actions, including:
determining if data associated with the atomic operation is stored in a first cache associated with the first processing element, the data having a state defined by a cache-coherence protocol; and
if the first processing element determines that the data is not stored in the first cache, providing a request including the atomic operation for the data on a communication channel in the processor, wherein the communication channel couples the processing elements to at least one of the other processing elements; and
wherein a second processing element performs one or more additional actions, including:
determining if the data associated with the state is stored in a second cache associated with the second processing element; and
if so, executing the atomic operation.
Dependent claims: 31, 32, 33, 34, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46.
36. (canceled)
Specification