MEMORY SHARING ACROSS DISTRIBUTED NODES
Abstract
A method and apparatus are disclosed for enabling nodes in a distributed system to share one or more memory portions. A home node makes a portion of its main memory available for sharing, and one or more sharer nodes mirrors that shared portion of the home node's main memory in its own main memory. To maintain memory coherency, a memory coherence protocol is implemented. Under this protocol, load and store instructions that target the mirrored memory portion of a sharer node are trapped, and store instructions that target the shared memory portion of a home node are trapped. With this protocol, valid data is obtained from the home node and updates are propagated to the home node. Thus, no “dirty” data is transferred between sharer nodes. As a result, the failure of one node will not cause the failure of another node or the failure of the entire system.
48 Claims
1. In a distributed system comprising a first node and a second node, wherein the first node has a first main memory and the second node has a second main memory, and wherein a second memory location in the second main memory is mirrored in a first memory location in the first main memory, a method performed by the first node, comprising:
executing, by a processor on the first node, a load instruction to load data from the first memory location of the first main memory, wherein the load instruction is part of a set of program instructions pertaining to a particular thread of execution;
determining, by the processor, whether the data in the first memory location is valid;
in response to a determination that the data in the first memory location is invalid, causing the load instruction to trap, which causes the processor to suspend execution of the set of program instructions and to begin execution of a set of trap handling instructions;
while executing the set of trap handling instructions, the processor causing:
    valid data to be obtained from the second memory location of the second main memory, and stored into the first memory location of the first main memory; and
    a validity indicator to be updated to indicate that the data in the first memory location is valid; and
resuming, by the processor, execution of the set of program instructions.
Dependent claims: 2, 3, 4, 5, 6

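The load path recited in claim 1 can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the class names HomeNode and SharerNode, the dictionary-based memories, and the trap counter are hypothetical and do not come from the patent, and the "trap" is modelled as an ordinary method call rather than a hardware mechanism.

```python
class HomeNode:
    """Hypothetical stand-in for the second node (holds the second main memory)."""

    def __init__(self, memory):
        self.memory = dict(memory)

    def read(self, addr):
        return self.memory[addr]


class SharerNode:
    """Hypothetical stand-in for the first node (mirrors the home node's locations)."""

    def __init__(self, home, addresses):
        self.home = home
        self.memory = {addr: None for addr in addresses}  # first main memory
        self.valid = {addr: False for addr in addresses}  # validity indicators
        self.traps = 0  # number of load traps taken, for illustration only

    def load(self, addr):
        # In the claim the processor performs this validity check; here it is
        # an ordinary branch.
        if not self.valid[addr]:
            self._load_trap_handler(addr)  # program is suspended while this runs
        return self.memory[addr]  # program resumes and the load completes

    def _load_trap_handler(self, addr):
        self.traps += 1
        self.memory[addr] = self.home.read(addr)  # obtain valid data from home
        self.valid[addr] = True  # update the validity indicator


home = HomeNode({0x10: 42})
sharer = SharerNode(home, [0x10])
first = sharer.load(0x10)   # mirrored copy is invalid, so this load traps
second = sharer.load(0x10)  # copy is now valid, so no trap is taken
print(first, second, sharer.traps)  # prints "42 42 1"
```

In this sketch only the first load pays the cost of fetching from the home node; later loads of the same location are served from the local mirrored copy until something marks it invalid again, which the sketch does not model.
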
7. In a distributed system comprising a first node and a second node, wherein the first node has a first main memory and the second node has a second main memory, and wherein a second memory location in the second main memory is mirrored in a first memory location in the first main memory, a method performed by the first node, comprising:
executing, by a first processor on the first node, a store instruction to store updated data into the first memory location of the first main memory, wherein the store instruction is part of a set of program instructions pertaining to a particular thread of execution;
causing the store instruction to trap, which causes the first processor to suspend execution of the set of program instructions and to begin execution of a set of trap handling instructions;
while executing the set of trap handling instructions, the first processor:
    causing the updated data to eventually be propagated to the second node to be stored within the second memory location of the second main memory; and
resuming, by the first processor, execution of the set of program instructions.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19

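The store path of claim 7 can be sketched in the same spirit. The claim says only that the updated data is "eventually" propagated to the second node, so the pending queue and the separate flush step below are assumptions made for illustration, as are the names HomeNode, SharerNode, pending, and flush. Whether the local mirrored copy is also written is left out of this sketch.

```python
from collections import deque


class HomeNode:
    """Hypothetical stand-in for the second node."""

    def __init__(self, memory):
        self.memory = dict(memory)

    def write(self, addr, value):
        self.memory[addr] = value


class SharerNode:
    """Hypothetical stand-in for the first node."""

    def __init__(self, home):
        self.home = home
        self.pending = deque()  # assumed buffer of updates not yet sent home
        self.traps = 0

    def store(self, addr, value):
        # Every store to a mirrored location traps in this sketch.
        self._store_trap_handler(addr, value)
        # Returning from this method models resuming the program.

    def _store_trap_handler(self, addr, value):
        self.traps += 1
        self.pending.append((addr, value))  # propagation happens later

    def flush(self):
        # Assumed propagation step: send buffered updates home in order.
        while self.pending:
            addr, value = self.pending.popleft()
            self.home.write(addr, value)


home = HomeNode({0x10: 1})
sharer = SharerNode(home)
sharer.store(0x10, 7)
before_flush = home.memory[0x10]  # home node has not seen the update yet
sharer.flush()
after_flush = home.memory[0x10]   # update has now reached the home node
print(before_flush, after_flush)  # prints "1 7"
```
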
20. A first node for use in a distributed computing system, the first node comprising:
a first main memory, wherein a first memory location in the first main memory is usable to mirror a second memory location in a second main memory on a second node of the distributed computing system;
a set of trap handling instructions; and
one or more processors including a first processor, the first processor operable to execute a load instruction to load data from the first memory location of the first main memory, wherein the load instruction is part of a set of program instructions pertaining to a particular thread of execution, and the first processor comprising circuitry operable to determine whether data in the first memory location of the first main memory is valid, and in response to a determination that the data in the first memory location of the first main memory is invalid, to cause the load instruction to trap, which would cause the first processor to suspend execution of the set of program instructions and to begin execution of the set of trap handling instructions; and
wherein the set of trap handling instructions, when executed by the first processor, would cause the first processor to cause:
    valid data to be obtained from the second memory location of the second main memory, and stored into the first memory location of the first main memory; and
    a validity indicator to be updated to indicate that the data in the first memory location is valid; and
    execution of the set of program instructions to be resumed.
Dependent claims: 21, 22, 23, 24

25. A first node for use in a distributed computing system, the first node comprising:
a first main memory, wherein a first memory location in the first main memory is usable to mirror a second memory location in a second main memory on a second node of the distributed computing system;
a set of trap handling instructions; and
one or more processors including a first processor, the first processor operable to execute a store instruction to store updated data into the first memory location of the first main memory, wherein the store instruction is part of a set of program instructions pertaining to a particular thread of execution, and wherein the first processor comprises circuitry operable to cause the store instruction to trap, which would cause the first processor to suspend execution of the set of program instructions and to begin execution of the set of trap handling instructions; and
wherein the set of trap handling instructions, when executed by the first processor, would cause the first processor to:
    cause the updated data to eventually be propagated to the second node to be stored within the second memory location of the second main memory; and
    resume execution of the set of program instructions.
Dependent claims: 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36

37. In a distributed system comprising a first node and a second node, wherein the first node has a first main memory and the second node has a second main memory, and wherein a second memory location in the second main memory is mirrored in a first memory location in the first main memory, a method performed by the second node, comprising:
executing, by a first processor on the second node, a store instruction to store updated data into the second memory location of the second main memory, wherein the store instruction is part of a set of program instructions pertaining to a particular thread of execution;
causing the store instruction to trap, which causes the first processor to suspend execution of the set of program instructions and to begin execution of a set of trap handling instructions;
while executing the set of trap handling instructions, the first processor:
    storing the updated data into the second memory location of the second main memory; and
resuming, by the first processor, execution of the set of program instructions.
Dependent claims: 38, 39, 40, 41, 42

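The suspend, handle, and resume sequence that claim 37 recites for a store on the second (home) node can be traced with a toy interpreter. This is a sketch under assumptions: the instruction tuples, the program counter model, and the event log are hypothetical, and the handler does only what the claim states, namely store the updated data. Any further coherence work a real handler might perform is not modelled.

```python
class HomeNode:
    """Hypothetical stand-in for the second node executing one thread."""

    def __init__(self, shared_addresses):
        self.memory = {addr: 0 for addr in shared_addresses}  # shared locations
        self.log = []  # order of events, kept for illustration only

    def run(self, program):
        pc = 0
        while pc < len(program):
            op, addr, value = program[pc]
            if op == "store" and addr in self.memory:
                self.log.append(("trap", pc))          # program is suspended
                self._store_trap_handler(addr, value)  # trap handler runs
                self.log.append(("resume", pc + 1))    # program resumes
            pc += 1

    def _store_trap_handler(self, addr, value):
        self.memory[addr] = value  # store the updated data
        self.log.append(("handler_store", addr, value))


node = HomeNode([0x20])
node.run([("store", 0x20, 5), ("nop", None, None), ("store", 0x20, 9)])
print(node.memory[0x20])  # prints "9"
print(node.log[:3])       # prints "[('trap', 0), ('handler_store', 32, 5), ('resume', 1)]"
```
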
43. A second node for use in a distributed computing system comprising a first node and the second node, the second node comprising:
a second main memory, wherein a second memory location in the second main memory is usable to be mirrored in a first memory location in a first main memory on the first node;
a set of trap handling instructions; and
one or more processors including a first processor, the first processor operable to execute a store instruction to store updated data into the second memory location of the second main memory, wherein the store instruction is part of a set of program instructions pertaining to a particular thread of execution, and wherein the first processor comprises circuitry operable to cause the store instruction to trap, which would cause the first processor to suspend execution of the set of program instructions and to begin execution of the set of trap handling instructions; and
wherein the set of trap handling instructions, when executed by the first processor, would cause the first processor to:
    store the updated data into the second memory location of the second main memory; and
    resume execution of the set of program instructions.
Dependent claims: 44, 45, 46, 47, 48

Specification