High performance recoverable communication method and apparatus for write-only networks
First Claim
1. A method for transferring data between a plurality of nodes coupled to a common system data link comprising the steps of:
apportioning a memory at each of the plurality of nodes into local and shared portions, wherein the shared portions of each of the plurality of nodes together form a shared memory having a plurality of entries, with each entry accessible by each node of said plurality of nodes;
apportioning the local portion of the memory to provide a reflective memory portion, wherein write operations to an address of data stored in the reflective portion of local memory are broadcast on the data link;
storing a control structure in the reflective portion of the local memory of one of the nodes for each item of memory data to be shared by the one of the nodes;
the one of said plurality of nodes sharing the item of memory data updating said item of memory data by issuing to the data link a write command, the write command followed by an acknowledgment command; and
responsive to a receipt of said acknowledgment command by at least one of the other ones of the plurality of nodes sharing said item of memory data the at least one of the other ones of the plurality of nodes issuing an acknowledgment response for writing said corresponding entry in said control structure to indicate a receipt of said write command and said acknowledgment command.
Abstract
A multi-node computer network includes a plurality of nodes coupled together via a data link. Each of the nodes includes a local memory, which further comprises a shared memory. Certain items of data that are to be shared by the nodes are stored in the shared portion of memory. Associated with each of the shared data items is a data structure. When a node sharing data with other nodes in the system seeks to modify the data, it transmits the modifications over the data link to the other nodes in the network. Each update is received in order by each node in the cluster. As part of the last transmission by the modifying node, an acknowledgement request is sent to the receiving nodes in the cluster. Each node that receives the acknowledgment request returns an acknowledgement to the sending node. The returned acknowledgement is written to the data structure associated with the shared data item. If there is an error during the transmission of the message, the receiving node does not transmit an acknowledgement, and the sending node is thereby notified that an error has occurred.
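The exchange summarized in the abstract can be pictured with a small simulation. This is only a sketch under stated assumptions: the names `Node`, `send_update`, and `corrupt_for` are hypothetical, the data link is modeled as ordinary in-order function calls, and a transmission error is modeled as a flag on the receiving node. It is not the patented implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in the cluster (hypothetical model, not the patent's hardware)."""
    node_id: int
    shared: dict = field(default_factory=dict)  # this node's copy of the shared data
    error_seen: bool = False                    # set when an update arrives damaged

    def receive_write(self, key, value, corrupted=False):
        if corrupted:
            self.error_seen = True   # remember the error and skip the update
        else:
            self.shared[key] = value

    def receive_ack_request(self, acks):
        """Write an acknowledgement into the sender's structure unless an error occurred."""
        ok = not self.error_seen
        if ok:
            acks[self.node_id] = True
        self.error_seen = False      # reset for the next message
        return ok


def send_update(sender, receivers, key, value, corrupt_for=()):
    """Broadcast one write followed by an acknowledgement request.

    Returns the ids of the receivers that did not acknowledge.
    """
    # Assumed stand-in for the data structure associated with the shared item.
    acks = {r.node_id: False for r in receivers}
    sender.shared[key] = value
    for r in receivers:
        r.receive_write(key, value, corrupted=r.node_id in corrupt_for)
    for r in receivers:
        r.receive_ack_request(acks)
    return [node_id for node_id, acked in acks.items() if not acked]


nodes = [Node(i) for i in range(3)]
missing_ok = send_update(nodes[0], nodes[1:], "x", 1)
missing_err = send_update(nodes[0], nodes[1:], "x", 2, corrupt_for={2})
print(missing_ok)   # receivers that failed to acknowledge the clean update
print(missing_err)  # receivers that failed to acknowledge the damaged update
```

In this sketch the sender learns of an error only from the absence of an acknowledgement, which mirrors the last sentence of the abstract.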
11 Claims
1. A method for transferring data between a plurality of nodes coupled to a common system data link comprising the steps of:
apportioning a memory at each of the plurality of nodes into local and shared portions, wherein the shared portions of each of the plurality of nodes together form a shared memory having a plurality of entries, with each entry accessible by each node of said plurality of nodes;
apportioning the local portion of the memory to provide a reflective memory portion, wherein write operations to an address of data stored in the reflective portion of local memory are broadcast on the data link;
storing a control structure in the reflective portion of the local memory of one of the nodes for each item of memory data to be shared by the one of the nodes;
the one of said plurality of nodes sharing the item of memory data updating said item of memory data by issuing to the data link a write command, the write command followed by an acknowledgment command; and
responsive to a receipt of said acknowledgment command by at least one of the other ones of the plurality of nodes sharing said item of memory data the at least one of the other ones of the plurality of nodes issuing an acknowledgment response for writing said corresponding entry in said control structure to indicate a receipt of said write command and said acknowledgment command.
Dependent claims: 2, 3, 4, 5, 6
7. A computer system comprising a plurality of nodes coupled by an interconnect, wherein each of the plurality of nodes comprises:
a memory, the memory comprising;
a shared portion, wherein the shared portions of each of the plurality of nodes together forms a shared memory of the computer system, the shared memory for storing data accessible by at least two of the plurality of nodes; and
a local portion, accessible to only the associated node of the plurality of nodes, wherein the local portion includes a reflective portion, wherein writes to the reflective portion of the local portion are forwarded to other ones of the plurality of nodes over the interconnect; and
a data structure, stored in the reflective portion of the memory, the data structure associated with an item of data stored in the shared memory, the data structure comprising a number of entries corresponding to a number of the plurality of nodes sharing the item of data, wherein each of the entries includes a bit for indicating whether commands directed to the item of data have been received without error at the corresponding node.
Dependent claims: 8, 9
10. A method for transferring data between a plurality of nodes coupled to an interconnect, wherein each of the plurality of nodes includes a memory apportioned into a local portion, accessible to only the associated node, and a shared portion, accessible by at least two of the plurality of nodes, the method including the steps of:
storing, in a reflective portion of the local memory of each of the plurality of nodes, a control structure for each item of data to be shared by the respective one of the plurality of nodes, the control structure including information for indicating whether commands, transferred between the respective one of the plurality of nodes and the shared memory, are received by other ones of the plurality of nodes that share the item of data; and
issuing, by one of said plurality of nodes, a command to modify data in said shared memory, wherein the command is issued over the interconnect;
issuing an acknowledgment request over the interconnect to request an acknowledgment from at least one of the plurality of nodes that share the data to indicate that the at least one node received the command.
Dependent claims: 11
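The per-node entries recited in claim 7 (one entry per sharing node, each holding a bit that records error-free receipt) might be pictured as a bit mask. The class name `AckControl`, its methods, and the packing of the bits into a single integer are assumptions made for illustration; the claims do not prescribe a layout.

```python
class AckControl:
    """Hypothetical control structure for one shared item: one bit per sharing node."""

    def __init__(self, sharing_nodes):
        # Map each sharing node id to a bit position (assumed layout).
        self.slot = {node_id: i for i, node_id in enumerate(sharing_nodes)}
        self.bits = 0

    def clear(self):
        """Reset all bits before a new command sequence is issued."""
        self.bits = 0

    def mark_received(self, node_id):
        """Set the bit for a node that received the commands without error."""
        self.bits |= 1 << self.slot[node_id]

    def all_received(self):
        """True when every sharing node has set its bit."""
        return self.bits == (1 << len(self.slot)) - 1

    def pending(self):
        """Node ids whose bit is still clear."""
        return [n for n, i in self.slot.items() if not ((self.bits >> i) & 1)]


ctl = AckControl([1, 2, 5])
ctl.mark_received(1)
ctl.mark_received(5)
pending_before = ctl.pending()
done_before = ctl.all_received()
ctl.mark_received(2)
done_after = ctl.all_received()
```

A node that issued an acknowledgment request could poll `pending()` in this sketch to see which sharing nodes have not yet responded.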
Specification