Low occupancy protocol for managing concurrent transactions with dependencies
First Claim
1. A multi-processing system comprising a shared memory and a plurality of multi-processor nodes coupled via a switch, each of the plurality of multi-processor nodes further comprising at least one processor, the multi-processing system comprising:
a portion of the shared memory located in each multi-processor node and apportioned into a plurality of blocks;
a directory in each node having a plurality of entries corresponding in number to the plurality of blocks of the shared memory, each entry in the directory for identifying which of the plurality of multi-processor nodes stores copies of the data block; and
a serialization point coupled to the directory for ordering accesses to the plurality of blocks thereby allowing the multi-processing system to concurrently execute multiple references to each of the plurality of blocks.
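The directory recited in the claim can be pictured as a table of per-block entries, each recording which nodes hold a copy of that block. The C sketch below is purely illustrative: the structure names, the presence bit vector, and the node limit are assumptions for exposition, not details taken from the patent.

```c
#include <stdint.h>

/* Hypothetical per-block directory entry: a presence bit vector
 * records which nodes currently hold a copy of the block, and an
 * owner field names the node with the dirty copy (or -1 if none).
 * Names and widths are illustrative assumptions. */
#define MAX_NODES 8

typedef struct {
    uint8_t presence;   /* bit i set => node i holds a copy */
    int8_t  owner;      /* node holding the dirty copy, or -1 */
} dir_entry_t;

/* Record that a node now shares the block. */
static void dir_add_sharer(dir_entry_t *e, int node) {
    e->presence |= (uint8_t)(1u << node);
}

/* Query whether a node holds a copy of the block. */
static int dir_has_copy(const dir_entry_t *e, int node) {
    return (e->presence >> node) & 1u;
}
```

With one such entry per memory block, the directory can answer, for any block, exactly which nodes must be probed or invalidated, which is what allows a serialization point to order accesses per block rather than globally.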
Abstract
An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows a number of multi-processor nodes to be coupled to the switch and to operate at optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processor node, thereby improving the overall performance of the memory. Memory performance is further improved by including, at each memory, a delayed write buffer which is used in conjunction with the directory to identify victims that are to be written to memory. An arb bus coupled to the output of the directory of each node provides a central ordering point for all messages that are transferred through the SMP system. The messages comprise a number of transactions, and each transaction is assigned to one of a number of different virtual channels, depending upon the processing stage of the message. The use of virtual channels thus helps to maintain data coherency by providing a straightforward method for maintaining system order. Using the virtual channels and the directory structure, cache coherency problems that would previously result in deadlock may be avoided.
34 Claims
1. A multi-processing system comprising a shared memory and a plurality of multi-processor nodes coupled via a switch, each of the plurality of multi-processor nodes further comprising at least one processor, the multi-processing system comprising:
a portion of the shared memory located in each multi-processor node and apportioned into a plurality of blocks;

a directory in each node having a plurality of entries corresponding in number to the plurality of blocks of the shared memory, each entry in the directory for identifying which of the plurality of multi-processor nodes stores copies of the data block; and

a serialization point coupled to the directory for ordering accesses to the plurality of blocks thereby allowing the multi-processing system to concurrently execute multiple references to each of the plurality of blocks.

Dependent Claims: 2-19
20. A method for allowing multiple references to a common block in a shared memory to execute simultaneously in a multi-processing system, the multi-processing system comprising a plurality of multi-processor nodes coupled via a switch, each of the plurality of multi-processor nodes further comprising at least one processor, a portion of the shared memory apportioned into a plurality of blocks and a serialization unit, the serialization unit comprising a plurality of entries corresponding in number to the plurality of blocks of the portion of shared memory, the method comprising the steps of:
ordering all references to the common block as they are received at the serialization unit of the multi-processor node associated with the common block, where each reference visits the serialization unit only once during execution; and

delaying completion of references to the common block, the common block stored at a destination, until a desired version of the block of shared memory is returned to the destination.

Dependent Claims: 21-34
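The two steps of claim 20 can be sketched as a serialization point that stamps each incoming reference with a monotonically increasing order number exactly once (the single visit), and a completion check that holds a reference back until the destination has received the version of the block it depends on. The structure names and the version-counter mechanism below are illustrative assumptions, not the patent's implementation.

```c
#include <stdint.h>

/* Hypothetical serialization point for one block of shared memory.
 * next_order is the stamp handed to each arriving reference;
 * version tracks which version of the block the destination holds. */
typedef struct {
    uint64_t next_order;   /* next order stamp to hand out */
    uint64_t version;      /* version currently at the destination */
} ser_point_t;

/* Stamp a reference as it arrives; each reference visits the
 * serialization point exactly once, so this runs once per reference. */
static uint64_t ser_order(ser_point_t *sp) {
    return sp->next_order++;
}

/* A reference may complete only when the destination holds at least
 * the version it was ordered behind; otherwise completion is delayed. */
static int ref_may_complete(const ser_point_t *sp, uint64_t wanted) {
    return sp->version >= wanted;
}
```

Ordering at a single per-block point, combined with delaying completion, is what lets multiple references to the same block be in flight concurrently without losing a coherent order.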
Specification