Directory-based, shared-memory, scaleable multiprocessor computer system having deadlock-free transaction flow sans flow control protocol
Abstract
A method and apparatus are provided which eliminate the need for an active traffic-flow-control protocol to manage request-transaction flow between the nodes of a directory-based, scalable, shared-memory, multi-processor computer system. This is accomplished by determining the maximum number of requests that any node can receive at any given time, providing an input buffer at each node that can store at least that maximum number of requests, and transferring stored requests from the buffer as the node completes requests in process and is able to process additional incoming requests. Because each node may have only a certain finite number of pending requests, that number is the maximum number of requests that can be received by a node acting in slave capacity from any other node acting in requester capacity. In addition, each node may also issue requests that must be processed within that node, so the input buffer must be sized to accommodate not only external requests, but internal ones as well. Thus, the buffer must be able to store at least the maximum number of transaction requests that may be pending at any node, multiplied by the number of nodes present in the system.
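The sizing argument in the abstract can be sketched in code. The following is a minimal illustrative model, not the patented implementation; the class and function names are ours. It shows why a buffer of n × y slots at a receiving agent can never overflow when every one of the n nodes is limited to y outstanding requests, so no negative-acknowledge/retry or credit-based flow control is needed:

```python
# Illustrative sketch (hypothetical names): with each of n nodes limited to
# y outstanding requests, an input buffer of n * y slots at any node can
# always accept an arriving request without a flow-control protocol.

from collections import deque

def input_buffer_slots(n_nodes: int, y_max_outstanding: int) -> int:
    """Worst case: every node (including the receiver itself) directs all
    of its y outstanding requests at the same agent."""
    return n_nodes * y_max_outstanding

class HomeAgent:
    def __init__(self, n_nodes: int, y: int):
        self.capacity = input_buffer_slots(n_nodes, y)
        self.buffer = deque()

    def receive(self, request):
        # Cannot overflow: all senders together hold at most `capacity`
        # outstanding requests, so a free slot always exists.
        assert len(self.buffer) < self.capacity, "overflow impossible by sizing"
        self.buffer.append(request)

    def complete_one(self):
        # Completing a request frees a slot; only then may the requester
        # reuse that outstanding-request credit.
        return self.buffer.popleft() if self.buffer else None

# Worst-case burst: 4 nodes, each with at most 2 outstanding requests,
# all aimed at one node's home agent.
agent = HomeAgent(n_nodes=4, y=2)
for node in range(4):
    for r in range(2):
        agent.receive((node, r))
print(len(agent.buffer))  # 8 == 4 * 2: buffer exactly full, never overflowed
```

The key design point modeled here is that the bound on outstanding requests per requester, combined with the n × y buffer, makes overflow structurally impossible rather than merely unlikely.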
17 Claims
1. A multi-processor computer system comprising:
a global interconnect;
a plurality of n nodes, each node having:
a local interconnect;
at least one processor, said processor being coupled to the local interconnect;
a cache associated with each processor;
a main memory coupled to the local interconnect, said main memory being equally accessible to all processors within its respective node;
a global interface which couples the global interconnect to the local interconnect of its respective node, said global interface including a transaction filter, a tag memory, a home agent, a slave agent, and a request agent, wherein said transaction filter routes cache coherency transactions from said local interconnect through a local physical address-to-global address translator to said request agent, said transaction filter routes input/output transactions from said local interconnect through an I/O input queue to said request agent, and said tag memory stores a permission status entry for each of said routed cache coherency transactions and said routed input/output transactions; and
at least one input buffer associated with each home agent and each slave agent and forming a portion of said global interface, each input buffer associated with said each home agent and said each slave agent of each global interface of each of the plurality of n nodes sized to contain a number of storage locations corresponding to at least a maximum number of outstanding transaction requests receivable at each node of the plurality of nodes, the maximum number of outstanding transaction requests being the outstanding transaction requests together issuable by all of said plurality of n nodes.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
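The routing behavior of the claimed transaction filter can be illustrated with a small sketch. This is a hypothetical model under our own names (`GlobalInterface`, `lpa_to_ga`, etc.), not the patented circuit: cache coherency transactions pass through a local physical-address-to-global-address translator, I/O transactions pass through an I/O input queue, both terminate at the request agent, and the tag memory records a permission-status entry for every routed transaction:

```python
# Hypothetical sketch of the claim-1 routing; all names are ours.
from collections import deque

class GlobalInterface:
    def __init__(self, node_id: int, n_nodes: int):
        self.node_id = node_id
        self.n_nodes = n_nodes
        self.io_queue = deque()            # I/O input queue
        self.tag_memory = {}               # transaction id -> permission status
        self.request_agent_inbox = deque() # requests handed to the request agent

    def lpa_to_ga(self, local_pa: int) -> int:
        # Toy translator: fold the node id into the upper address bits.
        return (self.node_id << 40) | local_pa

    def filter_transaction(self, txn_id: int, kind: str, local_pa: int):
        if kind == "coherency":
            # Coherency path: translate the local physical address to a
            # global address, then hand off to the request agent.
            ga = self.lpa_to_ga(local_pa)
            self.request_agent_inbox.append((txn_id, "coherency", ga))
        elif kind == "io":
            # I/O path: stage the transaction in the I/O input queue on its
            # way to the request agent (modeled as an immediate pass-through).
            self.io_queue.append((txn_id, "io", local_pa))
            self.request_agent_inbox.append(self.io_queue.popleft())
        else:
            raise ValueError(f"unknown transaction kind: {kind}")
        # Tag memory keeps a permission-status entry for every routed txn.
        self.tag_memory[txn_id] = "pending"
```

Note that only coherency transactions are address-translated in this sketch; I/O transactions keep their local address, mirroring the two distinct routes named in the claim.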
11. In a multi-processor computer system having multiple nodes, each node having a block of main memory and multiple microprocessors, and each node having a global interface which incorporates a home agent, a slave agent and a request agent, a method for providing the orderly flow of memory-request and request-compliance traffic between nodes without resorting to a complex flow control protocol, said method comprising the steps of:
identifying a number y, which represents the maximum number of incomplete transaction requests that any single node may have outstanding, the number y limited to a certain, determinable finite number;
multiplying the number y by the number n, which represents the number of nodes within the computer system;
providing temporary storage at a buffer of the global interface for at least a number ny of requests at the home agent of each node so that pending requests received by that home agent may be stored until it is able to process them;
processing the requests stored at the temporary storage, provided during said step of providing, at the microprocessor;
maintaining a status indicator at each node for each received request once processing of that request begins, indicating whether processing of the request is complete or still pending;
transferring stored requests as the requests stored during said step of providing are processed;
receiving cache coherency transactions and input/output transactions from the multiple microprocessors;
routing said cache coherency transactions through a local physical address-to-global address translator to the request agent;
routing said input/output transactions through an I/O input queue to the request agent; and
storing a permission status entry for each of said routed cache coherency transactions and said routed input/output transactions.
- View Dependent Claims (12, 13, 14, 15, 16, 17)