Multiprocessor system having distributed shared memory and instruction scheduling method used in the same system
First Claim
1. A multiprocessor system having distributed shared memory, comprising a plurality of nodes connected via an inter-node interface, each of the nodes including at least one processor, a main storage controller, and a main storage belonging to the distributed shared memory with an individual memory address, wherein each processor comprises instruction executing means for executing an NUMA prefetch instruction, the instruction executing means includes an address determiner which terminates the processing of the NUMA prefetch instruction without conducting a prefetch operation if an address specified in the NUMA prefetch instruction is associated with a local node, to which the processor belongs, and which issues, only if the address is associated with a remote node other than the local node, a prefetch request to a main storage controller of the local node.
Abstract
In a multiprocessor system that performs processing called NUMA prefetch, when a prefetch instruction is issued to a prefetch unit, an address converter converts the address specified by an operand of the instruction into a physical address. A prefetch type determiner determines whether the instruction is an NUMA prefetch instruction or a conventional prefetch instruction. If the instruction is an NUMA prefetch instruction, an address determiner determines whether the physical address is a local address or a remote address. If the address is a local address, the processing of the prefetch instruction is terminated. If the address is a remote address, a cache tag checker checks a cache. When a cache hit occurs, the processing is terminated. When a cache miss occurs, a prefetch request is issued to a main storage controller. As a result, data is prefetched from a remote main storage to a cache in a local main storage.
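The decision flow in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `PrefetchUnit`, the identity address translation, the contiguous per-node partitioning of physical memory, the 64-byte line size, and the handling of a conventional prefetch (going straight to the cache tag check) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PrefetchUnit:
    """Toy model of the prefetch decision flow (all names are hypothetical)."""
    local_node: int            # node this processor belongs to
    node_size: int             # assumed: each node owns one contiguous physical range
    line_size: int = 64        # assumed cache line size in bytes
    cache_tags: set = field(default_factory=set)   # line numbers present in the cache
    issued: list = field(default_factory=list)     # requests sent to the main storage controller

    def translate(self, vaddr: int) -> int:
        # Address converter; an identity mapping is assumed for simplicity.
        return vaddr

    def home_node(self, paddr: int) -> int:
        # Address determiner; assumes node k owns [k * node_size, (k + 1) * node_size).
        return paddr // self.node_size

    def prefetch(self, vaddr: int, numa: bool) -> str:
        paddr = self.translate(vaddr)
        # Prefetch type determiner: only a NUMA prefetch consults the address determiner.
        if numa and self.home_node(paddr) == self.local_node:
            return "terminated: local address"
        # Cache tag checker.
        line = paddr // self.line_size
        if line in self.cache_tags:
            return "terminated: cache hit"
        # Cache miss: issue a request to the local main storage controller.
        self.issued.append(paddr)
        self.cache_tags.add(line)  # model the eventual cache fill
        return "request issued"
```

For example, with `unit = PrefetchUnit(local_node=0, node_size=1 << 20)`, a NUMA prefetch of address `0x100` is dropped as local, while a NUMA prefetch of `(1 << 20) + 0x100` issues a request the first time and is dropped as a cache hit the second time.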
4 Claims
-
1. A multiprocessor system having distributed shared memory, comprising a plurality of nodes connected via an inter-node interface, each of the nodes including at least one processor, a main storage controller, and a main storage belonging to the distributed shared memory with an individual memory address, wherein
each processor comprises instruction executing means for executing an NUMA prefetch instruction, the instruction executing means includes an address determiner which terminates the processing of the NUMA prefetch instruction without conducting a prefetch operation if an address specified in the NUMA prefetch instruction is associated with a local node, to which the processor belongs, and which issues, only if the address is associated with a remote node other than the local node, a prefetch request to a main storage controller of the local node.
-
3. An instruction scheduling method for use in a multiprocessor system having distributed shared memory and comprising a plurality of nodes connected via an inter-node interface, each of the nodes including at least one processor each of which includes a first level cache, a main storage controller, a main storage belonging to the distributed shared memory with an individual memory address, and a second level cache, the method being used in each of the processors, wherein
each of the processors issues a first prefetch instruction specifying an address block containing data to be used by the processor, the first prefetch instruction indicating that when the address specifies a local node to which the processor belongs, the processing of the first prefetch instruction is terminated without conducting a prefetch operation, and only when the address specifies a remote node other than the local node, the data is prefetched from the remote node to the second level cache of the local node, and the processor then issues a second prefetch instruction indicating an address block equal to the address block specified in the first prefetch instruction, the second prefetch instruction indicating that the data is prefetched to the first level cache.
-
4. An instruction scheduling method for use in a multiprocessor system having distributed shared memory and comprising a plurality of nodes connected via an inter-node interface, the node including one or more processors each of which includes a first level cache, a main storage controller, a main storage, and a second level cache, the method being used in each of the processors,
wherein each of the processors issues a first prefetch instruction specifying an address block containing data to be used by the processor, the first prefetch instruction indicating that when the address specifies a local node, a prefetch operation is not conducted, and only when the address specifies a remote node, the data is prefetched from the remote node to the second level cache of the local node, wherein the processor then issues a second prefetch instruction indicating an address block equal to the address block specified in the first prefetch instruction, the second prefetch instruction indicating that the data is prefetched to the first level cache, and wherein the processor issues, after the first prefetch instruction for the address block, the second prefetch instruction for the address block when a period of time has fully elapsed for termination of execution of the first prefetch instruction.
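The two-stage ordering recited in claims 3 and 4 can be pictured as a software-pipelined loop over address blocks. The following is a minimal sketch, assuming time is measured in loop steps; the function name `schedule_prefetches`, the parameters `remote_lead` and `l1_lead`, and the operation labels are hypothetical and do not come from the patent.

```python
def schedule_prefetches(num_blocks: int, remote_lead: int, l1_lead: int) -> list:
    """Return, per loop step, the operations issued in that step.

    remote_lead: steps before use at which the first (NUMA, to L2) prefetch is issued.
    l1_lead:     steps before use at which the second (to L1) prefetch is issued.
    The gap remote_lead - l1_lead stands in for the period allowed for the
    first prefetch to finish before the second one is issued (an assumption).
    """
    if remote_lead <= l1_lead:
        raise ValueError("the first prefetch must be issued earlier than the second")
    schedule = []
    # Start before step 0 so the earliest blocks are prefetched ahead of their use.
    for step in range(-remote_lead, num_blocks):
        ops = []
        if 0 <= step + remote_lead < num_blocks:
            ops.append(("numa_prefetch_l2", step + remote_lead))  # first instruction
        if 0 <= step + l1_lead < num_blocks:
            ops.append(("prefetch_l1", step + l1_lead))           # second instruction
        if step >= 0:
            ops.append(("use", step))                             # actual access
        schedule.append(ops)
    return schedule
```

Calling `schedule_prefetches(3, remote_lead=2, l1_lead=1)` yields a schedule in which, for every block, the NUMA prefetch precedes the first-level prefetch, which in turn precedes the use of the block. In a real compiler the lead distances would be derived from measured remote and local memory latencies rather than fixed by hand.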
Specification