Remote Asynchronous Data Mover
First Claim
1. A data processing system having a first processing node comprising:
at least one processor which executes processes of a first task of multiple tasks within a parallel job, including at least one second task executing on a first remote processing node;
a memory communicatively coupled to the at least one processor and having a plurality of physical storage locations having associated real addresses (RAs) for storing data;
wherein one or more effective addresses of a process of the first task executing on the at least one processor are mapped to an RA of at least one physical storage location in a first remote memory of the first remote processing node;
a remote asynchronous data mover (RADM Mover) associated with the at least one processor and which, when triggered by processor execution of a remote asynchronous data move (RADM) instruction/request that moves data to/from a first effective address (EA) that is memory mapped to an RA of a physical storage location in the remote memory, performs the following function:
initiate a remote asynchronous data move (RADM) operation that moves a copy of the data from/to the first remote memory by:
(a) retrieving from a global resource manager (GRM) identifying information indicating which remote processing node among multiple remote processing nodes has the first remote memory in which the EA of the RADM instruction is mapped to an RA;
(b) completing a user-level virtual move of data at the at least one processor by utilizing a source EA and a destination EA within the RADM instruction; and
(c) triggering a completion of a physical move of the data at the first remote memory utilizing network interface controllers of the first remote processing node and a second node involved with the RADM operation, wherein the first remote node and second node are identified by respective node IDs (NIDs) retrieved from the GRM, and wherein a physical move of the copy of the data occurs concurrently with other ongoing processing on the at least one processor.
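Steps (a) through (c) of the claim can be illustrated with a minimal software sketch. This is not the claimed hardware and is not taken from the specification: the names `GlobalResourceManager`, `Node`, `radm_move`, the `PAGE` granularity, and the use of a background thread as a stand-in for the network interface controllers are all assumptions made for illustration. The sketch also assumes a move does not cross a mapping boundary.

```python
import threading
from dataclasses import dataclass, field

PAGE = 0x1000  # hypothetical mapping granularity (assumption)


@dataclass
class GlobalResourceManager:
    """Hypothetical GRM stand-in: maps an EA page to (node ID, RA page base)."""
    table: dict = field(default_factory=dict)

    def lookup(self, ea):
        nid, ra_base = self.table[ea // PAGE]
        return nid, ra_base + ea % PAGE


class Node:
    """Hypothetical processing node with a byte-addressed memory keyed by RA."""
    def __init__(self, nid):
        self.nid = nid
        self.memory = {}


def radm_move(grm, nodes, src_ea, dst_ea, length):
    # (a) Ask the GRM which nodes back the source and destination EAs.
    src_nid, src_ra = grm.lookup(src_ea)
    dst_nid, dst_ra = grm.lookup(dst_ea)

    # (b) The "virtual move" completes at the caller: the EAs are resolved
    # and the call returns a handle at once instead of waiting for the copy.
    done = threading.Event()

    # (c) The physical copy runs in the background; the thread stands in
    # for the NICs of the two nodes identified by their node IDs.
    def physical_move():
        src, dst = nodes[src_nid], nodes[dst_nid]
        for i in range(length):
            dst.memory[dst_ra + i] = src.memory.get(src_ra + i, 0)
        done.set()

    threading.Thread(target=physical_move).start()
    return done


# Usage: move 5 bytes from node 1 to node 2 while the caller keeps working.
grm = GlobalResourceManager({0x10: (1, 0x8000), 0x20: (2, 0x4000)})
nodes = {1: Node(1), 2: Node(2)}
for i, b in enumerate(b"hello"):
    nodes[1].memory[0x8000 + i] = b
handle = radm_move(grm, nodes, src_ea=0x10000, dst_ea=0x20000, length=5)
# ... other processing could continue here ...
handle.wait()
```

The returned event plays the role of a completion indication: the caller is free to continue after `radm_move` returns and only blocks if it needs the moved data.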
Abstract
A distributed data processing system executes multiple tasks within a parallel job, including a first local task on a local node and at least one task executing on a remote node, with a remote memory having real address (RA) locations mapped to one or more of the source effective addresses (EA) and destination EA of a data move operation initiated by a task executing on the local node. On initiation of the data move operation, remote asynchronous data move (RADM) logic identifies that the operation moves data to/from a first EA that is memory mapped to an RA of the remote memory. The local processor/RADM logic initiates a RADM operation that moves a copy of the data directly from/to the first remote memory by completing the RADM operation using the network interface cards (NICs) of the source and destination processing nodes, determined by accessing a data center for the node IDs of remote memory.
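The abstract's first step, identifying that a move targets an EA backed by remote memory, can be sketched as a range lookup. This is only an illustration under stated assumptions: `Mapping`, `EAMap`, `is_remote_move`, and `LOCAL_NID` are hypothetical names, and the abstract does not say how the RADM logic stores or searches its mappings.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional

LOCAL_NID = 0  # hypothetical node ID of the local node (assumption)


class Mapping(NamedTuple):
    ea_start: int  # first effective address of the range
    length: int    # size of the range in bytes
    nid: int       # node ID that owns the backing memory
    ra_start: int  # real address backing ea_start on that node


class EAMap:
    """Hypothetical table of non-overlapping EA ranges, searched by bisection."""
    def __init__(self, mappings):
        self._maps = sorted(mappings)
        self._starts = [m.ea_start for m in self._maps]

    def resolve(self, ea) -> Optional[Mapping]:
        i = bisect_right(self._starts, ea) - 1
        if i >= 0:
            m = self._maps[i]
            if ea < m.ea_start + m.length:
                return m
        return None


def is_remote_move(ea_map, src_ea, dst_ea):
    """Return True when either endpoint is backed by memory on another node."""
    endpoints = [ea_map.resolve(src_ea), ea_map.resolve(dst_ea)]
    if any(m is None for m in endpoints):
        raise ValueError("unmapped effective address")
    return any(m.nid != LOCAL_NID for m in endpoints)


# Usage: one local range and one range backed by node 3.
ea_map = EAMap([
    Mapping(0x1000, 0x1000, LOCAL_NID, 0x9000),
    Mapping(0x4000, 0x2000, 3, 0x2000),
])
print(is_remote_move(ea_map, 0x1000, 0x4800))
print(is_remote_move(ea_map, 0x1000, 0x1800))
```

In this sketch a move is treated as remote if either the source or the destination resolves to a node other than the local one, matching the abstract's "to/from" wording.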
20 Claims
1. A data processing system having a first processing node, as set out in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. In a data processing system having at least one processor which executes processes of a first task of multiple tasks within a parallel job, including at least one second task executing on a first remote processing node and a memory communicatively coupled to the at least one processor and having a plurality of physical storage locations having associated real addresses (RAs) for storing data, wherein one or more effective addresses of a process of the first task executing on the at least one processor are mapped to an RA of at least one physical storage location in a first remote memory of the first remote processing node, a method comprising:
receiving information related to a remote asynchronous data move (RADM) instruction/request that moves data to/from a first effective address (EA) that is memory mapped to an RA of a physical storage location in the remote memory; and
initiating a remote asynchronous data move (RADM) operation that moves a copy of the data from/to the first remote memory by:
(a) retrieving from a global resource manager (GRM) identifying information indicating which remote processing node among multiple remote processing nodes has the first remote memory in which the EA of the RADM instruction is mapped to an RA;
(b) completing a user-level virtual move of data at the at least one processor by utilizing a source EA and a destination EA within the RADM instruction; and
(c) triggering a completion of a physical move of the data at the first remote memory utilizing network interface controllers of the first remote processing node and a second node involved with the RADM operation, wherein the first remote node and second node are identified by respective node IDs (NIDs) retrieved from the GRM, and wherein a physical move of the copy of the data occurs concurrently with other ongoing processing on the at least one processor.
Dependent claims: 10, 11, 12, 13, 14, 15, 18, 20
16. An article of manufacture embodied as a computer program product with program code configured for execution within a data processing system having at least one processor which executes processes of a first task of multiple tasks within a parallel job, including at least one second task executing on a first remote processing node and a memory communicatively coupled to the at least one processor and having a plurality of physical storage locations having associated real addresses (RAs) for storing data, wherein one or more effective addresses of a process of the first task executing on the at least one processor are mapped to an RA of at least one physical storage location in a first remote memory of the first remote processing node, the program code comprising code for:
receiving, at a remote asynchronous data mover (RADM Mover) associated with the at least one processor, information related to a remote asynchronous data move (RADM) instruction/request that moves data to/from a first effective address (EA) that is memory mapped to an RA of a physical storage location in the remote memory; and
initiating a remote asynchronous data move (RADM) operation that moves a copy of the data from/to the first remote memory by:
(a) retrieving from a global resource manager (GRM) identifying information indicating which remote processing node among multiple remote processing nodes has the first remote memory in which the EA of the RADM instruction is mapped to an RA;
(b) completing a user-level virtual move of data at the at least one processor by utilizing a source EA and a destination EA within the RADM instruction; and
(c) triggering a completion of a physical move of the data at the first remote memory utilizing network interface controllers of the first remote processing node and a second node involved with the RADM operation, wherein the first remote node and second node are identified by respective node IDs (NIDs) retrieved from the GRM, and wherein a physical move of the copy of the data occurs concurrently with other ongoing processing on the at least one processor.
Dependent claims: 17, 19
Specification