Memory-aware scheduling for NUMA architectures
Abstract
A topology reader may determine a topology of a Non-Uniform Memory Access (NUMA) architecture including a number of, and connections between, a plurality of sockets, each socket including one or more cores and at least one memory configured to execute a plurality of threads of a software application. A core list generator may generate, for each designated core of the NUMA architecture, and based on the topology, a proximity list listing non-designated cores in an order corresponding to a proximity of the non-designated cores to the designated core. A core selector may determine, at a target core and during the execution of the plurality of threads, that the target core is executing an insufficient number of the plurality of threads, and may select a source core at the target core, according to the proximity list associated therewith, for subsequent transfer of a transferred thread from the selected source core to the target core for execution thereon.
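As a rough illustration of how a proximity list might be derived from the topology, the sketch below orders the non-designated cores by the number of socket-to-socket hops separating them from the designated core. Several things here are assumptions and not taken from the patent text: the hop count stands in for transfer latency, the topology is given as a core-to-socket map plus a socket adjacency map, every socket is reachable from every other, and the names `socket_hops`, `build_proximity_lists`, `core_socket`, and `links` are hypothetical.

```python
from collections import deque


def socket_hops(links, start):
    """Breadth-first search: hop count from `start` to every reachable socket."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        socket = queue.popleft()
        for neighbor in links.get(socket, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[socket] + 1
                queue.append(neighbor)
    return hops


def build_proximity_lists(core_socket, links):
    """Map each designated core to all other cores, nearest first.

    Assumption: fewer socket hops means lower latency. Ties are broken
    by core id so the ordering is deterministic.
    """
    hops_from = {s: socket_hops(links, s) for s in set(core_socket.values())}
    lists = {}
    for core, socket in core_socket.items():
        others = [c for c in core_socket if c != core]
        others.sort(key=lambda c: (hops_from[socket][core_socket[c]], c))
        lists[core] = others
    return lists


# Hypothetical topology: four sockets in a ring (0-1-2-3-0), two cores each.
core_socket = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
links = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

proximity = build_proximity_lists(core_socket, links)
print(proximity[0])  # [1, 2, 3, 6, 7, 4, 5]
```

In this example, core 0 lists its same-socket sibling first, then the cores on the two directly connected sockets, and finally the cores on the socket two hops away. A real system would likely weight links by measured latency instead of counting hops.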
20 Claims
1. A system including instructions recorded on a non-transitory computer-readable medium, the system comprising:
- a core list generator configured to generate, for each designated core of a Non-Uniform Memory Access (NUMA) architecture, and based on a topology of the NUMA architecture, a proximity list listing non-designated cores in an order corresponding to latencies for transferring state information stored in association with threads to the non-designated cores from the designated core, the core list generator being configured to generate the proximity list by taking into account a number of, and connections between, a plurality of sockets included in the NUMA architecture, each socket including one or more cores; and
- a core selector configured to determine, at a target core and during execution of a plurality of threads, that the target core is executing an insufficient number of the plurality of threads, and further configured to select a source core at the target core, according to the proximity list associated therewith, for subsequent transfer of a transferred thread and the transferred thread's associated state information from the selected source core to the target core for execution thereon.
16. A computer program product, the computer program product being tangibly embodied on a non-transitory computer-readable storage medium and comprising instructions that, when executed, are configured to:
- generate, for each designated core of a Non-Uniform Memory Access (NUMA) architecture, and based on a topology of the NUMA architecture, a number of, and connections between, a plurality of sockets, each socket including one or more cores, a proximity list listing non-designated cores in an order corresponding to latencies for transferring state information stored in association with threads to the non-designated cores from the designated core;
- determine, at a target core and during the execution of a plurality of threads, that the target core is executing an insufficient number of the plurality of threads;
- select a source core at the target core, according to the proximity list associated therewith; and
- transfer a transferred thread and the transferred thread's associated state information from the selected source core to the target core for execution thereon.
20. A computer-implemented method comprising:
- generating, for each designated core of a Non-Uniform Memory Access (NUMA) architecture, and based on a topology of the NUMA architecture, a number of, and connections between, a plurality of sockets, each socket including one or more cores, a proximity list listing non-designated cores in an order corresponding to latencies for transferring state information stored in association with threads to the non-designated cores from the designated core;
- determining, at a target core and during execution of a plurality of threads, that the target core is executing an insufficient number of the plurality of threads;
- selecting a source core at the target core, according to the proximity list associated therewith; and
- transferring a transferred thread and state information associated with the transferred thread from the selected source core to the target core for execution thereon.
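The determining, selecting, and transferring steps recited above can be pictured as a form of work stealing guided by the proximity list. The following is a minimal sketch under stated assumptions, not the claimed implementation: each core's pending threads are modeled as a `deque`, "insufficient" is taken to mean fewer than `min_threads` queued threads, a source core is eligible only if it holds more than one thread, and the names `Thread`, `select_source`, `transfer_thread`, and `run_queues` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Thread:
    name: str
    # State information that travels with the thread when it is moved.
    state: dict = field(default_factory=dict)


def select_source(target, run_queues, proximity, min_threads=1):
    """Return the nearest core with spare work, or None.

    None is returned when the target already has enough threads or when
    no core on its proximity list has more than one thread queued.
    """
    if len(run_queues[target]) >= min_threads:
        return None
    for candidate in proximity[target]:  # nearest cores are tried first
        if len(run_queues[candidate]) > 1:
            return candidate
    return None


def transfer_thread(source, target, run_queues):
    """Move one thread, together with its state, from source to target."""
    thread = run_queues[source].pop()
    run_queues[target].append(thread)
    return thread


# Hypothetical four-core system; proximity lists are given, nearest first.
proximity = {0: [1, 2, 3], 1: [0, 2, 3], 2: [3, 0, 1], 3: [2, 0, 1]}
run_queues = {
    0: deque(),
    1: deque([Thread("t1")]),
    2: deque([Thread("t2", {"partial_sum": 10}), Thread("t3", {"partial_sum": 7})]),
    3: deque([Thread("t4"), Thread("t5"), Thread("t6")]),
}

source = select_source(0, run_queues, proximity)
moved = transfer_thread(source, 0, run_queues) if source is not None else None
print(source, moved.name, moved.state)  # 2 t3 {'partial_sum': 7}
```

Here core 0 is idle, skips core 1 because that core has only a single thread, and takes a thread from core 2 even though core 3 has more work, since core 2 appears earlier on core 0's proximity list. Preferring the nearer core keeps the cost of moving the thread's state low, which is the trade-off the claims emphasize.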
Specification