Methods and apparatus for optimal OpenMP application performance on Hyper-Threading processors
Abstract
Methods and apparatus for optimal OpenMP application performance on Hyper-Threading processors are disclosed. For example, an OpenMP runtime library is provided for use in a computer having a plurality of processors, each architecturally designed with a plurality of logical processors and with Hyper-Threading enabled. The example OpenMP runtime library is adapted to determine the number of application threads requested by an application and to assign affinity to each application thread if the total number of executing threads is not greater than the number of physical processors. A global status indicator may be utilized to coordinate the assignment of the application threads.
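The condition in the abstract can be sketched as a single predicate. This is an illustration only, not the patented runtime: the function name and parameters are hypothetical, and it assumes that the "total number of executing threads" is the sum of the threads already running and the threads newly requested.

```python
def should_assign_affinity(executing_threads: int,
                           requested_threads: int,
                           physical_processors: int) -> bool:
    """Return True when every thread could get a physical processor to itself.

    Assumption: the total is the threads already executing plus the threads
    the application has just requested.
    """
    total_threads = executing_threads + requested_threads
    return total_threads <= physical_processors


# Two physical processors, each with two logical processors.
print(should_assign_affinity(0, 2, 2))  # threads fit: assign affinity
print(should_assign_affinity(1, 2, 2))  # oversubscribed: leave placement to the OS
```

When the predicate is false, the sketch simply declines to pin anything, which matches the abstract's wording that affinity is assigned only under the stated condition.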
52 Citations
27 Claims
1. A method for assigning OpenMP software application threads executed by multiple physical processors, each physical processor having at least two logical processors, the method comprising:
maintaining a global thread count, wherein the global thread count is adapted to reflect the number of active threads being executed by the multiple physical processors;
executing an application parallel region, wherein the application parallel region comprises a plurality of OpenMP software application threads; and
assigning affinity to each of the plurality of OpenMP software application threads if the global thread count is not greater than the number of physical processors, whereby each of the physical processors executes no more than one of the plurality of OpenMP software application threads.
Dependent claims: 2, 3, 4, 5, 6.
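One way to picture the assigning step of claim 1 is a placement table that gives each thread a logical processor on a different physical processor. The sketch below is hypothetical: the `topology` mapping (physical processor id to its logical processor ids) and the choice of the first logical processor on each package are assumptions made for illustration, not details taken from the claim.

```python
from typing import Dict, List, Optional


def place_threads(num_threads: int,
                  global_thread_count: int,
                  topology: Dict[int, List[int]]) -> Optional[Dict[int, int]]:
    """Map thread index -> logical CPU, one thread per physical processor.

    Returns None when the condition of claim 1 does not hold, meaning no
    affinity is assigned and the operating system schedules freely.
    """
    physical_count = len(topology)
    if max(global_thread_count, num_threads) > physical_count:
        return None
    packages = sorted(topology)
    # Assumption: pin to the first logical CPU of each physical processor.
    return {tid: topology[packages[tid]][0] for tid in range(num_threads)}


# Hypothetical machine: 2 physical processors, 2 logical processors each.
topology = {0: [0, 1], 1: [2, 3]}
print(place_threads(2, 2, topology))  # {0: 0, 1: 2}
print(place_threads(2, 3, topology))  # None
```

Without such pinning, two threads could land on logical CPUs 0 and 1, which share one physical processor, while the other physical processor sits idle.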
7. A method for assigning OpenMP software application threads executed by multiple physical processors, each physical processor having at least two logical processors, the method comprising:
maintaining a global thread count, wherein the global thread count is adapted to reflect the number of active threads being executed by the multiple physical processors;
initializing an application parallel region, wherein the application parallel region comprises a plurality of OpenMP software application threads;
updating the global thread count to reflect the addition of the plurality of OpenMP software application threads;
assigning affinity to each of the plurality of OpenMP software application threads if the global thread count is not greater than the number of physical processors, whereby each physical processor is assigned no more than one of the plurality of OpenMP software application threads;
executing the application parallel region on the physical processors;
terminating the execution of the application parallel region; and
updating the global thread count to reflect the termination of the plurality of OpenMP software application threads.
Dependent claims: 8, 9, 10, 11, 12, 13.
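The sequence in claim 7 (update the count on entry, decide on affinity, execute, update the count on exit) can be sketched with a lock-protected counter. The class and method names are hypothetical, and the use of a mutex is an assumption; the claim does not say how the global thread count is synchronized.

```python
import threading
from contextlib import contextmanager


class GlobalThreadCount:
    """Hypothetical shared counter of active application threads."""

    def __init__(self, physical_processors: int) -> None:
        self.physical_processors = physical_processors
        self._count = 0
        self._lock = threading.Lock()  # assumption: a simple mutex guards the count

    @property
    def value(self) -> int:
        with self._lock:
            return self._count

    @contextmanager
    def parallel_region(self, num_threads: int):
        # Entry: add the region's threads, then test the affinity condition.
        with self._lock:
            self._count += num_threads
            assign_affinity = self._count <= self.physical_processors
        try:
            yield assign_affinity
        finally:
            # Exit: remove the region's threads from the count.
            with self._lock:
                self._count -= num_threads


counter = GlobalThreadCount(physical_processors=2)
with counter.parallel_region(2) as pin_outer:
    print(pin_outer, counter.value)      # True 2
    with counter.parallel_region(1) as pin_inner:
        print(pin_inner, counter.value)  # False 3
print(counter.value)                     # 0
```

The `finally` clause keeps the count accurate even if the region body raises, so a later region is not wrongly denied affinity.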
14. For use in a computer having a plurality of physical processors executing an application having at least one region comprising a plurality of application threads, an apparatus comprising:
a global thread counter, wherein the global thread counter is adapted to reflect the number of application threads being executed by the plurality of physical processors;
a plurality of logical processors, wherein each of the plurality of physical processors comprises at least two logical processors;
an OpenMP runtime library responsive to the execution of the plurality of application threads, the OpenMP runtime library adapted to update the global thread counter with a count of the number of application threads being executed by the plurality of physical processors, and the OpenMP runtime library adapted to assign physical processor affinity to each of the number of application threads being executed by the plurality of physical processors, if the number of application threads being executed by the plurality of physical processors is not greater than the number of physical processors.
Dependent claims: 15, 16, 17, 18.
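Claim 14 speaks of "physical processor affinity" rather than affinity to one logical processor. One possible reading, offered here only as an assumption, is an affinity bitmask that covers every logical processor of a single physical processor, so a thread may run on any of that package's logical processors but on no other package. The helper names and the `topology` mapping below are hypothetical.

```python
from typing import Dict, List, Optional


def package_mask(logical_cpus: List[int]) -> int:
    """Build a bitmask with one bit set per logical CPU of a physical processor."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask


def assign_physical_affinity(num_threads: int,
                             topology: Dict[int, List[int]]) -> Optional[Dict[int, int]]:
    """Map thread index -> bitmask of one physical processor's logical CPUs.

    Returns None when there are more threads than physical processors.
    """
    if num_threads > len(topology):
        return None
    packages = sorted(topology)
    return {tid: package_mask(topology[packages[tid]]) for tid in range(num_threads)}


topology = {0: [0, 1], 1: [2, 3]}
masks = assign_physical_affinity(2, topology)
print({tid: bin(m) for tid, m in masks.items()})  # {0: '0b11', 1: '0b1100'}
```

On a real system such a mask would be handed to an operating-system affinity call; that step is left out here because it is platform-specific.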
19. A computer-readable storage medium containing a set of instructions for a general purpose computer comprising a plurality of physical processors, each physical processor comprising a plurality of logical processors, and a user interface comprising a mouse and a screen display, the set of instructions comprising:
an OpenMP runtime routine operatively associated with the plurality of physical processors to execute a plurality of application instruction threads on the plurality of logical processors, wherein each of the plurality of physical processors executes one application instruction thread if the number of application instruction threads is not greater than the number of physical processors.
Dependent claims: 20, 21, 22, 23, 24.
25. An apparatus comprising:
an input device;
an output device;
a memory; and
a plurality of physical processors, each having a plurality of logical processors, the plurality of physical processors cooperating with the input device, the output device and the memory to substantially simultaneously execute a plurality of application threads on separate physical processors when the number of executing application threads is not greater than the number of physical processors.
Dependent claims: 26, 27.
Specification