Apparatus and method for starvation load balancing using a global run queue in a multiple run queue system
First Claim
1. A method of balancing workload among a plurality of processors in a multiple processor system, the multiple processor system having a plurality of local run queues, each of the plurality of processors being associated with at least one of the plurality of local run queues, the method comprising:
- assigning a thread to one of the plurality of local run queues;
time stamping the thread to produce a time stamp;
determining if a local run queue of the plurality of local run queues to which the thread is assigned has a threshold number of threads present in the local run queue;
determining whether a difference between a current time and the time stamp is larger than a threshold amount, if it is determined that the local run queue has the threshold number of threads present in the local run queue; and
reassigning the thread to a global run queue if the difference between the current time and the time stamp is larger than the threshold amount.
1 Assignment
0 Petitions
Accused Products
Abstract
Apparatus and methods for starvation load balancing using a global run queue in a multiple run queue system. The apparatus includes a controller, memory, initial load balancing device, idle load balancing device, periodic load balancing device, and starvation load balancing device. The apparatus performs initial load balancing, idle load balancing, periodic load balancing and starvation load balancing to ensure that the workloads for the processors of the system are optimally balanced.
128 Citations
36 Claims
-
1. A method of balancing workload among a plurality of processors in a multiple processor system, the multiple processor system having a plurality of local run queues, each of the plurality of processors being associated with at least one of the plurality of local run queues, the method comprising:
-
assigning a thread to one of the plurality of local run queues;
time stamping the thread to produce a time stamp;
determining if a local run queue of the plurality of local run queues to which the thread is assigned has a threshold number of threads present in the local run queue;
determining whether a difference between a current time and the time stamp is larger than a threshold amount, if it is determined that the local run queue has the threshold number of threads present in the local run queue; and
reassigning the thread to a global run queue if the difference between the current time and the time stamp is larger than the threshold amount. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
obtaining a lock of a local run queue to which the thread is assigned;
obtaining a lock of the thread; and
changing a run queue pointer of the thread to identify the global run queue.
-
-
4. The method of claim 1, further comprising dispatching the thread to a next available processor of the plurality of processors.
-
5. The method of claim 1, wherein only unbound threads are reassigned to the global run queue.
-
6. The method of claim 1, wherein each of the plurality of processors services the global run queue.
-
7. The method of claim 1, wherein assigning the thread to one of the plurality of local run queues includes:
-
determining if the thread is bound or unbound;
if the thread is bound, assigning the thread to a local run queue associated with the processor to which the thread is bound; and
if the thread is unbound, performing initial load balancing to assign the thread.
-
-
8. The method of claim 7, wherein initial local balancing includes:
-
performing a search of each of the processors to find an idle processor; and
if an idle processor is found, assigning the thread to a local run queue associated with the idle processor.
-
-
9. The method of claim 8, wherein initial load balancing further includes:
if an idle processor is not found, assigning the thread to the global run queue.
-
10. The method of claim 8, wherein if the the read is unbound and is part of an existing process, the search is limited to processors that are associated with a node to which other threads of the existing process are assigned.
-
11. A computer program product in a computer readable medium for balancing workload amoung a plurality of processors in a multiple processor system, the multiple processor system having a plurality of local run queues, each of the plurality of processors being associated with at least one of the plurality of local run queues, the computer program product comprising:
-
first instructions for assigning a thread to one of the plurality of local run queues;
second instructions for time stamping the thread to produce a time stamp;
third instructions for determining if a local run queue of the plurality of local run queues to which the thread is assigned has a threshold number of threads present in the local run queue;
forth instructions for determining whether a difference between a current time and the time stamp is larger than a threshold amount, if it is determined that the local run queue has the threshold number of threads present in the local run queue; and
fifth instructions for reassigning the thread to a global run queue if the difference between the current time and the time stamp is larger than the threshold amount. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18)
instructions for obtaining a lock of a local run queue to which the thread is assigned;
instructions for obtaining a lock of the thread; and
instructions for changing a run queue pointer of the thread to identify the global run queue.
-
-
14. The computer program product of claim 11, further comprising instructions for dispatching the thread by the next available processor.
-
15. The computer program product of claim 11, wherein the fifth instructions include:
-
instructions for determining if the thread is bound or unbound;
instructions for assigning the thread to a local run queue associated with the processor to which the thread is bound, if the thread is bound; and
instructions for preforming initial load balancing to assign the thread, if the thread is unbound.
-
-
16. The computer program product of claim 15, wherein the instructions for performing initial load balancing include:
-
instructions for performing a search of each of the processors to find an idle processor; and
instructions for assigning the thread to a local run queue associated with the idle processor, if an idle processor if found.
-
-
17. The computer program product of claim 16, wherein the instructions for performing initial load balancing further include:
instructions for assigning the thread to the global run queue, if an idle processor is not found.
-
18. The computer program product of claim 16, wherein if the thread is unbound and is part of an existing process, the search is limited to processors that are associated with a node to which other threads of the existing process are assigned.
-
19. A method of balancing workload amoung a plurality of processors in a multiple processor system, the multiple processor system including a plurality of local run queues and at least one global run queue comprising:
-
assigning threads to the plurality of local run queues;
determining if a local run queue of the plurality of local run queues has a threshold number of threads present in the local run queues;
determining if one or more threads in the local run queue have not been dispatched within a predetermined period of time, if it is determined that the local run queue has the threshold number of threads present in the local run queue; and
moving the one or more threads from the local run queues to the at least one global run queue, if it is determined that the one or more threads have not been dispatched within the predetermined period of time. - View Dependent Claims (20, 21, 22, 23, 24)
obtaining a lock of a local run queue to which the one or more threads are assigned;
obtaining a lock of the one or more threads; and
changing a run queue pointer of the one or more threads to identify the at least one global run queue.
-
-
23. The method of claim 19, further comprising dispatching the one or more threads by a next available processor in the plurality of processors.
-
24. The method of claim 19, wherein only unbound threads are moved to the at least one global run queue.
-
25. A workload balancing apparatus for balancing workload amoung a plurality of processors in a multiple processor system, the multiple processor system having a plurality of local run queues, each of the plurality of processors being associated with at least one of the plurality of local run queues, the apparatus comprising:
-
an initial load balancing device; and
a starvation load balancing device, wherein the initial load balancing device assigns a thread to one of the plurality of local run queues, and wherein the starvation load balancing device determines if a local run queue of the plurality of local run queues to which the thread is assigned has a threshold number of threads present in the local run queue, determines whether the thread has not been dispatched within a determined period of time, if it is determined that the local run queue has the threshold number of threads present in the local run queue, and reassigns the thread to a global run queue if the thread has not been dispatched within the determined period of time. - View Dependent Claims (26, 27, 28, 29, 30)
obtaining a lock of a local run queue to which the thread is assigned;
obtaining a lock of the thread; and
changing a run queue pointer of the thread to identify the global run queue.
-
-
29. The apparatus of claim 25, wherein the thread is dispatched by a next available processor of the plurality of processors.
-
30. The apparatus of claim 25, wherein the starvation load balancing device only reassigns unbound threads to the global run queue.
-
31. A multiple processor system, comprising:
-
a plurality of processors; and
a dispatcher, wherein the plurality of processors are organized into at least one node, and wherein the at last one node has an associated global run queue and each of the plurality of processors has an associated local run queue, and wherein the dispatcher assigns threads to the plurality of local run queue, determines if local run queues of the plurality of local run queues to which threads are assigned have a threshold number of threads present, determines whether threads in the local run queues have not been dispatched within a determined period of time, if it is determined that the local run queues have the threshold number of threads present, and reassigns the threads to a global run queue if the threads have not been dispatched within the determined period of time. - View Dependent Claims (32, 33, 34, 35, 36)
obtaining a lock of a local run queue to which the thread is assigned;
obtaining a lock of the thread; and
changing a run queue pointer of the thread to identify the global run queue.
-
-
35. The system of claim 31, wherein the reassigned threads are dispatched by a next available processor of the plurality of processors.
-
36. The system of claim 31, wherein the dispatcher only reassigns unbound threads to the global run queue.
Specification