Apparatus and method for improved CPU affinity in a multiprocessor system
First Claim
1. A data processing system for simultaneously executing a plurality of processing tasks, the system comprising:
- a plurality of processors, each processor having first cache means;
a plurality of second cache means, each second cache means being connected to a subset of the processors;
shared memory means connected to each second cache means; and
means for retaining a plurality of run queues, including a plurality of Level 0 run queues, each Level 0 run queue being associated with one of the processors and containing the processing tasks currently affined to its associated processor;
a plurality of Level 1 run queues, each Level 1 run queue being associated with one of the subsets of processors and containing the processing tasks currently affined to its associated subset of processors, and a Level 2 run queue associated with all processors and containing the processing tasks currently affined to all processors in the system, each processing task being included in only one of the run queues.
1 Assignment
0 Petitions
Accused Products
Abstract
Closely related processing threads within a process in a multiprocessor system are collected into thread groups which are globally scheduled as a group based on the thread group structure'"'"'s priority and scheduling parameters. The thread group structure maintains collective timeslice and CPU accounting for all threads in the group. Within each thread group, each individual thread has a local scheduling priority for scheduling among the threads in its group. The system utilizes a hierarchy of processing levels and run queues to facilitate affining thread groups with processors or groups of processors when possible. The system will tend to balance out the workload among system processors and will migrate threads groups up and down through processing levels to increase cache hits and overall performance. The system is periodically reset to avoid long term unbalanced operation conditions.
529 Citations
32 Claims
-
1. A data processing system for simultaneously executing a plurality of processing tasks, the system comprising:
-
a plurality of processors, each processor having first cache means; a plurality of second cache means, each second cache means being connected to a subset of the processors; shared memory means connected to each second cache means; and means for retaining a plurality of run queues, including a plurality of Level 0 run queues, each Level 0 run queue being associated with one of the processors and containing the processing tasks currently affined to its associated processor;
a plurality of Level 1 run queues, each Level 1 run queue being associated with one of the subsets of processors and containing the processing tasks currently affined to its associated subset of processors, and a Level 2 run queue associated with all processors and containing the processing tasks currently affined to all processors in the system, each processing task being included in only one of the run queues. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. In a multiprocessor system having a shared memory accessible to all processors and a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors, each processor having an associated Level 0 run queue containing the processing tasks affined with that CPU, each subset of processors having an associated Level 1 run queue containing the processing tasks affined with that subset of processors and all processors sharing a Level 2 run queue containing the processing tasks affined with all processors in the system, a processing task containing one or more processing threads:
- a method of selecting the next processing task to be executed by an available processor comprising the steps of;
a) checking the Level 0 run queue of the available processor, the Level 1 run queue of the available processor'"'"'s subset of processors and the Level 2 run queue for available tasks; b) if one or more available tasks are located at step a, selecting one of the available tasks for execution, c) if no available tasks are located at step a, checking the Level 0 run queue of one of the other processors in the the available processor'"'"'s processor subset, d) if one or more available tasks are located at step c, selecting a task from the one or more available tasks; e) repeating steps c and d for each other processor in the available processor'"'"'s subset; f) if no available task is located at steps c-e, checking the Level 1 run queue of one of the other processor groups in the system; g) if one or more available tasks are located at step f, selecting one of the available tasks for execution; h) repeating step f and g for each other Level 1 run queue in the system i) if no available task is located at step f-h, checking the Level 0 run queue of one the processors in one of the other processor subsets; j) if one or more available task are located at step i, selecting the thread group for execution, k) repeating steps i and j for each processor in each other processor group in the system, l) if no available task is located at steps i-k, running an idle loop in the available processor. - View Dependent Claims (8, 9, 10, 11)
- a method of selecting the next processing task to be executed by an available processor comprising the steps of;
-
12. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein each thread group has an associated priority level and each thread has an associated priority level, the priority level of each thread being independent of the priority level of its associated thread group.
-
13. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein at least one thread within a thread group has means for creating a new thread within its thread group.
-
14. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein at least one thread within a thread group has means for creating a new thread in a new thread group.
- View Dependent Claims (15, 16, 17, 18)
-
19. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein the thread group structure maintains the scheduling policy and priority for use in scheduling the thread group.
-
20. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein the thread group structure maintains the cumulative timeslice information for all threads within that thread group.
-
21. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein the thread group structure maintains the processor accounting information for all threads within that thread group.
-
22. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein the thread group structure maintains a thread group ID and each thread maintains a thread ID.
-
23. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein each process contains (a) at least one thread group table containing pointers to thread group structures in the process and (b) at least one thread table containing pointers to threads within the thread groups.
- View Dependent Claims (24)
-
25. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein the thread group structure contains a list of processors by which that thread group may be executed.
-
26. A multiprocessor system capable of executing a plurality of processing threads, the system comprising:
- a plurality of processors, memory means connected to and shared by the plurality of processors, one or more processes executing in the system, each process containing one or more processing thread groups, each thread group containing at least a thread group structure, and wherein the thread group structure contains an indicator of the minimum allowable run queue level at which that thread group may be affined.
- View Dependent Claims (27, 28)
-
29. In a multiprocessor system having a shared memory accessible to all processors;
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
a plurality of Level 0 run queues, each Level 0 run queue being associated with one processor and containing the processing tasks affined with that CPU;
a plurality of Level 1 run queues, each Level 1 run queue being associated with a subset of processors and containing the processing tasks affined with that subset of processors;
a Level 2 run queue associated with all processors and containing the processing tasks affined with all processors in the system; and
wherein each task is associated with a run queue indicator indicating the minimum run queue level at which that task may be affined;
a method of determining if a task should be moved from the run queue where the task currently resides to another run queue in the system when the task is selected for running by an available processor, the method comprising the steps of;a) if the selected task is currently in the Level 2 run queue and the selected task is currently being run by another processor and the other processor currently running the task is a member of the same subset of processors to which the available processor belongs and the task can be affined at Level 1, moving the selected task to the Level 1 run queue of the available processor; b) if the selected task is currently in the Level 2 run queue and the selected task is not currently being run by another processor and the task can be affined at Level 0, moving the selected task to the Level 0 run queue of the available processor; c) if the selected task is currently in the Level 2 run queue and the selected task is not currently being run by another processor and the task can be affined at Level 1 and the task cannot be affined at Level 0, moving the selected task to the Level 1 run queue of the available processor; d) if the selected task is currently in the Level 1 run queue of the available processor and the selected task is not currently being run by another processor and the selected task can be affined at Level 0, moving the selected task to the Level 0 run queue of the available processor; e) if the selected task is currently in a Level 1 run queue which is not the Level 1 run queue of the available processor and the task is not currently being run by another processor and the task can be affined at Level 0, moving the task to the Level 0 run queue of the available processor; f) if the selected task is currently in a Level 1 run queue which is not the Level 1 run queue of the available processor and the task is not currently being run by another processor and the task cannot be affined at Level 0, moving the task to the Level 1 run queue associated with the available processor; g) if the selected task is currently in a Level 1 run queue which is not the Level 1 run queue of the available processor and the task is currently being run by another processor, moving the task to the Level 2 run queue; h) if the selected task is currently in the Level 0 run queue of another processor and the other processor is currently running the selected task and the other processor is not a member of the same subset of processors to which the available processor belongs, moving the selected task to the Level 2 run queue; i) if the selected task is currently in the Level 0 run queue of another processor and the other processor is currently running the selected task and the other processor is a member of the same subset of processors to which the available processor belongs, moving the selected task to the Level 1 run queue of the available processor; and j) if the selected task is currently in the Level 0 run queue of another processor and the other processor is not currently running the selected task, moving the selected task to the Level 0 run queue of the available processor.
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
-
30. In a multiprocessor system having a shared memory accessible to all processors;
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
a plurality of Level 0 run queues, each Level 0 run queue being associated with one processor and containing the processing tasks affined with that CPU;
a plurality of Level 1 run queues, each Level 1 run queue being associated with a subset of processors and containing the processing tasks affined with that subset of processors;
a Level 2 run queue associated with all processors and containing the processing tasks affined with all processors in the system; and
wherein each task is associated with a run queue indicator indicating the minimum run queue level at which that task may be affined;
a method of determining if a task in a Level 0 run queue should be moved to another run queue in the system when the task is selected for running by an available processor, the method comprising the steps of;a) if the selected task is not currently being run by another processor and the selected task is not in the Level 0 run queue of the available processor, moving the selected task to the Level 0 run queue of the available processor; b) if the selected task is currently being run by another processor and the other processor currently running the task is a member of the same subset of processors to which the available processor belongs, moving the selected task to the Level 1 run queue of the available processor; and c) if the selected task is currently being run by another processor and the other processor currently running the task is not a member of the same subset of processors to which the available processor belongs, moving the selected task to the Level 2 run queue.
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
-
31. In a multiprocessor system having a shared memory accessible to all processors;
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
a plurality of Level 0 run queues, each Level 0 run queue being associated with one processor and containing the processing tasks affined with that CPU;
a plurality of Level 1 run queues, each Level 1 run queue being associated with a subset of processors and containing the processing tasks affined with that subset of processors;
a Level 2 run queue associated with all processors and containing the processing tasks affined with all processors in the system; and
wherein each task is associated with a run queue indicator indicating the minimum run queue level at which that task may be affined;
a method of determining if a task in a Level 1 run queue should be moved to another run queue in the system when the task is selected for running by an available processor, the method comprising the steps of;a) if the selected task can be affined at Level 0 and is not currently being run by another processor, moving the selected task to the Level 0 run queue of the available processor; b) if the selected task cannot be affined at Level 0 and is not currently being run by another processor and is not currently in the Level 1 run queue of the available processor, moving the selected task to the Level 1 run queue of the available processor; and c) if the selected task is currently being run by a processor that is not a member of the same subset of processors to which the available processor belongs, moving the selected task to the Level 2 run queue.
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
-
32. In a multiprocessor system having a shared memory accessible to all processors;
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
a plurality of Level 0 run queues, each Level 0 run queue being associated with one processor and containing the processing tasks affined with that CPU;
a plurality of Level 1 run queues, each Level 1 run queue being associated with a subset of processors and containing the processing tasks affined with that subset of processors;
a Level 2 run queue associated with all processors and containing the processing tasks affined with all processors in the system; and
wherein each task is associated with a run queue indicator indicating the minimum run queue level at which that task may be affined;
a method of determining if a task in the Level 2 run queue should be moved to another run queue in the system when the task is selected for running by an available processor, the method comprising the steps of;a) if the selected task can be affined at Level 0 and is not currently being run by another processor, moving the selected task to the Level 0 run queue of the available processor; b) if the selected task can be affined at Level 1 and cannot be affined at Level 0 and is not currently being run by another processor, moving the selected task to the Level 1 run queue of the available processor; and c) if the selected task can be affined at Level 1 and is currently being run by one or more other processors and all other processors currently running the task are members of the same subset of processors to which the available processor belongs, moving the selected task to the Level 1 run queue of the available processor.
- a plurality of secondary cache memories, each secondary cache memory being accessible to a subset of the processors;
Specification