Concurrent program execution optimization
Abstract
An architecture for load-balanced groups of multi-stage manycore processors shared dynamically among a set of software applications, with capabilities for destination-task-defined intra-application prioritization of inter-task communications (ITC), for architecture-based ITC performance isolation between the applications, and for prioritizing application task instances for execution on cores of the manycore processors based at least in part on which of the task instances have available the input data, such as ITC data, that they need for executing.
25 Claims
1. A system for processing a set of computer program instances, comprising:
- a plurality of processing stages, at least one of the plurality of processing stages comprising multiple processing cores, wherein, each given task of a plurality of tasks of a given program instance of the set of program instances is hosted at a different stage of the plurality of processing stages as a local task of the given program instance at the respective stage, and for at least one of the multiple processing cores of a given processing stage of the plurality of processing stages, a local task of one of the program instances is assigned as an active task instance for execution for a period of time; and
- a group of multiplexers each connecting inter-task communications (ITC) data to a respective stage of the plurality of processing stages, wherein at least one multiplexer of the group of multiplexers is a hardware resource dedicated to the local task, wherein the at least one multiplexer is configured to connect ITC data to any processing core of the multiple processing cores to which the local task is assigned for execution for the period of time.
Dependent claims: 2, 3, 4, 5, 6, 7
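As an informal illustration of the multiplexer element of claim 1, the following Python sketch models in software a multiplexer that is dedicated to one local task and forwards ITC data to whichever core that task is currently assigned to. The claim describes a hardware resource; the class name `TaskMultiplexer`, its methods, and the list-based `core_inputs` are hypothetical names chosen for this sketch and do not come from the patent.

```python
from typing import List, Optional


class TaskMultiplexer:
    """Hypothetical software model of a multiplexer dedicated to one local task."""

    def __init__(self, task_id: str, num_cores: int) -> None:
        self.task_id = task_id
        self.num_cores = num_cores
        # Core currently executing the task, or None if the task is not active.
        self.selected_core: Optional[int] = None

    def assign(self, core: int) -> None:
        """Point the multiplexer at the core the task is assigned to."""
        if not 0 <= core < self.num_cores:
            raise ValueError("core index out of range")
        self.selected_core = core

    def release(self) -> None:
        """Mark the task as no longer assigned to any core."""
        self.selected_core = None

    def deliver(self, itc_data: object, core_inputs: List[list]) -> bool:
        """Forward ITC data to the assigned core; return False if none is assigned."""
        if self.selected_core is None:
            return False
        core_inputs[self.selected_core].append(itc_data)
        return True
```

Because the multiplexer belongs to the task rather than to a core, reassigning the task to a different core only changes `selected_core`; senders of ITC data do not need to know where the task is running.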
8. A method for managing execution of a plurality of instances of a program on a parallel processing system comprising an array of processor cores and a control system, the method comprising:
- classifying the plurality of instances into a set of priority classes based at least in part on determining, for each instance of the plurality of instances, (i) whether the respective instance is waiting for arrival of input data at one or more input buffers of the respective instance, and (ii) whether the respective instance is waiting for completion of memory content transfers to update one or more fast-access memories of the respective instance; and
- selecting a subset of the plurality of instances for execution on a subset of the array of processor cores based at least in part on the classifying, wherein at least one of the classifying and the selecting is performed by the control system.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
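The classifying and selecting steps of claim 8 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the claim requires only that the classification depend on conditions (i) and (ii), so the four-class ordering, the tie-break on instance ID, and the names `Instance`, `priority_class`, and `select_for_execution` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    instance_id: int
    waiting_for_input: bool   # condition (i): input buffers still awaiting data
    waiting_for_memory: bool  # condition (ii): fast-access memory update pending


def priority_class(inst: Instance) -> int:
    """Map the two wait conditions to a class; 0 is highest (assumed ordering)."""
    if not inst.waiting_for_input and not inst.waiting_for_memory:
        return 0  # has input data and memory contents are ready
    if not inst.waiting_for_input:
        return 1  # has input data, memory transfer still in progress
    if not inst.waiting_for_memory:
        return 2  # memory ready, still waiting for input data
    return 3      # waiting on both


def select_for_execution(instances: List[Instance], num_cores: int) -> List[int]:
    """Pick up to num_cores instance IDs, best priority class first."""
    ranked = sorted(instances, key=lambda i: (priority_class(i), i.instance_id))
    return [i.instance_id for i in ranked[:num_cores]]
```

For example, given four instances covering each combination of the two wait conditions and two available cores, the sketch selects the instance waiting on nothing first, followed by the instance that has input data but is still waiting on a memory transfer.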
17. A control system for an array of processing cores shared among a set of software programs, the control system comprising:
- a plurality of fast-access memories, wherein each memory of the plurality of fast-access memories is a hardware resource constantly dedicated to an associated instance of a plurality of instances of the set of software programs;
- a subsystem for updating contents of one or more of the plurality of fast-access memories, and indicating whether contents of any given memory of the plurality of fast-access memories are updated, wherein indicating that the contents of a respective memory of the plurality of fast-access memories are updated comprises indicating that the contents are updated when the contents are ready for execution of the associated instance; and
- a controller for allocating the array of processing cores among the set of software programs, for execution of one or more instances of each of the set of software programs, wherein allocating comprises allocating based at least in part on a respective number of instances of each program of the set of software programs having a respective dedicated fast-access memory indicated as updated.
Dependent claims: 18, 19, 20
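One way to picture the controller element of claim 17 is a round-robin allocation capped by each program's count of ready instances. The Python sketch below is only an assumption about how such a policy could look: the claim says the allocation is based at least in part on the number of instances whose dedicated fast-access memory is indicated as updated, and does not prescribe round-robin, the alphabetical visiting order, or the name `allocate_cores`.

```python
from typing import Dict


def allocate_cores(ready_counts: Dict[str, int], num_cores: int) -> Dict[str, int]:
    """Hand out cores one at a time, never exceeding a program's ready instances.

    ready_counts maps a program name to the number of its instances whose
    dedicated fast-access memory is indicated as updated (hypothetical input).
    """
    allocation = {program: 0 for program in ready_counts}
    remaining = num_cores
    progressed = True
    while remaining > 0 and progressed:
        progressed = False
        for program in sorted(ready_counts):
            if remaining == 0:
                break
            # Only grant a core if the program still has a ready instance to run.
            if allocation[program] < ready_counts[program]:
                allocation[program] += 1
                remaining -= 1
                progressed = True
    return allocation
```

In this sketch a program with no ready instances receives no cores, and cores are left unallocated when the total number of ready instances is smaller than the array.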
21. A system for managing execution of a plurality of instances of a program on an array of processor cores, the system comprising:
- a subsystem for classifying the plurality of instances into a set of priority classes based at least in part on determining, for each instance of the plurality of instances, (i) whether the respective instance is waiting for arrival of input data at one or more input buffers of the respective instance, and (ii) whether the respective instance is waiting for completion of memory content transfers to update one or more fast-access memories of the respective instance; and
- a subsystem for selecting a subset of the plurality of instances for execution on a subset of the array of processor cores based at least in part on the classifying;
- wherein at least one of the subsystem for classifying and the subsystem for selecting comprises digital hardware logic.
Dependent claims: 22, 23, 24, 25
Specification