Scheduling of Multiple Tasks in a System Including Multiple Computing Elements
First Claim
1. In a system including:
a central processing unit (CPU) operatively attached to and accessing a system memory; and
a plurality of computing elements, wherein the computing elements each include a computational core, local memory, and a local direct memory access (DMA) unit, wherein the local memory and the system memory are accessible by the computational core using the local DMA unit, a method comprising the steps of:
(a) storing by the CPU in the system memory a plurality of task queues in one-to-one correspondence with the computing elements, wherein each of said task queues includes a plurality of task descriptors which specify a sequence of tasks for execution by the computing elements;
(b) upon programming said computing element with task queue information of said task queue, accessing the task descriptors of said task queue in the system memory;
(c) storing said task descriptors of the task queue in local memory of the computing element;
wherein said accessing and said storing are performed using the local DMA unit of the computing element;
(d) executing the tasks of the task queue by the corresponding computing element, wherein said executing of the respective task queues is performed in parallel by at least two of said computing elements; and
(e) interrupting respectively the CPU by the computing elements only upon fully executing all the tasks of the respective task queue.
Abstract
A method for controlling parallel process flow in a system including a central processing unit (CPU) attached to and accessing system memory, and multiple computing elements. The computing elements (CEs) each include a computational core, local memory and a local direct memory access (DMA) unit. The CPU stores in the system memory multiple task queues in one-to-one correspondence with the computing elements. Each task queue, which includes multiple task descriptors, specifies a sequence of tasks for execution by the corresponding computing element. Upon programming the computing element with task queue information of the task queue, the task descriptors of the task queue in system memory are accessed. The task descriptors of the task queue are stored in the local memory of the computing element. The accessing and the storing of the data by the CEs are performed using their local DMA units. The tasks of the task queue are executed by the computing element, with execution typically performed in parallel by at least two of the computing elements. The CPU is interrupted by the computing elements only upon their fully executing the tasks of their respective task queues.
54 Citations
15 Claims
1. In a system including:
a central processing unit (CPU) operatively attached to and accessing a system memory; and a plurality of computing elements, wherein the computing elements each include a computational core, local memory, and a local direct memory access (DMA) unit, wherein the local memory and the system memory are accessible by the computational core using the local DMA unit, a method comprising the steps of:
(a) storing by the CPU in the system memory a plurality of task queues in one-to-one correspondence with the computing elements, wherein each of said task queues includes a plurality of task descriptors which specify a sequence of tasks for execution by the computing elements;
(b) upon programming said computing element with task queue information of said task queue, accessing the task descriptors of said task queue in the system memory;
(c) storing said task descriptors of the task queue in local memory of the computing element;
wherein said accessing and said storing are performed using the local DMA unit of the computing element;
(d) executing the tasks of the task queue by the corresponding computing element, wherein said executing of the respective task queues is performed in parallel by at least two of said computing elements; and
(e) interrupting respectively the CPU by the computing elements only upon fully executing all the tasks of the respective task queue. - View Dependent Claims (2, 3, 4, 5, 6)
7. A system comprising:
(a) a central processing unit (CPU);
(b) a system memory operatively attached to and accessed by said CPU; and
(c) a plurality of computing elements, wherein said computing elements each include a computational core, local memory, and a local direct memory access (DMA) unit, wherein said local memory and said system memory are accessible by said computational core using said local DMA unit,
wherein said CPU stores in said system memory a plurality of task queues in one-to-one correspondence with said computing elements, wherein each task queue includes a plurality of task descriptors which specify a sequence of tasks for execution by said computing element,
wherein upon programming said computing element with task queue information of said task queue, said task descriptors of said task queue are accessed in system memory using said local DMA unit of said computing element,
wherein said task descriptors of said task queue are stored in local memory of said computing element using said local DMA unit of said computing element,
wherein said tasks of said task queue are executed by said computing element and at least two of said computing elements process respective task queues in parallel, and
wherein said CPU is interrupted by said computing elements only upon fully executing said tasks of said respective task queue. - View Dependent Claims (8, 9)
10. An image processing system for processing in real time multiple image frames, the system comprising:
(a) a central processing unit (CPU);
(b) a system memory operatively attached to and accessed by said CPU; and
(c) a plurality of computing elements, wherein said computing elements each include a computational core, local memory, and a local direct memory access (DMA) unit, wherein said local memory and said system memory are accessible by said computational core using said local DMA unit,
wherein said CPU stores in said system memory a plurality of task queues in one-to-one correspondence with said computing elements, wherein each task queue includes a plurality of task descriptors which specify a sequence of tasks for execution by said computing element,
wherein upon programming said computing element with task queue information of said task queue, said task descriptors of said task queue are accessed in system memory using said local DMA unit of said computing element,
wherein said task descriptors of said task queue are stored in local memory of said computing element using said local DMA unit of said computing element,
wherein said tasks of said task queue are executed by said computing element and at least two of said computational cores process respective task queues in parallel,
wherein said CPU is interrupted by said computing elements only upon fully executing said tasks of said respective task queue,
wherein at least one of the computing elements is programmed to classify an image portion of one of the image frames as an image of a known object, and
wherein another of the computing elements is programmed to track said image portion in real time from the previous image frame to the present image frame. - View Dependent Claims (11, 12, 13, 14, 15)
Specification