Multi-threaded processing design in architecture with multiple co-processors
First Claim
1. A method for determining a mapping of a plurality of processing stages to a plurality of processors, which offers overall low average processing time to process input data, in a multimedia codec architecture, comprising the steps of:
identifying a chronological sequence of processing stages for processing said input data including identifying interdependencies of said processing stages;
identifying a set of processing stages from said chronological sequence of processing stages for each of said processors based on suitability of each said processor to perform each of said processing stages;
mapping each said processing stage to a respective processor based on said set of processing stages identified for each processor;
staggering the processing stages for said mapping to accommodate said interdependencies resulting in a pipeline;
ascertaining average processing time for said pipeline associated with said mapping;
repeating the steps of mapping, staggering and ascertaining by changing mappings between the plurality of processing stages and the plurality of processors based on said set of processing stages identified for each processor to obtain a plurality of pipelines;
choosing one design pipeline from said plurality of pipelines that offers overall low average processing time based on each of said mappings; and
wherein said mapping of each said processing stage to a respective processor based on said set of processing stages identified for each processor comprises;
assigning one or more data buffers to each processing stage on each processor, based on said interdependencies; and
ensuring that the step of mapping is done based on a constraint that no two processors can access a given buffer simultaneously.
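The search recited in this claim (map, stagger, ascertain the average time, repeat, choose) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the stage names, the processor names, the cycle counts in `COST`, and the simplification that, once the stages are staggered into a pipeline, the average time per data unit equals the load on the busiest processor.

```python
from itertools import product

# Hypothetical cost table (cycles per data unit). A processor appears under a
# stage only if it is considered suitable for that stage.
COST = {
    "motion_estimation": {"dsp": 900, "me_accel": 300},
    "transform_quant":   {"dsp": 400, "tq_accel": 150},
    "entropy_coding":    {"dsp": 500},
}

def candidate_mappings(cost):
    """Yield every stage-to-processor mapping allowed by the suitability sets."""
    stages = list(cost)  # dict order is taken as the chronological order
    options = [sorted(cost[stage]) for stage in stages]
    for choice in product(*options):
        yield dict(zip(stages, choice))

def average_time(mapping, cost):
    """Assumed steady-state time per data unit: the busiest processor's load."""
    load = {}
    for stage, proc in mapping.items():
        load[proc] = load.get(proc, 0) + cost[stage][proc]
    return max(load.values())

def choose_pipeline(cost):
    """Return the candidate mapping with the lowest assumed average time."""
    return min(candidate_mappings(cost), key=lambda m: average_time(m, cost))

best = choose_pipeline(COST)
print(best, average_time(best, COST))
```

With these made-up numbers, four candidate mappings are evaluated and the one that offloads both accelerable stages is selected. A real design flow would replace `average_time` with measured or simulated timings that account for the interdependencies and the buffer constraint.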
Abstract
A method for designing a multi-threaded processing operation that includes, e.g., multimedia encoding/decoding, uses an architecture having multiple processors and optional hardware accelerators. The method includes the steps of: identifying a desired chronological sequence of processing stages for processing input data including identifying interdependencies of said processing stages; allotting each said processing stage to a processor; staggering the processing to accommodate the interdependencies; selecting a processing operation based on said allotting to arrive at a subset of possible pipelines that offer low average processing time; and, choosing one design pipeline from said subset to result in overall timing reduction to complete said processing operation. The invention provides a multi-threaded processing pipeline that is applicable in a System-on-Chip (SoC) using a DSP and shared resources such as DMA controller and on-chip memory, for increasing the throughput. The invention also provides an article which is programmed to execute the method.
38 Citations
22 Claims
1. A method for determining a mapping of a plurality of processing stages to a plurality of processors, which offers overall low average processing time to process input data, in a multimedia codec architecture, comprising the steps of:
identifying a chronological sequence of processing stages for processing said input data including identifying interdependencies of said processing stages;
identifying a set of processing stages from said chronological sequence of processing stages for each of said processors based on suitability of each said processor to perform each of said processing stages;
mapping each said processing stage to a respective processor based on said set of processing stages identified for each processor;
staggering the processing stages for said mapping to accommodate said interdependencies resulting in a pipeline;
ascertaining average processing time for said pipeline associated with said mapping;
repeating the steps of mapping, staggering and ascertaining by changing mappings between the plurality of processing stages and the plurality of processors based on said set of processing stages identified for each processor to obtain a plurality of pipelines;
choosing one design pipeline from said plurality of pipelines that offers overall low average processing time based on each of said mappings; and
wherein said mapping of each said processing stage to a respective processor based on said set of processing stages identified for each processor comprises;
assigning one or more data buffers to each processing stage on each processor, based on said interdependencies; and
ensuring that the step of mapping is done based on a constraint that no two processors can access a given buffer simultaneously.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16)
12. The method as in claim 10, further comprising creating predetermined sets of command sequences for programmable coprocessors, to reduce dynamic set up overhead costs.
17. A method for achieving efficient parallel processing multi-thread-design-capability in architecture which uses multiple processing units and is capable of handling multi-media encoding/decoding, comprising the steps of:
identifying a chronological sequence of processing stages for processing input data and their interdependencies during processing of said input data;
identifying a set of processing stages from said chronological sequence of processing stages for each of said processing units based on suitability of each said processing unit to perform each of said processing stages;
mapping each said processing stage to a respective processing unit based on said set of processing stages identified for each processing unit, wherein said mapping comprises;
assigning a buffer, by mapping, to each processing unit for each processing stage; and
ensuring that no two processing units can simultaneously access a given buffer;
staggering said processing stages for said mapping to accommodate said interdependencies resulting in a pipeline;
ascertaining average processing time needed, on respective mapped processing units, said pipeline;
repeating the steps of mapping, staggering and ascertaining by changing mappings between the plurality of processing stages and the plurality of processing units based on said set of processing stages identified for each processing unit to obtain a plurality of pipelines; and
from the plurality of pipelines, selecting a single design pipeline that offers overall low average processing time based on each of said mappings.
View Dependent Claims (18, 19, 20)
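The constraint in claim 17 that no two processing units may access a given buffer at the same time can be checked mechanically for a staggered schedule. The sketch below is illustrative only and rests on assumptions that are not in the claims: stage k works on data unit (slot - k) in each time slot, each stage boundary has `depth` buffers selected by `unit % depth`, and the processor names are hypothetical. Using two buffers per boundary (double buffering) is shown as one possible way to satisfy the constraint, not as the patent's prescribed scheme.

```python
from collections import defaultdict

def buffer_accesses(slot, mapping, depth):
    """Yield (processor, buffer) pairs touched during one time slot.

    mapping lists the processor of each stage in chronological order.
    Stage k handles data unit (slot - k): it reads the buffer written by
    stage k-1 for that unit and writes its own output buffer for that unit.
    """
    for k, proc in enumerate(mapping):
        unit = slot - k
        if unit < 0:
            continue  # pipeline is still filling; this stage is idle
        if k > 0:
            yield proc, (k - 1, unit % depth)  # input from the previous stage
        if k < len(mapping) - 1:
            yield proc, (k, unit % depth)      # output for the next stage

def has_conflict(mapping, depth, slots=8):
    """True if two different processors touch the same buffer in any slot."""
    for slot in range(slots):
        users = defaultdict(set)
        for proc, buf in buffer_accesses(slot, mapping, depth):
            users[buf].add(proc)
        if any(len(procs) > 1 for procs in users.values()):
            return True
    return False

pipeline = ["me_accel", "tq_accel", "dsp"]  # hypothetical mapping
print(has_conflict(pipeline, depth=1), has_conflict(pipeline, depth=2))
```

Under these assumptions a single buffer per boundary is rejected, because the producer and the consumer on different processors would touch it in the same slot, while two alternating buffers pass the check. A mapping that keeps adjacent stages on one processor passes even with a single buffer, since only one processor ever touches it.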
21. An article comprising a computer-readable media having instructions that when executed by a computing platform result in execution of a method for determining a mapping of a plurality of processing stages to a plurality of processors, which offers overall low average processing time to process input data, in a multimedia codec architecture, comprising the steps of:
identifying a chronological sequence of processing stages for processing said input data including identifying interdependencies of said processing stages;
identifying a set of processing stages from said chronological sequence of processing stages for each of said processors based on suitability of each said processor to perform each of said processing stages;
mapping each said processing stage to a respective processor based on said set of processing stages identified for each processor;
staggering the processing stages for said mapping to accommodate said interdependencies resulting in a pipeline;
ascertaining average processing time for said pipeline associated with said mapping;
repeating the steps of mapping, staggering and ascertaining by changing mappings between the plurality of processing stages and the plurality of processors based on said set of processing stages identified for each processor to obtain a plurality of pipelines; and
choosing one design pipeline from said plurality of pipelines that offers overall low average processing time based on each of said mappings; and
wherein said mapping each said processing stage to a respective processor comprises;
assigning one or more data buffers to each processing stage on each processor, based on said interdependencies; and
ensuring that the step of allocating mapping is done based on a constraint that no two processors can access a given buffer simultaneously.
22. An article comprising a computer-readable media having instructions that when executed by a computing platform result in execution of a method for achieving efficient parallel processing multi-thread-design-capability in architecture which uses multiple processing units and is capable of handling multi-media encoding/decoding, comprising the steps of:
identifying a chronological sequence of processing stages for processing input data including identifying interdependencies of said processing stages;
identifying a set of processing stages from said chronological sequence of processing stages for each of said processing units based on the suitability of each said processing unit to perform each of said processing stages;
mapping each said processing stage to a respective processing unit based on said set of processing stages identified for each processing unit, wherein said mapping comprises;
assigning a buffer, by mapping, to each processing unit for each processing stage; and
ensuring that no two processing units can simultaneously access a given buffer;
staggering the processing stages for said mappings to accommodate said interdependencies resulting in a pipeline;
ascertaining average processing time needed, on respective mapped processing units, for said pipeline;
repeating the steps of mapping, staggering and ascertaining by changing mappings between the plurality of processing stages and the plurality of processing units based on said set of processing stages identified for each processing unit to obtain a plurality of pipelines; and
choosing one design pipeline from said plurality of pipelines that offers overall low average processing time based on each of said mappings.
Specification