PERFORMANCE OF COPROCESSOR ASSISTED MEMSET() THROUGH HETEROGENEOUS COMPUTING
First Claim
1. A method comprising:
receiving a request to fill a plurality of ranges of memory addresses with a value;
selecting a first subset of said plurality of ranges;
distributing said first subset of ranges to a plurality of coprocessors;
after said distributing said first subset, selecting a second subset of said plurality of ranges, wherein said second subset and said first subset are disjoint;
distributing said second subset of ranges to said plurality of coprocessors.
Abstract
Techniques herein perform coprocessor assisted memory filling in a pipeline. A computer receives a request to fill multiple ranges of memory addresses with a value. The computer selects a first subset of the multiple ranges and distributes the first subset of ranges to multiple coprocessors. The coprocessors begin to fill the memory locations of the first subset of ranges with the value. At the same time as the coprocessors fill the first subset of ranges, the computer selects a second subset of the multiple ranges of memory addresses. Also while the coprocessors are still filling the first subset of ranges, the computer distributes the second subset of ranges to the coprocessors. This overlapping activity achieves a processing pipeline that can be extended to any number of additional subsets of memory ranges.
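The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: Python worker threads stand in for the coprocessors, a bytearray stands in for memory, and the names pipelined_fill, fill_range, num_coprocessors, and batch_size are hypothetical and do not come from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def fill_range(memory, start, length, value):
    """Work done by one simulated coprocessor: fill a single address range."""
    memory[start:start + length] = bytes([value]) * length


def pipelined_fill(memory, ranges, value, num_coprocessors=4, batch_size=4):
    """Fill every (start, length) range in `ranges` with the byte `value`.

    Assumption: threads model coprocessors; `batch_size` is the number of
    ranges in each disjoint subset handed out per pipeline step.
    """
    pending = []
    with ThreadPoolExecutor(max_workers=num_coprocessors) as pool:
        for i in range(0, len(ranges), batch_size):
            # Select the next subset; slices of the list never overlap,
            # so each subset is disjoint from the earlier ones.
            subset = ranges[i:i + batch_size]
            # Distribute the subset without waiting for earlier subsets,
            # so selection overlaps with fills that are still in flight.
            for start, length in subset:
                pending.append(
                    pool.submit(fill_range, memory, start, length, value)
                )
    # The pool has drained here; result() re-raises any worker error.
    for future in pending:
        future.result()
    return memory
```

The host loop never blocks on a submitted subset, which is the overlap the abstract describes; real hardware would replace the thread pool with coprocessor command queues.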
14 Citations
21 Claims
1. A method comprising:
receiving a request to fill a plurality of ranges of memory addresses with a value;
selecting a first subset of said plurality of ranges;
distributing said first subset of ranges to a plurality of coprocessors;
after said distributing said first subset, selecting a second subset of said plurality of ranges, wherein said second subset and said first subset are disjoint;
distributing said second subset of ranges to said plurality of coprocessors.
Dependent claims: 2, 3, 4, 5, 6, 7
8. One or more non-transitory computer readable media storing instructions that include:
first instructions which, when executed by one or more processors, cause receiving a request to fill a plurality of ranges of memory addresses with a value;
second instructions which, when executed by one or more processors, cause selecting a first subset of said plurality of ranges;
third instructions which, when executed by one or more processors, cause distributing said first subset of ranges to a plurality of coprocessors;
fourth instructions which, when executed by one or more processors, cause after said distributing said first subset, selecting a second subset of said plurality of ranges, wherein said second subset and said first subset are disjoint;
fifth instructions which, when executed by one or more processors, cause distributing said second subset of ranges to said plurality of coprocessors.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A device comprising:
a plurality of coprocessors capable of storing a value at the memory addresses of a range of memory addresses; and
a central processing unit (CPU) connected to said plurality of coprocessors and configured to:
receive a request to fill a plurality of ranges of memory addresses with a value;
select a first subset of said plurality of ranges;
distribute said first subset of ranges to a plurality of coprocessors;
after said distributing said first subset, select a second subset of said plurality of ranges, wherein said second subset and said first subset are disjoint;
distribute said second subset of ranges to said plurality of coprocessors.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification